Mastering Regular Expressions

Not just a textbook – a real guide

Book review Like SQL and XML, regular expressions are an essential tool in every developer’s toolbox. Processing text, which is pretty much what most programs do when you think about it, is so central a concern that even without regular expressions most developers quickly build up a library of functions and idioms for text matching, replacement, parsing, token extraction and so on.

Regular expressions go a whole lot further and are, in effect, a specialised language for text processing in all its messy glory. This extra power comes at a price – regular expressions have a reputation for being difficult to craft, difficult to debug and difficult to read. No wonder they are often seen as the preserve of obscurantist hackers who make a virtue of impossible-to-decipher Perl one-liners.

Jeffrey Friedl’s Mastering Regular Expressions, now in its third edition, is something of a classic of its kind. The aim is simple enough – to make regular expressions your friend. No matter what your platform or programming language of choice, Friedl’s book is designed not just to get you started, but to set you well on the way to expertise.

The key to achieving this mastery is to provide plenty of examples throughout the entire 500+ pages of text. From the word go the emphasis is on real examples, building each one up in complexity to match the kind of problems that occur with real text rather than the sanitised kind that only seems to exist in textbooks. Want to strip out IP addresses or URLs? What might appear to be a straightforward case ends up being tricky and complicated – but along the way Friedl points out the kinds of problems that will bite you if you’re not careful. He also repeatedly makes the point that there’s a trade-off between completeness, performance and the state of the data you’re dealing with.
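To give a flavour of why the IP address case gets tricky, here is a minimal Python sketch. It is not taken from the book: the pattern names, the `find_ipv4` helper and the sample string are all hypothetical, chosen only to illustrate the trade-off between a simple pattern and a more complete one.

```python
import re

# Naive pattern (illustrative): four dot-separated runs of 1-3 digits.
# Short and fast, but it accepts impossible octets such as 999 and will
# happily match the first four parts of a longer dotted number.
NAIVE_IPV4 = re.compile(r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b")

# Stricter pattern (illustrative): each octet is limited to 0-255, and
# lookarounds reject matches embedded in a longer run of digits and dots.
OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"
STRICT_IPV4 = re.compile(rf"(?<![\d.]){OCTET}(?:\.{OCTET}){{3}}(?![\d.])")


def find_ipv4(text, strict=True):
    """Return all IPv4-looking substrings found in text."""
    pattern = STRICT_IPV4 if strict else NAIVE_IPV4
    return pattern.findall(text)


sample = "ok 192.168.0.1, bad 999.1.1.1, version 1.2.3.4.5"
print(find_ipv4(sample, strict=False))  # ['192.168.0.1', '999.1.1.1', '1.2.3.4']
print(find_ipv4(sample))                # ['192.168.0.1']
```

If the input is known to be clean, the naive pattern may be good enough; on messy real-world text the stricter one avoids false positives at the cost of a longer, harder-to-read expression.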

The first six chapters are devoted to general topics – plenty of extended examples, discussion of the different flavours of regular expression engine, some history, coverage of efficiency issues and so on. These six chapters include material on regular expression support in Perl, Java, Python, Ruby, C#, VB.NET and PHP, as well as tools such as egrep and awk.

The final four chapters of the book are devoted to regular expression support in specific languages and platforms: Perl, Java, .NET and PHP. In each case there is a combination of more worked examples and detailed discussion based on the material covered in the first six chapters. For developers using any of these languages this material is a bonus that consolidates the core understanding that Friedl establishes in the earlier sections of the book.

As well as the clear writing style, the book also scores well in terms of overall design. The typography stands out for its clarity when displaying regular expressions and the text that they match – it’s always clear what’s what, with the different components clearly marked out. Another good design feature is that the questions Friedl poses the reader always have a solution on the next page – you don’t ever see the answer before you’ve had a chance to ponder the question, and neither do you have to flick to the end of the book. It makes answering the questions a more natural part of the reading process.

Mastering Regular Expressions

Verdict: Highly recommended to anyone wanting to really get to grips with regular expressions.

Author: Jeffrey Friedl

Publisher: O’Reilly & Associates

ISBN: 0596528124

Media: Book

List Price: £31.99
