Chapter 7. Pattern Matching

This chapter treats an enormously important topic and the Python module that supports it. We’ll use the module’s features frequently in examples later in the book, and as you gain familiarity with them, you’ll use them more and more often. Some patience is required, though: the topic is quite technical, and you can’t absorb it all at once. You’ll need to come back to this chapter periodically and explore it further. Be ambitious in experimenting with it. No other chapter in the book calls for this kind of repeated study.

The mysterious topic is string pattern matching using regular expressions. The term “pattern matching” should be enough of a hint for you to realize how important this topic is for working with bioinformatics data. It’s also important for processing web pages and for many other kinds of work a bioinformatics programmer needs to do. Among other things, restriction enzyme binding sites can be expressed as regular expressions. A carefully constructed regular expression can take the place of many lines of string manipulation and looping. Once you’ve learned to use regular expressions, you’ll have a very powerful tool at your disposal.
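As a small taste of the idea that a binding site can be written as a regular expression, here is a minimal sketch using Python’s standard re module. It assumes a degenerate recognition site written in IUPAC codes as GTYRAC (Y is C or T, R is A or G); the site choice, the variable names, and the DNA sequence are made up for illustration and are not taken from the book’s own examples.

```python
import re

# Assumed recognition site GTYRAC, translated by hand into a regex:
# Y (C or T) becomes [CT], and R (A or G) becomes [AG].
site = re.compile(r"GT[CT][AG]AC")

# A made-up DNA sequence, used only for illustration.
sequence = "AAGTCGACTTGTTAACCCGTAGACGG"

# finditer yields one match object per non-overlapping occurrence.
matches = list(site.finditer(sequence))
positions = [m.start() for m in matches]
found = [m.group() for m in matches]

print(positions)  # [2, 10]
print(found)      # ['GTCGAC', 'GTTAAC']
```

The single pattern covers all four concrete sequences that GTYRAC stands for. Without a regular expression, the same search would need either four separate substring searches or an explicit loop that checks each position character by character.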

Note

If you have never encountered regular expressions before—or even if you’ve used only their most basic features—you should be aware that the topic is rather large, and learning it is not easy. More than most aspects of programming, learning to use regular expressions requires experimentation. Start by using the basics, and ...
