Strategies for Parsing Text in Python

There are a variety of ways to handle text processing in Python:

  • Built-in string objects

  • String module (and method) utilities

  • Regular expression matching

  • Parser-generator integrations

  • Hand-coded parsers

  • Running Python code with the eval and exec built-ins

For simpler tasks, Python’s built-in string object is often all we really need. Python strings can be indexed, concatenated, sliced, and processed both with built-in functions and with the string module. Our emphasis in this chapter, though, is on higher-level tools and techniques for analyzing textual information. Let’s briefly explore each of the other approaches above with representative examples.
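Before moving on, the basic string operations just mentioned can be sketched in a few lines. This is only an illustrative sketch, not code from the book: the sample text and variable names are made up for the example, and it assumes a Python 3 interpreter, where most of the older string module functions are available as string object methods instead.

```python
import string

# Hypothetical sample data, chosen only for illustration.
line = "spam,eggs,ham"

first = line[0]             # indexing: the first character
head = line[:4]             # slicing: the first four characters
joined = head + "!"         # concatenation builds a new string
fields = line.split(",")    # string method: split on a delimiter
count = len(fields)         # built-in function applied to the result

# The string module still offers a few utilities, such as capwords.
title = string.capwords("programming python")

print(first, head, joined, fields, count, title)
```

Strings are immutable, so each of these operations returns a new object rather than changing line in place.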

