Parser Generators

If you have any background in parsing theory, you may know that neither regular expressions nor string splitting is powerful enough to handle more complex language grammars (roughly, they lack the “memory” needed to keep track of arbitrarily nested constructs, which true grammars require). For more sophisticated language analysis tasks, we sometimes need a full-blown parser. Since Python is built for integrating C tools, we can write integrations to traditional parser generator systems such as yacc and bison. Better yet, we could use an integration that already exists.
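To make the “memory” point concrete, here is a minimal sketch of a hand-coded recursive-descent parser for nested arithmetic. It is not a parser generator and not taken from any of the tools named in this section; the grammar, the token pattern, and the names tokenize, parse_expr, parse_term, parse_factor, and evaluate are all assumptions made up for illustration. A regular expression is enough to split the input into tokens, but it is the recursive call in parse_factor that remembers how many parentheses are still open.

```python
import re

# Hypothetical token pattern for this sketch: integers, + - * / and parentheses.
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")

def tokenize(text):
    """Split text into tokens; a regular expression suffices at this level."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError("bad character at offset %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse_expr(tokens):
    """expr := term (('+' | '-') term)*"""
    value = parse_term(tokens)
    while tokens and tokens[0] in ("+", "-"):
        op = tokens.pop(0)
        right = parse_term(tokens)
        value = value + right if op == "+" else value - right
    return value

def parse_term(tokens):
    """term := factor (('*' | '/') factor)*"""
    value = parse_factor(tokens)
    while tokens and tokens[0] in ("*", "/"):
        op = tokens.pop(0)
        right = parse_factor(tokens)
        value = value * right if op == "*" else value / right
    return value

def parse_factor(tokens):
    """factor := NUMBER | '(' expr ')'"""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        # The recursive call is the "memory": each open parenthesis
        # is tracked by one pending call on the Python call stack.
        value = parse_expr(tokens)
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("expected ')'")
        return value
    if token.isdigit():
        return int(token)
    raise SyntaxError("unexpected token %r" % token)

def evaluate(text):
    """Tokenize and parse text, returning the computed value."""
    tokens = tokenize(text)
    value = parse_expr(tokens)
    if tokens:
        raise SyntaxError("unexpected token %r" % tokens[0])
    return value
```

Under these assumptions, evaluate("2 * (3 + 4)") returns 14, and an unbalanced input such as "(1 + 2" raises SyntaxError. A parser generator produces this kind of logic automatically from a grammar description, instead of requiring one hand-written function per rule.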

There are also Python-specific parsing systems accessible from Python’s web site. Among them, the kwParsing system, developed by Aaron Watters, is a parser generator written in Python, and the SPARK toolkit, developed by John Aycock, is a lightweight system that employs the Earley algorithm to work around technical problems with LALR parser generation (if you don’t know what that means, you probably don’t need to care). Since these are all complex tools, though, we’ll skip their details in this text. Consult http://www.python.org for information on parser generator tools available for use in Python programs.

