Although toolkits abound in this domain, Python’s status as a general-purpose programming language also makes it a reasonable vehicle for writing hand-coded parsers for custom language analysis tasks. For instance, recursive descent parsing is a fairly well-known technique for analyzing language-based information. Though not as powerful as some language tools, recursive descent parses are sufficient for a wide variety of language-related goals.
To illustrate, this section develops a custom parser for a simple grammar—it parses and evaluates arithmetic expression strings. Though language analysis is the main topic here, this example also demonstrates the utility of Python as a general-purpose programming language. Although Python is often used as a frontend or rapid development language in tactical modes, it’s also often useful for the kinds of strategic work you may have formerly done in a systems development language such as C or C++.
The grammar that our parser will recognize can be described as follows:
goal -> <expr> END [number, variable, ( ] goal -> <assign> END [set] assign -> 'set' <variable> <expr> [set] expr -> <factor> <expr-tail> [number, variable, ( ] expr-tail -> ^ [END, ) ] expr-tail -> '+' <factor> <expr-tail> [+] expr-tail -> '-' <factor> <expr-tail> [-] factor -> <term> <factor-tail> [number, variable, ( ] factor-tail -> ^ [+, -, END, ) ] factor-tail -> '*' <term> <factor-tail> [*] factor-tail -> '/' <term> <factor-tail> ...