9.12. Summary

Tokenizing text lets you simplify grammars so that they define patterns of tokens instead of patterns of individual characters. A tokenizer must have a default state along with a set of other states to enter, depending on the next character to consume. Once entered, a tokenizing state needs to arrange to consume and return one token, although it can delegate this task to another state. You can customize which state a tokenizer enters given an initial character, and you can customize how a state builds a token. You can also create your own tokenizing states, so you have a great deal of freedom in customizing a tokenizer to meet the needs of your language.
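The ideas summarized above can be sketched as a small, self-contained tokenizer. This is a minimal illustration, not the classes the chapter develops: the names `MiniTokenizer`, `State`, `WORD`, `NUMBER`, `SYMBOL`, `WHITESPACE`, and `setState` are hypothetical, and tokens are plain strings for brevity. It shows a default state, a table that picks a state from the next character, and a whitespace state that delegates the job of returning a token.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a state-driven tokenizer; names are illustrative only.
class MiniTokenizer {
    // A state consumes characters starting at pos[0] and returns one token,
    // or null if the input ends before a token is found.
    interface State {
        String nextToken(String text, int[] pos, MiniTokenizer owner);
    }

    // Consumes the first character, then any following letters or digits.
    static final State WORD = (text, pos, owner) -> {
        int start = pos[0]++;
        while (pos[0] < text.length() && Character.isLetterOrDigit(text.charAt(pos[0]))) pos[0]++;
        return text.substring(start, pos[0]);
    };

    // Consumes the first character, then any following digits.
    static final State NUMBER = (text, pos, owner) -> {
        int start = pos[0]++;
        while (pos[0] < text.length() && Character.isDigit(text.charAt(pos[0]))) pos[0]++;
        return text.substring(start, pos[0]);
    };

    // Returns a single character as a token.
    static final State SYMBOL = (text, pos, owner) -> String.valueOf(text.charAt(pos[0]++));

    // Skips whitespace, then delegates to the tokenizer to produce the token.
    static final State WHITESPACE = (text, pos, owner) -> {
        while (pos[0] < text.length() && Character.isWhitespace(text.charAt(pos[0]))) pos[0]++;
        return owner.nextToken(text, pos);
    };

    private final Map<Character, State> table = new HashMap<>();
    private final State defaultState = SYMBOL; // used when no table entry matches

    MiniTokenizer() {
        setState('a', 'z', WORD);
        setState('A', 'Z', WORD);
        setState('0', '9', NUMBER);
        for (char c : new char[] {' ', '\t', '\n', '\r'}) setState(c, c, WHITESPACE);
    }

    // Customization point: choose which state handles an initial character range.
    void setState(int from, int to, State state) {
        for (int c = from; c <= to; c++) table.put((char) c, state);
    }

    // Picks a state from the next character and lets it build one token.
    String nextToken(String text, int[] pos) {
        if (pos[0] >= text.length()) return null;
        State state = table.getOrDefault(text.charAt(pos[0]), defaultState);
        return state.nextToken(text, pos, this);
    }

    List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int[] pos = {0};
        for (String tok = nextToken(text, pos); tok != null; tok = nextToken(text, pos)) {
            tokens.add(tok);
        }
        return tokens;
    }
}
```

With the default table, `new MiniTokenizer().tokenize("x1 = 42+y")` yields the tokens `x1`, `=`, `42`, `+`, and `y`. Calling `setState('0', '9', MiniTokenizer.WORD)` first would make a leading digit start a word instead, so `42abc` would come back as one token rather than two. Passing your own `State` implementation to `setState` corresponds to creating a custom tokenizing state.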
