9.3. Tokenizers in Standard Java

The standard Java libraries include two tokenizers: StringTokenizer in java.util and StreamTokenizer in java.io.

The StringTokenizer class does not parse numbers, and it allows little customization. This tokenizer is suitable only for simple tokenization, and this book does not discuss it further.
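As a minimal sketch of what "simple tokenization" means here, the following splits a string on the default whitespace delimiters. The class name StringTokenizerSketch and the helper method tokens are hypothetical names chosen for illustration, and the code uses generics, which postdate the Java 1.1 era. Note that a numeral such as 3.14 comes back as an ordinary String, not as a numeric value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; not part of the book's code.
final class StringTokenizerSketch {

    // Collects every token that StringTokenizer yields using its
    // default delimiters (space, tab, newline, carriage return, form feed).
    static List<String> tokens(String text) {
        List<String> result = new ArrayList<>();
        StringTokenizer t = new StringTokenizer(text);
        while (t.hasMoreTokens()) {
            result.add(t.nextToken()); // always a String, even for "3.14"
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [price, 3.14, <=, 42]
        System.out.println(tokens("price 3.14 <= 42"));
    }
}
```

The caller would have to convert "3.14" with something like Double.parseDouble, because the tokenizer itself makes no distinction between words, numbers, and symbols.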

The StreamTokenizer class is more customizable than StringTokenizer but lacks some desirable features. In particular, StreamTokenizer in Java 1.1.7 does not provide

  • A Token class to encapsulate token results

  • Customization of how to recognize numbers

  • The ability to define new token types

  • Differentiation of allowable characters for the start of a word from allowable characters within a word

  • Handling of multicharacter symbols ...
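Two of the gaps listed above, the absence of a Token class and the handling of multicharacter symbols, can be illustrated with a short sketch that runs StreamTokenizer in its default configuration. The class name StreamTokenizerSketch and the method describe are hypothetical, and the code uses generics from later Java versions. Results arrive through the public fields ttype, sval, and nval rather than a token object, and the symbol <= is reported as two separate single-character tokens.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the book's code.
final class StreamTokenizerSketch {

    // Returns a readable description of each token that a
    // default-configured StreamTokenizer finds in the text.
    static List<String> describe(String text) {
        StreamTokenizer t = new StreamTokenizer(new StringReader(text));
        List<String> result = new ArrayList<>();
        try {
            while (t.nextToken() != StreamTokenizer.TT_EOF) {
                switch (t.ttype) {
                    case StreamTokenizer.TT_NUMBER:
                        result.add("NUMBER " + t.nval); // nval is a double
                        break;
                    case StreamTokenizer.TT_WORD:
                        result.add("WORD " + t.sval);
                        break;
                    default:
                        // Ordinary characters are returned one at a time.
                        result.add("CHAR " + (char) t.ttype);
                }
            }
        } catch (IOException e) {
            // Not expected when reading from an in-memory string.
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [WORD x, CHAR <, CHAR =, NUMBER 42.0]
        System.out.println(describe("x <= 42"));
    }
}
```

A parser that needs <= as a single symbol would have to recombine the two character tokens itself, which is the kind of work a more customizable tokenizer can take over.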
