Tokenizing a Character Stream

Example 3-6 was a Reader implementation wrapped around another Reader. ReaderTokenizer (Example 3-7) is a Tokenizer implementation wrapped around a Reader. The Tokenizer interface was shown in Example 2-8, and the ReaderTokenizer class shown here is a subclass of the AbstractTokenizer class of Example 2-9.

As its name implies, ReaderTokenizer tokenizes the text it reads from a Reader stream. The class implements the abstract createBuffer( ) and fillBuffer( ) methods of its superclass. You may want to reread Example 2-9 to refresh your memory of how AbstractTokenizer interacts with these two methods in its subclasses.
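
The division of labor between a buffer-owning superclass and a Reader-backed subclass can be sketched roughly as follows. This is a minimal illustration, not the book's code: BufferedScanner, ReaderScanner, BufferDemo, and the exact signatures of createBuffer( ) and fillBuffer( ) used here are assumptions made for the example, and the real AbstractTokenizer contract is the one given in Example 2-9. The sketch also assumes Java 8 or later.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a buffer-owning superclass (not AbstractTokenizer).
abstract class BufferedScanner {
    protected char[] buffer;  // characters available for scanning
    protected int numChars;   // number of valid characters in buffer
    private int pos;          // index of the next unread character

    // Subclasses allocate the buffer (assumed signature).
    protected abstract void createBuffer(int size);

    // Subclasses refill the buffer; false means end of input (assumed signature).
    protected abstract boolean fillBuffer() throws IOException;

    /** Returns the next character, or -1 at end of input. */
    int nextChar() throws IOException {
        if (buffer == null) createBuffer(4);  // tiny buffer to force several refills
        if (pos >= numChars) {
            pos = 0;
            numChars = 0;
            if (!fillBuffer()) return -1;
        }
        return buffer[pos++];
    }
}

// Hypothetical subclass that supplies characters from a java.io.Reader.
class ReaderScanner extends BufferedScanner {
    private final Reader in;

    ReaderScanner(Reader in) { this.in = in; }

    @Override protected void createBuffer(int size) { buffer = new char[size]; }

    @Override protected boolean fillBuffer() throws IOException {
        // With a nonzero length, read() returns at least one char or -1 at end of stream.
        int n = in.read(buffer, 0, buffer.length);
        if (n <= 0) return false;
        numChars = n;
        return true;
    }
}

class BufferDemo {
    /** Pulls every character through the scanner and returns them as a String. */
    static String readAll(String text) {
        BufferedScanner s = new ReaderScanner(new StringReader(text));
        StringBuilder sb = new StringBuilder();
        try {
            for (int c; (c = s.nextChar()) != -1; ) sb.append((char) c);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(readAll("hello, tokenizer"));
    }
}
```

The superclass decides when more input is needed, and the subclass decides where it comes from. In this sketch, that split is what would let the same scanning logic run over a Reader, a String, or some other character source.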

Example 3-7 includes an inner class named Test that reads and tokenizes characters from a FileReader, listing the tokens it reads on the standard output. It also writes the text of each token to a FileWriter, producing a copy of the input file. The copy demonstrates that the tokenizer accounts for every character of the input file, provided that it is not configured to discard spaces. Like its superclass, ReaderTokenizer uses the assert keyword, so it must be compiled with the -source 1.4 option to javac.
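
The round-trip check performed by Test can be illustrated with a small, self-contained sketch. It does not use ReaderTokenizer or the Tokenizer interface: RoundTripDemo and its methods are hypothetical names, the three-way split into words, whitespace, and single punctuation characters is an assumption made for the example, and in-memory StringReader and StringWriter objects stand in for the FileReader and FileWriter. It assumes Java 8 or later.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class RoundTripDemo {
    /** Splits input into runs of letters/digits, runs of whitespace, and single other characters. */
    static List<String> tokenize(Reader in) throws IOException {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int kind = -1;  // kind of the token being built; -1 means none yet
        int c;
        while ((c = in.read()) != -1) {
            int k = classify((char) c);
            // A change of kind ends the current token; punctuation (kind 2)
            // always forms a one-character token.
            if (current.length() > 0 && (k != kind || k == 2)) {
                tokens.add(current.toString());
                current.setLength(0);
            }
            current.append((char) c);
            kind = k;
        }
        if (current.length() > 0) tokens.add(current.toString());
        return tokens;
    }

    private static int classify(char c) {
        if (Character.isLetterOrDigit(c)) return 0;  // word or number
        if (Character.isWhitespace(c)) return 1;     // space, kept as a token
        return 2;                                    // anything else
    }

    /** Convenience wrapper that tokenizes an in-memory string. */
    static List<String> tokensOf(String text) {
        try (Reader in = new StringReader(text)) {
            return tokenize(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes the text of every token to a Writer and returns the resulting copy. */
    static String copyViaTokens(String text) {
        StringWriter out = new StringWriter();
        for (String token : tokensOf(text)) out.write(token);
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "x = 42;\n  y = x + 1;\n";
        String copy = copyViaTokens(input);
        // Because no token is discarded, the copy is identical to the input.
        System.out.println(copy.equals(input) ? "copy matches input" : "copy differs");
    }
}
```

If the classify step dropped whitespace instead of emitting it as a token, the copy would no longer match the input, which is the caveat about discarding spaces noted above.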

Example 3-7. ReaderTokenizer.java

package je3.io;
import je3.classes.Tokenizer;
import je3.classes.AbstractTokenizer;
import java.io.*;

/**
 * This Tokenizer implementation extends AbstractTokenizer to tokenize a stream
 * of text read from a java.io.Reader. It implements the createBuffer( ) and
 * fillBuffer( ) methods required by AbstractTokenizer. ...
