Tokenizing Byte Buffers

In this section we present an abstract Tokenizer implementation (see Example 2-8) for tokenizing the contents of a ByteBuffer. The subsections that follow include concrete subclasses for tokenizing characters read from a memory-mapped file or an arbitrary Channel. The ByteBufferTokenizer class in Example 6-6 extends the AbstractTokenizer class of Example 2-9. You may want to reread that example before starting this one.

As you recall, the AbstractTokenizer class has abstract methods it calls when it needs more characters to tokenize. The ByteBufferTokenizer class implements these methods to get more characters by using a CharsetDecoder to decode bytes from a ByteBuffer, but it defines and calls new abstract methods when it needs to get more bytes into the ByteBuffer.
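The layering described above can be sketched in miniature. This is not the code of Example 6-6: the class and method names (CharConsumer, ByteBackedConsumer, ArraySource, fillChars, fillBytes) are hypothetical stand-ins, the buffer size of 8 bytes is arbitrary, and decoding errors are simply replaced. The sketch only illustrates the shape of the design: one layer implements the "give me more characters" hook by decoding, and in turn declares a new abstract "give me more bytes" hook for its own subclasses.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

public class LayeredSourceSketch {
    /** Hypothetical stand-in for a character-level base class. */
    public static abstract class CharConsumer {
        /** Append more characters to sink; return false once the source is exhausted. */
        protected abstract boolean fillChars(StringBuilder sink);

        public String readAll() {
            StringBuilder sb = new StringBuilder();
            while (fillChars(sb)) { /* keep asking for characters */ }
            return sb.toString();
        }
    }

    /** Byte-level layer: implements fillChars() by decoding, and defers the byte supply. */
    public static abstract class ByteBackedConsumer extends CharConsumer {
        private final ByteBuffer bytes = ByteBuffer.allocate(8); // tiny, to force refills
        private final CharBuffer chars;
        private final CharsetDecoder decoder;
        private boolean done = false;

        protected ByteBackedConsumer(Charset charset) {
            decoder = charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
            // Large enough that a single decode() call cannot overflow it.
            chars = CharBuffer.allocate(
                (int) Math.ceil(bytes.capacity() * decoder.maxCharsPerByte()));
        }

        /** New abstract hook: put more bytes into buffer; return false when none remain. */
        protected abstract boolean fillBytes(ByteBuffer buffer);

        @Override protected boolean fillChars(StringBuilder sink) {
            if (done) return false;
            boolean more = fillBytes(bytes);     // subclass writes into the buffer
            bytes.flip();                        // switch the buffer to reading
            decoder.decode(bytes, chars, !more); // last argument signals end of input
            drain(sink);
            if (!more) {                         // emit any state the decoder still holds
                decoder.flush(chars);
                drain(sink);
                done = true;
            }
            bytes.compact();                     // keep undecoded trailing bytes
            return !done;
        }

        private void drain(StringBuilder sink) {
            chars.flip();
            sink.append(chars);
            chars.clear();
        }
    }

    /** Concrete leaf: supplies bytes from an in-memory array. */
    public static class ArraySource extends ByteBackedConsumer {
        private final byte[] data;
        private int pos = 0;

        public ArraySource(byte[] data, Charset charset) {
            super(charset);
            this.data = data;
        }

        @Override protected boolean fillBytes(ByteBuffer buffer) {
            int n = Math.min(buffer.remaining(), data.length - pos);
            buffer.put(data, pos, n);
            pos += n;
            return pos < data.length;
        }
    }
}
```

A concrete subclass that reads from a Channel or a memory-mapped file would, under the same assumptions, only need to supply its own version of the byte-level hook; the decoding logic stays in the middle layer.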

As with AbstractTokenizer, the code for this class is a little dense; it is intended as a moderately advanced example. The most interesting thing to note about this example is the use of the CharsetDecoder. Notice how it is obtained from the Charset object, how its error behavior is configured, how the decode( ) method is called, and how the return values of that method are handled. It is useful to compare the use of the CharsetDecoder in this example with the decoding loop of Example 6-5.
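As a reading aid for the points listed above, here is a minimal, self-contained sketch of the general CharsetDecoder pattern. It is not taken from Example 6-6: the class name DecoderLoopSketch, the method decodeInChunks, the choice of CodingErrorAction.REPLACE, and the buffer sizes are all assumptions made for illustration. It shows a decoder obtained from a Charset, its error behavior configured, decode( ) called repeatedly as bytes arrive in pieces, and the CoderResult return value inspected after each call.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;

public class DecoderLoopSketch {
    /** Decode data as if it arrived chunkSize bytes at a time. */
    public static String decodeInChunks(byte[] data, Charset charset, int chunkSize) {
        if (chunkSize < 1) throw new IllegalArgumentException("chunkSize must be positive");

        // Obtain the decoder from the Charset and configure its error behavior.
        // REPLACE is an assumption here; REPORT (the default) and IGNORE also exist.
        CharsetDecoder decoder = charset.newDecoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE);

        ByteBuffer in = ByteBuffer.allocate(chunkSize + 8); // slack for leftover bytes
        CharBuffer out = CharBuffer.allocate(16);           // small, to exercise overflow
        StringBuilder result = new StringBuilder();
        int pos = 0;
        boolean endOfInput;

        do {
            // Simulate the arrival of the next chunk of bytes.
            int n = Math.min(Math.min(chunkSize, data.length - pos), in.remaining());
            in.put(data, pos, n);
            pos += n;
            endOfInput = (pos == data.length);

            in.flip(); // prepare the byte buffer for reading by the decoder
            CoderResult r;
            do {
                r = decoder.decode(in, out, endOfInput);
                if (r.isError()) {
                    // Not expected with REPLACE; under REPORT this is where
                    // malformed or unmappable input would be surfaced.
                    throw new IllegalStateException("undecodable input: " + r);
                }
                drain(out, result);    // make room; needed when r.isOverflow()
            } while (r.isOverflow());  // underflow means: supply more bytes
            in.compact(); // keep the bytes of any incomplete trailing character
        } while (!endOfInput);

        // After the final decode() call, flush any state held by the decoder.
        while (decoder.flush(out).isOverflow()) {
            drain(out, result);
        }
        drain(out, result);
        return result.toString();
    }

    private static void drain(CharBuffer out, StringBuilder sink) {
        out.flip();
        sink.append(out);
        out.clear();
    }
}
```

Calling decodeInChunks with a chunk size of 1 is a convenient way to see why compact( ) matters: a multibyte character is then always split across calls, and the decoder reports underflow until the rest of its bytes arrive.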

Example 6-6. ByteBufferTokenizer.java

    package je3.nio;
    import java.nio.*;
    import java.nio.charset.*;
    import java.io.IOException;
    import je3.classes.AbstractTokenizer;

    /**
     * This is an abstract Tokenizer ...

