Obtaining TokenAttribute values

With a TokenStream in hand, we can look at how token values are retrieved. At a high level, a TokenStream is an enumeration of tokens. To access their values, we register one or more attribute objects with the TokenStream. Note that only one instance exists per attribute type. This is for performance: rather than creating new objects on each iteration, the same attribute instances are updated each time we advance to the next token.
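To make the single-instance idea concrete, here is a minimal plain-Java sketch of the pattern. It is a simplified model, not Lucene's implementation: the names AttributeReuseSketch, TermAttribute, WhitespaceStream, and collectTerms are hypothetical and exist only for illustration. In Lucene itself, the corresponding steps are obtaining an attribute with addAttribute(CharTermAttribute.class) and advancing with incrementToken().

```java
import java.util.ArrayList;
import java.util.List;

public class AttributeReuseSketch {

    // Hypothetical stand-in for an attribute such as CharTermAttribute:
    // a mutable holder for the current token's text.
    public static final class TermAttribute {
        private String term = "";

        void set(String value) { term = value; }

        @Override
        public String toString() { return term; }
    }

    // Hypothetical stand-in for a TokenStream that splits on whitespace.
    public static final class WhitespaceStream {
        private final String[] words;
        private int position = 0;
        // The one and only attribute instance, reused for every token.
        private final TermAttribute termAttribute = new TermAttribute();

        public WhitespaceStream(String text) {
            String trimmed = text.trim();
            words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        }

        // Always hands back the same instance, never a fresh object.
        public TermAttribute addAttribute() { return termAttribute; }

        // Advances to the next token by overwriting the shared attribute.
        public boolean incrementToken() {
            if (position >= words.length) {
                return false;
            }
            termAttribute.set(words[position++]);
            return true;
        }
    }

    // Obtains the attribute once, then reads it after each increment.
    public static List<String> collectTerms(String text) {
        WhitespaceStream stream = new WhitespaceStream(text);
        TermAttribute term = stream.addAttribute();
        List<String> terms = new ArrayList<>();
        while (stream.incrementToken()) {
            // Copy the value out: the attribute is overwritten on the next call.
            terms.add(term.toString());
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(collectTerms("Lucene is fast")); // prints [Lucene, is, fast]
    }
}
```

The practical consequence of this design is that a reference to the attribute is only valid for the current token. Any value we want to keep must be copied out before the next increment, as collectTerms does with toString().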

Getting ready

There are several types of attributes; each type exposes a different aspect, or piece of metadata, of a token. The following list describes the token attribute interfaces we will review in this section:

  • CharTermAttribute: This exposes a token's actual textual value, ...
