Filters

Like tokenizers, filters consume input and produce a stream of tokens, but the role of a filter is somewhat different. Where a tokenizer reads raw characters, a filter receives tokens (passed on by the tokenizer or by a preceding filter), looks at each one in turn, and decides whether to keep it, change or replace it, or discard it. Filters also derive from org.apache.lucene.analysis.TokenStream.
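The keep, change, or discard decision can be illustrated with a minimal plain-Java sketch. It deliberately does not use the Lucene TokenStream API: the class TokenFilterSketch, the Filter interface, the whitespace tokenizer, and the three-word stop list are all hypothetical names and values chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;

// Hypothetical stand-in for an analyzer chain; NOT the Lucene TokenStream API.
public class TokenFilterSketch {

    // A filter inspects one token and returns it (kept or changed),
    // or an empty Optional to signal that the token is discarded.
    interface Filter extends Function<String, Optional<String>> {}

    // Changes every token by lower-casing it.
    static final Filter LOWER_CASE = t -> Optional.of(t.toLowerCase(Locale.ROOT));

    // Illustrative stop-word set (an assumption, not Solr's default list).
    static final Set<String> STOP_WORDS = Set.of("the", "a", "an");

    // Discards stop words and keeps every other token unchanged.
    static final Filter STOP =
            t -> STOP_WORDS.contains(t) ? Optional.empty() : Optional.of(t);

    // Crude whitespace tokenizer standing in for a real tokenizer.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Passes each token through the filters in the order given.
    static List<String> analyze(String text, List<Filter> filters) {
        List<String> out = new ArrayList<>();
        for (String token : tokenize(text)) {
            Optional<String> current = Optional.of(token);
            for (Filter f : filters) {
                // Once a token has been discarded, later filters never see it.
                current = current.flatMap(f);
            }
            current.ifPresent(out::add);
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints [quick, brown, fox]
        System.out.println(analyze("The Quick brown Fox", List.of(LOWER_CASE, STOP)));
    }
}
```

Order matters in this sketch: with STOP placed before LOWER_CASE, the capitalized "The" would not match the lower-case stop list and would survive.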

A typical filter configuration, placed inside the analyzer of a field type, looks something like this:

<fieldType name="text" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Filters are configured in schema.xml with ...
