Implementation of FST in Lucene

The algorithm used for implementing an FST in Lucene is based on the paper Direct Construction of Minimal Acyclic Subsequential Transducers published by Stoyan Mihov and Denis Maurel. This algorithm can be used to build a minimal acyclic sub-sequential transducer (a type of FST) that represents a finite relation, given a sorted list of input words and their outputs. As this algorithm constructs the minimal transducer directly, it has better efficiency than other algorithms. It is the perfect fit for a Lucene FST, as all the terms in the Lucene index are stored in a sorted order.

Note

The following paper was referred to for building the algorithm: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.24.3698.

An FST ...

Get Apache Solr Search Patterns now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.