FSTCompiler
Builds a minimal FST (which maps an IntsRef term to an arbitrary output) from pre-sorted terms with outputs. The FST becomes an FSA if you use NoOutputs. The FST is written on the fly into a compact, serialized byte array, which can be saved to or loaded from a Directory, or used directly for traversal. The FST is always finite (no cycles).
NOTE: The algorithm is described at http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.24.3698
The parameterized type T is the output type. See the subclasses of Outputs.
FSTs larger than 2.1GB are possible as of Lucene 4.2. FSTs containing more than 2.1B nodes are also possible; however, they cannot be packed.
It now supports 3 different workflows:
Build the FST, use it immediately and entirely in RAM, then discard it.
Build the FST, use it immediately and entirely in RAM, and also save it to another DataOutput so that it can be loaded and used later.
Build the FST but stream it immediately to disk (except the FSTMetaData, which is saved at the end). To use it, you need to construct the corresponding DataInput and use the FST constructor to read it.
Types
Fluent-style builder for constructing an FSTCompiler.
Reusable buffer for building nodes with fixed length arcs (binary search or direct addressing).
Expert: holds a pending (seen but not yet serialized) Node.
Properties
Functions
Add the next input/output pair. The provided input must sort after the previous one according to IntsRef.compareTo. It is also OK to add the same input twice in a row with different outputs, as long as Outputs implements the Outputs.merge method. Note that the input is fully consumed by the time this method returns (so the caller is free to reuse it), but the output is not. So if your outputs are mutable (e.g. ByteSequenceOutputs or IntSequenceOutputs), you cannot reuse an output instance across calls.
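The ordering requirement above can be checked before any terms are handed to the compiler. The following is a minimal sketch, not Lucene code: plain int[] arrays stand in for IntsRef inputs, and the names SortedInputSketch, toInts, compareInputs and firstOutOfOrder are hypothetical. It assumes that IntsRef.compareTo orders sequences element by element, with a strict prefix sorting before the longer sequence, which is also what java.util.Arrays.compare does for int arrays.

```java
// Illustrative stand-in only: int[] arrays play the role of IntsRef inputs,
// and no Lucene classes are used.
final class SortedInputSketch {

    // Converts a term to its Unicode code points (one hypothetical way to derive inputs).
    static int[] toInts(String term) {
        return term.codePoints().toArray();
    }

    // Element-by-element comparison; a strict prefix sorts before the longer sequence.
    // Assumption: this mirrors the ordering used by IntsRef.compareTo.
    static int compareInputs(int[] a, int[] b) {
        return java.util.Arrays.compare(a, b);
    }

    // Returns the index of the first input that sorts before its predecessor,
    // or -1 if the whole list is in an acceptable order.
    // Equal neighbours are accepted, matching the "same input twice in a row" case.
    static int firstOutOfOrder(java.util.List<int[]> inputs) {
        for (int i = 1; i < inputs.size(); i++) {
            if (compareInputs(inputs.get(i - 1), inputs.get(i)) > 0) {
                return i;
            }
        }
        return -1;
    }
}
```

Running such a check over the term list first reports which entry is misplaced. Whether repeated inputs are actually usable still depends on the Outputs implementation supporting merge.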
Returns the metadata of the final FST. NOTE: this returns null if the FST itself accepts nothing.
Get the FSTReader corresponding to the DataOutput. To call this method, you need to use either the default DataOutput or the one obtained from getOnHeapReaderWriter; otherwise an exception is thrown.