FST

class FST<T> : Accountable

Represents a finite state machine (FST), using a compact byte[] format.

The format is similar to what's used by Morfologik (https://github.com/morfologik/morfologik-stemming).

See the package documentation for some simple examples.

Constructors

constructor(metadata: FST.FSTMetadata<T>, in: DataInput)

Loads a previously saved FST, reading it from a DataInput and using the given metadata. The FST is held in an OnHeapFSTStore with maxBlockBits set to DEFAULT_MAX_BLOCK_BITS.

Types

class Arc<T>

Represents a single arc.

abstract class BytesReader : DataInput

Reads bytes stored in an FST.

object Companion
class FSTMetadata<T>(val inputType: FST.INPUT_TYPE, val outputs: Outputs<T>, var emptyOutput: T?, startNode: Long, version: Int, numBytes: Long)

Represents the FST metadata.


Specifies the allowed range of each int input label for this FST.

Properties


Returns nested resources of this class. The result should be a point-in-time snapshot (to avoid race conditions).


Functions

fun findTargetArc(labelToMatch: Int, follow: FST.Arc<T>, arc: FST.Arc<T>, in: FST.BytesReader): FST.Arc<T>?

Finds the arc labeled labelToMatch that leaves the target node of the follow arc, writing the result into arc in place. Returns null if no such arc was found; otherwise returns the arc that was passed in.
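The in-place contract can be illustrated with a small toy model. This is only a sketch: ToyArc, ToyFst and hasPath are hypothetical names invented for this example, and the list-based node storage bears no relation to the compact byte[] format used by the real class.

```kotlin
// Toy model only: these types are hypothetical and are not the library's FST.Arc.
class ToyArc(var label: Int = 0, var target: Int = -1)

// Each node is a list of (label, target node) pairs; node ids index into `nodes`.
class ToyFst(private val nodes: List<List<Pair<Int, Int>>>) {

    // Looks for an arc labeled labelToMatch leaving follow's target node.
    // On success the match is written into `arc` and that same instance is
    // returned; on failure `arc` is left untouched and null is returned.
    fun findTargetArc(labelToMatch: Int, follow: ToyArc, arc: ToyArc): ToyArc? {
        val outgoing = nodes.getOrNull(follow.target) ?: return null
        for ((label, target) in outgoing) {
            if (label == labelToMatch) {
                arc.label = label
                arc.target = target
                return arc
            }
        }
        return null
    }

    // Walks the input one label at a time, reusing a single arc instance.
    fun hasPath(input: String, startNode: Int = 0): Boolean {
        // This arc plays the role of the virtual start arc in this toy model.
        val arc = ToyArc(target = startNode)
        for (ch in input) {
            // Passing the same object as `follow` and `arc` works here because
            // follow.target is read before arc is overwritten.
            if (findTargetArc(ch.code, arc, arc) == null) return false
        }
        return true
    }
}
```

Reusing one arc object across the whole walk avoids allocating an arc per step, which is the usual motivation for an in-place API of this shape.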


Returns a BytesReader for this FST, positioned at position 0.


Fills the virtual 'start' arc, i.e., an empty incoming arc to the FST's start node.


Returns whether arc's target points to a node in expanded format (fixed length arcs).

fun numBytes(): Long
open override fun ramBytesUsed(): Long

Returns the memory usage of this object in bytes. Negative values are illegal.

fun readArcByContinuous(arc: FST.Arc<T>, in: FST.BytesReader, rangeIndex: Int): FST.Arc<T>

Reads an arc of a continuous node, at the provided index in the label range.
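As a rough illustration of what an index in the label range means, the sketch below assumes that a continuous node has an arc for every label in a contiguous range, so a range index maps directly to a label and a slot. ContinuousNode and its members are hypothetical names for this example, not part of the API, and the real on-disk layout may differ.

```kotlin
// Toy model (assumption): a "continuous" node has one arc for every label in
// [firstLabel, firstLabel + numArcs - 1], stored in label order.
class ContinuousNode(val firstLabel: Int, private val targets: IntArray) {
    val numArcs: Int get() = targets.size

    // The label of the arc at rangeIndex is just an offset from firstLabel.
    fun labelAt(rangeIndex: Int): Int {
        require(rangeIndex in 0 until numArcs) { "rangeIndex out of range" }
        return firstLabel + rangeIndex
    }

    // Because no label in the range is missing, rangeIndex is also the slot index.
    fun targetAt(rangeIndex: Int): Int {
        require(rangeIndex in 0 until numArcs) { "rangeIndex out of range" }
        return targets[rangeIndex]
    }

    // Maps a label back to its range index, or null if it falls outside the range.
    fun rangeIndexOf(label: Int): Int? =
        (label - firstLabel).takeIf { it in 0 until numArcs }
}
```

In this model the last arc of the node is simply the one at range index numArcs - 1.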


Reads an arc that is present in a direct addressing node, at the provided index in the label range.
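The word "present" matters because a direct addressing node can have gaps in its label range. The sketch below assumes a presence bitmap with one bit per label in the range, and arcs stored only for the set bits. DirectAddressingNode and its members are hypothetical names for this example, and the 64-label limit is a simplification of this toy model, not a property of the real format.

```kotlin
// Toy model (assumption): bit i of `presence` says whether the label
// firstLabel + i has an arc; `targets` holds one entry per set bit, in order.
class DirectAddressingNode(
    val firstLabel: Int,
    private val presence: Long,
    private val targets: IntArray
) {
    init {
        require(presence != 0L) { "a node needs at least one arc" }
        require(presence.countOneBits() == targets.size) { "one target per present label" }
    }

    fun isPresent(rangeIndex: Int): Boolean =
        rangeIndex in 0..63 && ((presence ushr rangeIndex) and 1L) == 1L

    // Reads the arc at rangeIndex. The caller must ensure the arc is present.
    fun targetAt(rangeIndex: Int): Int {
        require(isPresent(rangeIndex)) { "no arc at rangeIndex $rangeIndex" }
        // The slot is the number of present arcs that come before rangeIndex.
        val bitsBelow = presence and ((1L shl rangeIndex) - 1)
        return targets[bitsBelow.countOneBits()]
    }

    // Range index of the last present arc: the position of the highest set bit.
    fun lastRangeIndex(): Int = 63 - presence.countLeadingZeroBits()
}
```

Counting the set bits below the index is what lets the node skip storage for absent labels while still addressing arcs by label offset.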

fun readFirstRealTargetArc(nodeAddress: Long, arc: FST.Arc<T>, in: FST.BytesReader): FST.Arc<T>

Follows the follow arc and reads the first arc of its target; this changes the provided arc (2nd arg) in place and returns it.


Reads one BYTE1/2/4 label from the provided DataInput.
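A minimal sketch of label decoding by input type follows. It assumes that BYTE1 is one unsigned byte, BYTE2 is an unsigned 16-bit value in little-endian order, and BYTE4 is a variable-length int with 7 payload bits per byte; the encoding actually used may differ, for example by format version. ToyInputType, ByteCursor and this readLabel function are hypothetical stand-ins, not the library's types.

```kotlin
// Hypothetical stand-in for the input type enumeration.
enum class ToyInputType { BYTE1, BYTE2, BYTE4 }

// Hypothetical minimal reader over a byte array.
class ByteCursor(private val bytes: ByteArray, var pos: Int = 0) {
    // Returns the next byte as an unsigned value in 0..255.
    fun readByte(): Int = bytes[pos++].toInt() and 0xFF
}

fun readLabel(type: ToyInputType, input: ByteCursor): Int = when (type) {
    // One unsigned byte.
    ToyInputType.BYTE1 -> input.readByte()
    // Two bytes, low byte first (assumed little-endian).
    ToyInputType.BYTE2 -> input.readByte() or (input.readByte() shl 8)
    // Variable-length int (assumed): 7 payload bits per byte,
    // high bit set on every byte except the last.
    ToyInputType.BYTE4 -> {
        var value = 0
        var shift = 0
        while (true) {
            val b = input.readByte()
            value = value or ((b and 0x7F) shl shift)
            if ((b and 0x80) == 0) break
            shift += 7
        }
        value
    }
}
```

Masking with 0xFF is needed because Kotlin bytes are signed; without it, labels above 127 would come back negative.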


Reads the last arc of a continuous node.


Reads the last arc of a direct addressing node. This method is equivalent to calling readArcByDirectAddressing with rangeIndex equal to arc.numArcs() - 1, but it is faster.


Follows the follow arc and reads the last arc of its target; this changes the provided arc (2nd arg) in-place and returns it.


In-place read; returns the arc.


Peeks at next arc's label; does not alter arc. Do not call this if arc.isLast()!


Never returns null, but you should never call this if arc.isLast() is true.
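The isLast() precondition shared by the peek and read-next operations suggests a loop of the following shape. This is a toy sketch: ToyNodeArcs, readNext and collectLabels are hypothetical names, and the cursor over an IntArray only mimics the "advance in place, return the same object" style described above.

```kotlin
// Toy cursor over the arcs of one node; it advances in place like the API above.
class ToyNodeArcs(private val labels: IntArray) {
    init {
        require(labels.isNotEmpty()) { "a node needs at least one arc" }
    }

    private var index = 0

    // Label of the arc the cursor currently points at.
    val label: Int get() = labels[index]

    fun isLast(): Boolean = index == labels.size - 1

    // Looks at the next arc's label without moving the cursor.
    fun peekNextLabel(): Int {
        check(!isLast()) { "do not peek past the last arc" }
        return labels[index + 1]
    }

    // Moves to the next arc in place and returns this same cursor.
    fun readNext(): ToyNodeArcs {
        check(!isLast()) { "do not read past the last arc" }
        index++
        return this
    }
}

// Visits every arc of the node, testing isLast() before each advance.
fun collectLabels(arcs: ToyNodeArcs): List<Int> {
    val out = mutableListOf(arcs.label)
    while (!arcs.isLast()) {
        out += arcs.readNext().label
    }
    return out
}
```

Testing isLast() before each advance is what keeps the loop within the documented precondition.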

fun save(path: Path)

Writes an automaton to a file.

fun save(metaOut: DataOutput, out: DataOutput)

Saves the FST to the given DataOutputs: the metadata is written to metaOut and the FST bytes to out.

open override fun toString(): String