ShingleAnalyzerWrapper

constructor(defaultAnalyzer: Analyzer)
constructor(defaultAnalyzer: Analyzer, maxShingleSize: Int)
constructor(defaultAnalyzer: Analyzer, minShingleSize: Int, maxShingleSize: Int)


constructor(delegate: Analyzer, minShingleSize: Int, maxShingleSize: Int, tokenSeparator: String?, outputUnigrams: Boolean, outputUnigramsIfNoShingles: Boolean, fillerToken: String)

Creates a new ShingleAnalyzerWrapper.

Parameters

delegate

Analyzer whose TokenStream is to be filtered

minShingleSize

Minimum shingle (token n-gram) size

maxShingleSize

Maximum shingle (token n-gram) size

tokenSeparator

String used to separate input stream tokens in output shingles

outputUnigrams

Whether or not the filter shall pass the original tokens to the output stream

outputUnigramsIfNoShingles

Overrides the behavior of outputUnigrams==false when no shingles are available (because the input stream contains fewer than minShingleSize tokens): in that case the unigrams are output anyway. Note that if outputUnigrams==true, unigrams are always output, regardless of whether any shingles are available.

fillerToken

Filler token to use when positionIncrement is more than 1, that is, when the input stream skips one or more positions
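
To make the interaction between these parameters concrete, here is a minimal, dependency-free sketch that models the documented behavior on a plain list of tokens. It is not the library's implementation: the function name `shingles` is hypothetical, the default values and the output ordering (unigram first, then shingles of increasing size starting at the same token) are assumptions, and `fillerToken` is left out because position increments are not modeled.

```kotlin
// Illustrative model only: mimics the documented parameters on a token list.
// `shingles` is a hypothetical helper, not part of ShingleAnalyzerWrapper.
fun shingles(
    tokens: List<String>,
    minShingleSize: Int = 2,               // assumed default
    maxShingleSize: Int = 2,               // assumed default
    tokenSeparator: String = " ",          // assumed default
    outputUnigrams: Boolean = true,        // assumed default
    outputUnigramsIfNoShingles: Boolean = false
): List<String> {
    require(minShingleSize >= 2) { "a shingle spans at least two tokens" }
    require(maxShingleSize >= minShingleSize) { "max must not be below min" }

    // No shingle can be built when the input is shorter than minShingleSize.
    val noShinglesAvailable = tokens.size < minShingleSize
    val emitUnigrams =
        outputUnigrams || (outputUnigramsIfNoShingles && noShinglesAvailable)

    val out = mutableListOf<String>()
    for (start in tokens.indices) {
        // Original token first (assumed ordering), if unigrams are enabled.
        if (emitUnigrams) out += tokens[start]
        // Then every shingle of size min..max that begins at this token.
        for (size in minShingleSize..maxShingleSize) {
            val end = start + size
            if (end > tokens.size) break
            out += tokens.subList(start, end).joinToString(tokenSeparator)
        }
    }
    return out
}
```

Under these assumptions, `shingles(listOf("please", "divide", "this"))` yields `please`, `please divide`, `divide`, `divide this`, `this`. Passing `outputUnigrams = false` leaves only `please divide` and `divide this`. For the single-token input `listOf("please")` with `outputUnigrams = false`, the result is empty unless `outputUnigramsIfNoShingles = true`, in which case `please` is returned.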


constructor()
constructor(minShingleSize: Int, maxShingleSize: Int)

Wraps StandardAnalyzer.