ShingleAnalyzerWrapper
constructor(delegate: Analyzer, minShingleSize: Int, maxShingleSize: Int, tokenSeparator: String?, outputUnigrams: Boolean, outputUnigramsIfNoShingles: Boolean, fillerToken: String)
Creates a new ShingleAnalyzerWrapper.
Parameters
delegate
Analyzer whose TokenStream is to be filtered
minShingleSize
Minimum shingle (token n-gram) size
maxShingleSize
Maximum shingle (token n-gram) size
tokenSeparator
String used to separate input stream tokens in output shingles
outputUnigrams
Whether the filter passes the original tokens (unigrams) to the output stream
outputUnigramsIfNoShingles
Overrides the behavior of outputUnigrams==false when no shingles are available (because the input stream contains fewer than minShingleSize tokens). Note that if outputUnigrams==true, unigrams are always output, regardless of whether any shingles are available.
fillerToken
Filler token to insert when a token's positionIncrement is greater than 1
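The interaction between the size, separator, and unigram parameters is easier to see on a small input. The following is a minimal sketch in plain Kotlin, not the library's implementation: the function name `shingles`, its default argument values, and the `minShingleSize >= 2` check are assumptions made for illustration, and it does not model `fillerToken` or position increments at all.

```kotlin
// Illustrative approximation of shingling over an in-memory token list.
// Hypothetical helper: not part of ShingleAnalyzerWrapper or its delegate.
fun shingles(
    tokens: List<String>,
    minShingleSize: Int = 2,          // assumed default, for the sketch only
    maxShingleSize: Int = 2,          // assumed default, for the sketch only
    tokenSeparator: String = " ",
    outputUnigrams: Boolean = true,
    outputUnigramsIfNoShingles: Boolean = false,
): List<String> {
    // Assumption: a shingle spans at least two tokens.
    require(minShingleSize >= 2) { "minShingleSize must be at least 2" }
    require(maxShingleSize >= minShingleSize) { "maxShingleSize must be >= minShingleSize" }

    // No shingle can be built when the input is shorter than the minimum size.
    val anyShingles = tokens.size >= minShingleSize
    // outputUnigramsIfNoShingles only matters when outputUnigrams is false.
    val emitUnigrams = outputUnigrams || (outputUnigramsIfNoShingles && !anyShingles)

    val out = mutableListOf<String>()
    for (start in tokens.indices) {
        if (emitUnigrams) out += tokens[start]
        for (size in minShingleSize..maxShingleSize) {
            val end = start + size
            if (end > tokens.size) break
            out += tokens.subList(start, end).joinToString(tokenSeparator)
        }
    }
    return out
}

fun main() {
    val tokens = listOf("please", "divide", "this")

    // Unigrams plus shingles of size 2 and 3.
    println(shingles(tokens, minShingleSize = 2, maxShingleSize = 3))
    // [please, please divide, please divide this, divide, divide this, this]

    // Shingles only.
    println(shingles(tokens, minShingleSize = 2, maxShingleSize = 3, outputUnigrams = false))
    // [please divide, please divide this, divide this]

    // Too few tokens for any shingle: the fallback flag lets the unigram through.
    println(shingles(listOf("hello"), outputUnigrams = false, outputUnigramsIfNoShingles = true))
    // [hello]
}
```

In this sketch, setting both `outputUnigrams` and `outputUnigramsIfNoShingles` to false on a single-token input yields an empty result, which is the case `outputUnigramsIfNoShingles` exists to avoid.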
constructor()
Wraps StandardAnalyzer.