UAX29URLEmailAnalyzer

Constructors

constructor(stopWords: CharArraySet)

Builds an analyzer with the given stop words.

constructor()

Builds an analyzer with the default stop words (STOP_WORDS_SET).

constructor(stopwords: Reader)

Builds an analyzer with the stop words from the given reader.

Types

object Companion

Properties


Functions

open override fun close()
open fun getOffsetGap(fieldName: String?): Int
open fun getPositionIncrementGap(fieldName: String?): Int
fun normalize(fieldName: String, text: String): BytesRef
fun setMaxTokenLength(length: Int)

Sets the maximum allowed token length. Tokens longer than this are split at this length and emitted as multiple tokens. To discard such long tokens instead of splitting them, increase the maximum length and then use LengthFilter to remove them. The default is UAX29URLEmailAnalyzer.DEFAULT_MAX_TOKEN_LENGTH.
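The two behaviours described above (splitting at the maximum length versus dropping long tokens) can be illustrated with a minimal sketch. It is a simplified model only, not the tokenizer's actual implementation: `chopToken` and `dropLongTokens` are hypothetical helpers written for this example and are not part of the library.

```kotlin
// Hypothetical helpers: a simplified model of the documented behaviour,
// not the analyzer's real code.

// Splits one over-long token into pieces of at most maxTokenLength characters,
// mirroring "chopped up at this token length and emitted as multiple tokens".
fun chopToken(token: String, maxTokenLength: Int): List<String> {
    require(maxTokenLength > 0) { "maxTokenLength must be positive" }
    return token.chunked(maxTokenLength)
}

// Alternative: assuming the max length has been raised so tokens stay whole,
// drop every token longer than `limit`, as a LengthFilter-style step would.
fun dropLongTokens(tokens: List<String>, limit: Int): List<String> =
    tokens.filter { it.length <= limit }

fun main() {
    println(chopToken("abcdefghij", 4))                        // [abcd, efgh, ij]
    println(dropLongTokens(listOf("short", "abcdefghij"), 5))  // [short]
}
```

With splitting, the fragments of a long token remain in the stream and can match queries; with filtering, the long token disappears entirely. Which is preferable depends on whether fragments of very long tokens are useful for search.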

fun tokenStream(fieldName: String, text: String): TokenStream
fun tokenStream(fieldName: String, reader: Reader): TokenStream