DelimitedTermFrequencyTokenFilter

class DelimitedTermFrequencyTokenFilter(input: TokenStream, delimiter: Char = DEFAULT_DELIMITER) : TokenFilter

Characters before the delimiter are the "token"; the textual integer after it is the term frequency. To use this TokenFilter, the field must be indexed with IndexOptions.DOCS_AND_FREQS, but without positions or offsets.

For example, if the delimiter is '|', then for the string "foo|5", "foo" is the token and "5" is a term frequency. If there is no delimiter, the TokenFilter does not modify the term frequency.
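The splitting rule described above can be sketched on a plain string. This is a minimal illustration, not the filter's implementation: `splitTermFrequency` is a hypothetical helper that is not part of the library, it assumes the split happens at the first occurrence of the delimiter, and it uses `null` to stand for "term frequency left unmodified".

```kotlin
// Hypothetical helper (not part of the library): applies the documented
// token/frequency rule to a plain string instead of a TokenStream.
fun splitTermFrequency(text: String, delimiter: Char = '|'): Pair<String, Int?> {
    // Assumption: the first occurrence of the delimiter is the split point.
    val idx = text.indexOf(delimiter)
    // No delimiter: the whole text is the token and the frequency is untouched (null here).
    if (idx < 0) return text to null
    val token = text.substring(0, idx)
    // The text after the delimiter must be a textual integer;
    // toInt() throws NumberFormatException otherwise.
    val freq = text.substring(idx + 1).toInt()
    return token to freq
}

fun main() {
    println(splitTermFrequency("foo|5")) // (foo, 5)
    println(splitTermFrequency("foo"))   // (foo, null)
}
```

In the real filter the token and frequency would be written to the stream's attributes rather than returned as a pair.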

Note: make sure your Tokenizer doesn't split on the delimiter, or this won't work.
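To see why the tokenizer matters, the sketch below compares two ways of splitting the same input using plain Kotlin string functions. Both `whitespaceSplit` and `nonLetterSplit` are hypothetical stand-ins for tokenizers, not library classes; they only illustrate what reaches the filter in each case.

```kotlin
// Hypothetical stand-in for a tokenizer that splits on whitespace only.
fun whitespaceSplit(text: String): List<String> =
    text.split(" ").filter { it.isNotEmpty() }

// Hypothetical stand-in for a tokenizer that treats every non-letter
// character (including '|') as a token boundary.
fun nonLetterSplit(text: String): List<String> =
    text.split(Regex("[^A-Za-z]+")).filter { it.isNotEmpty() }

fun main() {
    val text = "foo|5 bar|2"
    // Each "token|frequency" pair survives intact, so the filter can parse it.
    println(whitespaceSplit(text)) // [foo|5, bar|2]
    // The delimiter and the frequency are gone before the filter ever runs.
    println(nonLetterSplit(text))  // [foo, bar]
}
```

With the second kind of splitting, no token contains the delimiter, so the filter would leave every term frequency unmodified.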

Constructors

constructor(input: TokenStream, delimiter: Char = DEFAULT_DELIMITER)

Types

object Companion

Properties

Functions

fun <T : Attribute> addAttribute(attClass: KClass<T>): T
open override fun close()
fun copyTo(target: AttributeSource)
open override fun end()
open operator override fun equals(obj: Any?): Boolean
fun <T : Attribute> getAttribute(attClass: KClass<T>): T?
fun hasAttribute(attClass: KClass<out Attribute>): Boolean
open override fun hashCode(): Int
open override fun incrementToken(): Boolean
fun reflectAsString(prependAttClass: Boolean): String
open override fun reset()
open override fun toString(): String
open override fun unwrap(): TokenStream