WhitespaceTokenizer

A tokenizer that divides text at whitespace characters. Note: the whitespace definition it relies on explicitly excludes the non-breaking space. Adjacent sequences of non-whitespace characters form tokens.
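The splitting rule described above can be illustrated with a small standalone sketch. It does not use this class. `isTokenSeparator` and `whitespaceTokens` are hypothetical helpers, and the separator predicate (Kotlin's `Char.isWhitespace()` minus the three non-breaking space characters) is an assumption about what the referenced whitespace definition covers.

```kotlin
// Assumption: "whitespace" means Kotlin's Char.isWhitespace() minus the
// non-breaking space characters, which therefore stay inside tokens.
private val NON_BREAKING_SPACES = setOf('\u00A0', '\u2007', '\u202F')

// Hypothetical helper, not part of the library.
fun isTokenSeparator(c: Char): Boolean =
    c.isWhitespace() && c !in NON_BREAKING_SPACES

// Hypothetical helper: collects maximal runs of non-separator characters.
fun whitespaceTokens(text: String): List<String> {
    val tokens = mutableListOf<String>()
    val current = StringBuilder()
    for (c in text) {
        if (isTokenSeparator(c)) {
            // A separator ends the current token, if one has been started.
            if (current.isNotEmpty()) {
                tokens += current.toString()
                current.clear()
            }
        } else {
            current.append(c)
        }
    }
    // Emit the trailing token when the text does not end in whitespace.
    if (current.isNotEmpty()) tokens += current.toString()
    return tokens
}

fun main() {
    println(whitespaceTokens("quick  brown\tfox")) // [quick, brown, fox]
    // The non-breaking space does not split "10\u00A0km", so this yields 2 tokens.
    println(whitespaceTokens("10\u00A0km away").size) // 2
}
```

Runs of consecutive whitespace never produce empty tokens, and text joined by a non-breaking space stays in a single token.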

See also

Constructors

constructor()

Construct a new WhitespaceTokenizer.

constructor(factory: AttributeFactory)

Construct a new WhitespaceTokenizer using a given AttributeFactory.

constructor(maxTokenLen: Int)

Construct a new WhitespaceTokenizer using a given maximum token length.
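This page does not say what happens to a run of non-whitespace characters longer than `maxTokenLen`. The sketch below assumes the behavior of upstream Lucene's character tokenizers, where such a run is emitted as consecutive chunks of at most `maxTokenLen` characters. `tokensWithMaxLen` is a hypothetical helper, it only recognizes a few ASCII whitespace characters, and it should be checked against the actual implementation before being relied on.

```kotlin
// Hypothetical helper, not part of the library. The chunking of over-long
// tokens is an assumption modelled on upstream Lucene, not confirmed here.
fun tokensWithMaxLen(text: String, maxTokenLen: Int): List<String> {
    require(maxTokenLen > 0) { "maxTokenLen must be positive" }
    val tokens = mutableListOf<String>()
    val current = StringBuilder()

    // Emit the buffered characters as one token and reset the buffer.
    fun flush() {
        if (current.isNotEmpty()) {
            tokens += current.toString()
            current.clear()
        }
    }

    for (c in text) {
        // Simplified separator set for illustration only.
        if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
            flush()
        } else {
            current.append(c)
            // Once the buffer reaches the limit, close the token early.
            if (current.length >= maxTokenLen) flush()
        }
    }
    flush()
    return tokens
}

fun main() {
    println(tokensWithMaxLen("abcdefgh ij", 3)) // [abc, def, gh, ij]
}
```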

constructor(factory: AttributeFactory, maxTokenLen: Int)

Construct a new WhitespaceTokenizer using a given AttributeFactory and maximum token length.

Types

object Companion

Properties

Functions

fun <T : Attribute> addAttribute(attClass: KClass<T>): T
open override fun close()
fun copyTo(target: AttributeSource)
open override fun end()
open operator override fun equals(obj: Any?): Boolean
fun <T : Attribute> getAttribute(attClass: KClass<T>): T?
fun hasAttribute(attClass: KClass<out Attribute>): Boolean
open override fun hashCode(): Int
open override fun incrementToken(): Boolean
fun reflectAsString(prependAttClass: Boolean): String
open override fun reset()
fun setReader(input: Reader)
open override fun toString(): String