UnicodeWhitespaceTokenizer
constructor()
Construct a new UnicodeWhitespaceTokenizer.
Construct a new UnicodeWhitespaceTokenizer using a given AttributeFactory.
Parameters
factory
the attribute factory to use for this Tokenizer
Construct a new UnicodeWhitespaceTokenizer using a given AttributeFactory and maximum token length.
Parameters
factory
the attribute factory to use for this Tokenizer
maxTokenLen
the maximum token length the tokenizer will emit; must be greater than 0 and less than MAX_TOKEN_LENGTH_LIMIT (1024*1024)
Throws
if maxTokenLen is invalid, that is, not greater than 0 or not less than MAX_TOKEN_LENGTH_LIMIT.
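The maxTokenLen contract above can be illustrated with a small standalone sketch. It does not use or reproduce the real tokenizer: the class MaxTokenLenSketch and its methods checkMaxTokenLen and tokenize are hypothetical names introduced only for this example. Three things are assumptions, not confirmed by this page: that the exception is an IllegalArgumentException, that the limit itself is rejected (a literal reading of "less than"), and that an over-long run of non-whitespace is split into chunks of maxTokenLen rather than dropped.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for illustration; not the real UnicodeWhitespaceTokenizer. */
public class MaxTokenLenSketch {
    // Limit quoted in the maxTokenLen description (1024*1024).
    static final int MAX_TOKEN_LENGTH_LIMIT = 1024 * 1024;

    /** Validates maxTokenLen against the documented range (boundary handling is assumed). */
    static int checkMaxTokenLen(int maxTokenLen) {
        if (maxTokenLen <= 0 || maxTokenLen >= MAX_TOKEN_LENGTH_LIMIT) {
            // Assumption: the exception type is not named in the text above.
            throw new IllegalArgumentException(
                "maxTokenLen must be greater than 0 and less than "
                    + MAX_TOKEN_LENGTH_LIMIT + ", passed: " + maxTokenLen);
        }
        return maxTokenLen;
    }

    /** Splits text on Unicode White_Space and caps each token at maxTokenLen chars. */
    static List<String> tokenize(String text, int maxTokenLen) {
        checkMaxTokenLen(maxTokenLen);
        List<String> tokens = new ArrayList<>();
        // \p{IsWhite_Space} matches the Unicode White_Space property,
        // which includes characters such as U+00A0 (no-break space).
        for (String run : text.split("\\p{IsWhite_Space}+")) {
            if (run.isEmpty()) {
                continue; // leading whitespace yields one empty string from split()
            }
            // Assumed behavior: over-long runs are cut into chunks of maxTokenLen.
            // Chunks are counted in UTF-16 chars, so a surrogate pair could be split.
            for (int start = 0; start < run.length(); start += maxTokenLen) {
                int end = Math.min(run.length(), start + maxTokenLen);
                tokens.add(run.substring(start, end));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [alph, a, beta, gamm, a]
        System.out.println(tokenize("alpha\u00A0beta  gamma", 4));
        try {
            tokenize("x", 0);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Validating in one helper that every entry point calls keeps the range check in a single place, which is why the sketch routes tokenize through checkMaxTokenLen before doing any work.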