Companion

object Companion

Properties

const val AVOID_BAD_URL: Int = 2
const val EMAIL_TYPE: Int

Email token type

const val EMOJI_TYPE: Int

Emoji token type

const val HANGUL_TYPE: Int

Hangul token type

const val HIRAGANA_TYPE: Int

Hiragana token type

Ideographic token type

const val KATAKANA_TYPE: Int

Katakana token type

const val NUMERIC_TYPE: Int

Numbers

Characters in class \p{Line_Break = Complex_Context} are from South East Asian scripts (Thai, Lao, Myanmar, Khmer, etc.). Sequences of these are kept together as a single token rather than broken up, because the logic required to break them at word boundaries is too complex for UAX#29.
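
The grouping behaviour described above can be illustrated with a rough sketch. This is not the tokenizer's implementation: it assumes that membership in four Unicode scripts (Thai, Lao, Myanmar, Khmer) is an acceptable stand-in for \p{Line_Break = Complex_Context}, which is only an approximation, and the names `complexContextScripts`, `isComplexContext` and `complexContextRuns` are hypothetical helpers invented for the illustration.

```kotlin
import java.lang.Character.UnicodeScript

// Assumption: these four scripts approximate \p{Line_Break = Complex_Context}.
// The real property covers more characters and is not defined by script alone.
val complexContextScripts = setOf(
    UnicodeScript.THAI,
    UnicodeScript.LAO,
    UnicodeScript.MYANMAR,
    UnicodeScript.KHMER,
)

// Hypothetical helper: true if the code point belongs to one of the scripts above.
fun isComplexContext(codePoint: Int): Boolean =
    UnicodeScript.of(codePoint) in complexContextScripts

// Hypothetical helper: returns every maximal run of complex-context characters
// as one string, mirroring "kept together as a single token".
fun complexContextRuns(text: String): List<String> {
    val runs = mutableListOf<String>()
    val current = StringBuilder()
    var i = 0
    while (i < text.length) {
        val cp = text.codePointAt(i)
        if (isComplexContext(cp)) {
            current.appendCodePoint(cp)
        } else if (current.isNotEmpty()) {
            // A character outside the class ends the current run.
            runs.add(current.toString())
            current.setLength(0)
        }
        i += Character.charCount(cp)
    }
    if (current.isNotEmpty()) runs.add(current.toString())
    return runs
}

fun main() {
    // Each Thai or Lao sequence comes back whole, with no attempt at word breaks.
    println(complexContextRuns("hello สวัสดีครับ world ສະບາຍດີ"))
}
```

No word segmentation is attempted inside a run; a caller that needs word boundaries for these scripts would have to pass each run to a dictionary-based or statistical segmenter.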

const val URL_TYPE: Int

URL token type

const val WORD_TYPE: Int

Alphanumeric sequences

const val YYEOF: Int

This character denotes the end of file.

const val YYINITIAL: Int = 0

Lexical States.