Package-level declarations
Types
class DictionaryToken(type: TokenType, morphAtts: KoMorphData, wordId: Int, surfaceForm: CharArray, offset: Int, length: Int, startOffset: Int, endOffset: Int) : Token
A token stored in a KoMorphData.
class KoreanAnalyzer(userDict: UserDictionary? = null, mode: KoreanTokenizer.DecompoundMode = KoreanTokenizer.DEFAULT_DECOMPOUND, stopTags: Set<POS.Tag> = KoreanPartOfSpeechStopFilter.DEFAULT_STOP_TAGS, outputUnknownUnigrams: Boolean = false) : Analyzer
Analyzer for Korean that uses morphological analysis.
A TokenFilter that normalizes Korean numbers to regular Arabic decimal numbers in half-width characters.
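To illustrate the kind of normalization described here, the following is a minimal, self-contained sketch and not the filter's actual implementation. The function `parseKoreanNumber` and its lookup tables are hypothetical names introduced for this example; the sketch assumes integer input only, written with Sino-Korean numerals and/or ASCII or full-width Arabic digits.

```kotlin
// Hypothetical helper, not part of the library: converts an integer written with
// Sino-Korean numerals and/or Arabic digits (ASCII or full-width) to a Long.
private val HANGUL_DIGITS: Map<Char, Long> =
    "영일이삼사오육칠팔구".withIndex().associate { (i, c) -> c to i.toLong() }
private val SMALL_UNITS = mapOf('십' to 10L, '백' to 100L, '천' to 1_000L)
private val LARGE_UNITS = mapOf('만' to 10_000L, '억' to 100_000_000L)

fun parseKoreanNumber(text: String): Long {
    var total = 0L            // sum of completed 만/억 groups
    var section = 0L          // value accumulated inside the current group
    var digits: Long? = null  // digits read since the last unit, if any
    for (ch in text) {
        val d: Long? = when (ch) {
            in '0'..'9' -> (ch - '0').toLong()
            in '０'..'９' -> (ch - '０').toLong()
            else -> HANGUL_DIGITS[ch]
        }
        when {
            d != null -> digits = (digits ?: 0L) * 10 + d
            ch in SMALL_UNITS -> {
                // A unit with no preceding digit counts once, e.g. 십 = 10.
                section += (digits ?: 1L) * SMALL_UNITS.getValue(ch)
                digits = null
            }
            ch in LARGE_UNITS -> {
                val group = section + (digits ?: 0L)
                total += (if (group == 0L) 1L else group) * LARGE_UNITS.getValue(ch)
                section = 0L
                digits = null
            }
            else -> throw IllegalArgumentException("Unsupported character: $ch")
        }
    }
    return total + section + (digits ?: 0L)
}

fun main() {
    println(parseKoreanNumber("삼천이백"))           // 3200
    println(parseKoreanNumber("일만이천삼백사십오")) // 12345
    println(parseKoreanNumber("３천"))               // 3000
}
```

The sketch rejects any character it does not recognize and does not cover decimals or separators.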
Factory for KoreanNumberFilter.
class KoreanPartOfSpeechStopFilter(input: TokenStream, stopTags: Set<POS.Tag> = DEFAULT_STOP_TAGS) : FilteringTokenFilter
Removes tokens that match a set of part-of-speech tags.
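The filtering idea can be sketched without the library as follows. This is an illustration under stated assumptions: `PosTag`, `TaggedToken`, and `removeStopTags` are hypothetical stand-ins for the library's `POS.Tag`, its token attributes, and the filter itself, and the tag set below is simplified.

```kotlin
// Hypothetical, simplified stand-ins for the library's POS.Tag and token types.
enum class PosTag { NOUN, VERB, PARTICLE, ENDING, PUNCTUATION }

data class TaggedToken(val text: String, val tag: PosTag)

// Keeps only the tokens whose part-of-speech tag is not in stopTags.
fun removeStopTags(tokens: List<TaggedToken>, stopTags: Set<PosTag>): List<TaggedToken> =
    tokens.filter { it.tag !in stopTags }

fun main() {
    // A hand-tagged analysis of "학교에 갑니다." used as sample input.
    val tokens = listOf(
        TaggedToken("학교", PosTag.NOUN),
        TaggedToken("에", PosTag.PARTICLE),
        TaggedToken("가", PosTag.VERB),
        TaggedToken("ㅂ니다", PosTag.ENDING),
        TaggedToken(".", PosTag.PUNCTUATION),
    )
    val stopTags = setOf(PosTag.PARTICLE, PosTag.ENDING, PosTag.PUNCTUATION)
    println(removeStopTags(tokens, stopTags).map { it.text }) // [학교, 가]
}
```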
Factory for KoreanPartOfSpeechStopFilter.
Replaces the term text with the value of the ReadingAttribute, which is the Hangul transcription of Hanja characters.
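A minimal sketch of this substitution, assuming a token carries an optional reading: `ReadingToken` and `toReadingForm` are hypothetical names for illustration, not the library's API, and the fallback to the original term when no reading exists is an assumption of the sketch.

```kotlin
// Hypothetical token type: `reading` is the Hangul transcription, or null if none exists.
data class ReadingToken(val term: String, val reading: String?)

// Replaces each term with its reading when one is available; otherwise keeps the term.
fun toReadingForm(tokens: List<ReadingToken>): List<String> =
    tokens.map { it.reading ?: it.term }

fun main() {
    val tokens = listOf(
        ReadingToken("車道", "차도"), // Hanja term with a Hangul reading
        ReadingToken("는", null),     // Hangul term without a separate reading
    )
    println(toReadingForm(tokens)) // [차도, 는]
}
```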
Factory for KoreanReadingFormFilter.
Tokenizer for Korean that uses morphological analysis.
Factory for KoreanTokenizer.