BibleKoreanAnalyzer

class BibleKoreanAnalyzer(userDict: UserDictionary? = BibleKoreanUserDictionary.instance, mode: KoreanTokenizer.DecompoundMode = KoreanTokenizer.DEFAULT_DECOMPOUND, stopTags: Set<POS.Tag> = KoreanPartOfSpeechStopFilter.DEFAULT_STOP_TAGS, outputUnknownUnigrams: Boolean = false, stopWords: Set<String> = setOf("의")) : Analyzer

Analyzer for Korean text based on morphological analysis. In addition to filtering by part-of-speech tags (stopTags), it removes a configurable set of stop words; the default set contains only the possessive particle "의".

Constructors

constructor(userDict: UserDictionary? = BibleKoreanUserDictionary.instance, mode: KoreanTokenizer.DecompoundMode = KoreanTokenizer.DEFAULT_DECOMPOUND, stopTags: Set<POS.Tag> = KoreanPartOfSpeechStopFilter.DEFAULT_STOP_TAGS, outputUnknownUnigrams: Boolean = false, stopWords: Set<String> = setOf("의"))
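
Because every constructor parameter has a default, individual settings can be overridden with named arguments. The following is a minimal sketch, not taken from the library's own documentation: the extra stop word "것" and the choice of `DecompoundMode.MIXED` are illustrative assumptions, and the import for `BibleKoreanAnalyzer` is omitted because its package is not shown on this page.

```kotlin
import org.apache.lucene.analysis.ko.KoreanTokenizer

fun main() {
    // Default configuration: bundled user dictionary, default decompound mode,
    // default part-of-speech stop tags, and the single stop word "의".
    val defaultAnalyzer = BibleKoreanAnalyzer()

    // Override only selected parameters; the rest keep their defaults.
    // The values below are hypothetical choices for illustration.
    val customAnalyzer = BibleKoreanAnalyzer(
        mode = KoreanTokenizer.DecompoundMode.MIXED, // keep compounds and their parts
        stopWords = setOf("의", "것"),
    )

    // Analyzer is Closeable, so release both instances when done.
    defaultAnalyzer.close()
    customAnalyzer.close()
}
```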

Properties


Functions

open override fun close()
open fun getOffsetGap(fieldName: String?): Int
open fun getPositionIncrementGap(fieldName: String?): Int
fun normalize(fieldName: String, text: String): BytesRef
fun tokenStream(fieldName: String, text: String): TokenStream
fun tokenStream(fieldName: String, reader: Reader): TokenStream
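
The `tokenStream` overloads return a standard Lucene `TokenStream`, which is consumed with the usual reset / incrementToken / end / close sequence. The sketch below assumes a hypothetical helper `analyzeToTerms`, a hypothetical field name "body", and an arbitrary sample sentence; the import for `BibleKoreanAnalyzer` is omitted because its package is not shown on this page.

```kotlin
import org.apache.lucene.analysis.Analyzer
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute

// Hypothetical helper: collects the terms an analyzer emits for a piece of text.
fun analyzeToTerms(analyzer: Analyzer, fieldName: String, text: String): List<String> {
    val terms = mutableListOf<String>()
    // TokenStream is Closeable, so `use` closes it even if analysis throws.
    analyzer.tokenStream(fieldName, text).use { stream ->
        val termAttr = stream.addAttribute(CharTermAttribute::class.java)
        stream.reset()                      // required before the first incrementToken()
        while (stream.incrementToken()) {
            terms.add(termAttr.toString())  // copy the term; the attribute is reused
        }
        stream.end()                        // finalize end-of-stream state
    }
    return terms
}

fun main() {
    BibleKoreanAnalyzer().use { analyzer ->
        // "body" is an assumed field name; the sentence is sample input only.
        println(analyzeToTerms(analyzer, "body", "태초에 하나님이 천지를 창조하시니라"))
    }
}
```

The helper accepts any `Analyzer`, so the same loop can be used to compare the output of differently configured instances.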