BibleGermanAnalyzer

Analyzer for German Bible text with Jesus/Christus declension normalization.

The German Bible traditionally kept the Latin declensions for the name of Jesus. Since the New Testament was heavily influenced by Latin scholarship in the days of Martin Luther, the name was declined like this:

Nominative (Subject): Jesus Christus
Genitive (Possessive): Jesu Christi
Dative (Indirect Object): Jesu Christo
Accusative (Direct Object): Jesum Christum

In modern German, we usually just say "von Jesus Christus" to show possession, but the classic Luther Bible preserves the older case forms. The genitive "Jesu Christi", for example, signals "of" without needing a separate word.

This class normalizes the different forms of the name Jesus Christus found in both the classic Luther Bible and modern German.
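The idea behind the normalization can be illustrated with a small standalone sketch. It is not the analyzer's actual implementation, which is not shown on this page. The names `declensionToNominative`, `normalizeName`, and `normalizePhrase` are hypothetical, and lowercasing every token is an assumption borrowed from how text analyzers commonly behave.

```kotlin
// Hypothetical illustration only: maps the declined forms listed above
// to their nominative form. The real analyzer may work differently.
val declensionToNominative: Map<String, String> = mapOf(
    "jesu" to "jesus",        // genitive and dative of Jesus
    "jesum" to "jesus",       // accusative of Jesus
    "christi" to "christus",  // genitive of Christus
    "christo" to "christus",  // dative of Christus
    "christum" to "christus", // accusative of Christus
)

// Lowercases a single token (an assumption) and replaces a known
// declined form with its nominative; other tokens pass through lowercased.
fun normalizeName(token: String): String {
    val lower = token.lowercase()
    return declensionToNominative[lower] ?: lower
}

// Splits on whitespace and normalizes each token.
fun normalizePhrase(text: String): String =
    text.split(Regex("\\s+"))
        .filter { it.isNotEmpty() }
        .joinToString(" ") { normalizeName(it) }

fun main() {
    println(normalizePhrase("Jesu Christi"))   // prints "jesus christus"
    println(normalizePhrase("Jesum Christum")) // prints "jesus christus"
}
```

With a mapping like this, a search for "Jesus Christus" would match a verse that contains "Jesu Christi", which is the effect the analyzer aims for.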

Constructors

constructor()

Builds an analyzer with the default stop words.

constructor(stopwords: CharArraySet)

Builds an analyzer with the given stop words.

constructor(stopwords: CharArraySet, stemExclusionSet: CharArraySet)

Builds an analyzer with the given stop words and a stemming exclusion set.

Properties


Functions

open override fun close()
open fun getOffsetGap(fieldName: String?): Int
open fun getPositionIncrementGap(fieldName: String?): Int
fun normalize(fieldName: String, text: String): BytesRef
fun tokenStream(fieldName: String, text: String): TokenStream
fun tokenStream(fieldName: String, reader: Reader): TokenStream
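
A `TokenStream` returned by `tokenStream` is usually consumed with Lucene's reset, increment, end, close sequence. The following is a minimal sketch of that workflow, assuming Apache Lucene is on the classpath. The helper `collectTerms` and the field name `"text"` are hypothetical and not part of this class. It accepts any Lucene `Analyzer`, so an instance of `BibleGermanAnalyzer` can be passed in.

```kotlin
import org.apache.lucene.analysis.Analyzer
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute

// Hypothetical helper: runs the analyzer over the text and returns the
// emitted terms in order.
fun collectTerms(analyzer: Analyzer, fieldName: String, text: String): List<String> {
    val terms = mutableListOf<String>()
    // use {} closes the stream even if an exception is thrown.
    analyzer.tokenStream(fieldName, text).use { stream ->
        val termAttr = stream.addAttribute(CharTermAttribute::class.java)
        stream.reset()                   // required before the first incrementToken()
        while (stream.incrementToken()) {
            terms += termAttr.toString() // copy the current term text
        }
        stream.end()                     // finalizes end-of-stream state
    }
    return terms
}

// Possible usage, with "text" as a hypothetical field name:
// val terms = collectTerms(BibleGermanAnalyzer(), "text", "Die Gnade Jesu Christi")
```

The exact terms produced depend on the analyzer's stop words and stemming, so no expected output is shown here.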