ClassicSimilarity

Expert: Historical scoring implementation. Consider using BM25Similarity instead, which is generally considered superior to TF-IDF.

Constructors

constructor()

Default constructor; takes no parameters.

constructor(discountOverlaps: Boolean)

Primary constructor.

Properties

True if overlap tokens (tokens with a position increment of zero) are discounted from the document's length.

Functions

Computes the normalization value for a field at index-time.

open override fun idf(docFreq: Long, docCount: Long): Float

Implemented as log((docCount+1)/(docFreq+1)) + 1.
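The formula above can be sketched as a standalone function. This is a minimal illustration, not the library's source: the name classicIdf is hypothetical, and the natural logarithm is an assumption, since the description only says "log".

```kotlin
import kotlin.math.ln

// Hypothetical helper mirroring the documented formula
// log((docCount+1)/(docFreq+1)) + 1.
// Assumption: "log" means the natural logarithm.
fun classicIdf(docFreq: Long, docCount: Long): Float {
    val ratio = (docCount + 1).toDouble() / (docFreq + 1).toDouble()
    return (ln(ratio) + 1.0).toFloat()
}
```

Under these assumptions, a term that appears in every document gets a factor of exactly 1, and rarer terms get larger factors. The +1 in both numerator and denominator keeps the ratio defined when docFreq is zero.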

open override fun idfExplain(collectionStats: CollectionStatistics, termStats: TermStatistics): Explanation

Computes a score factor for a simple term and returns an explanation for that score factor.

open fun idfExplain(collectionStats: CollectionStatistics, termStats: Array<TermStatistics>): Explanation

Computes a score factor for a phrase.
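One plausible way to derive a phrase factor is to sum the per-term idf values. The sketch below assumes that behaviour; it is not confirmed by this page. TermDocFreq, idfOf, and phraseIdf are hypothetical names and do not correspond to the library's TermStatistics or Explanation types.

```kotlin
import kotlin.math.ln

// Hypothetical stand-in for per-term statistics; not the library's TermStatistics.
data class TermDocFreq(val term: String, val docFreq: Long)

// Per-term factor using the documented formula (natural log assumed).
fun idfOf(docFreq: Long, docCount: Long): Float =
    (ln((docCount + 1).toDouble() / (docFreq + 1).toDouble()) + 1.0).toFloat()

// Assumption: the phrase factor is the sum of the idf of each term in the phrase.
fun phraseIdf(terms: List<TermDocFreq>, docCount: Long): Float {
    var sum = 0.0
    for (t in terms) {
        sum += idfOf(t.docFreq, docCount)
    }
    return sum.toFloat()
}
```

With this sketch, a phrase made of rare terms scores higher than one made of common terms, and an empty phrase yields zero.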

open override fun lengthNorm(numTerms: Int): Float

Implemented as 1/sqrt(length), where length is the numTerms argument.
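As a rough illustration of the length normalization described here, the following sketch computes the same expression in isolation. The function name classicLengthNorm is hypothetical and the code is not taken from the library.

```kotlin
import kotlin.math.sqrt

// Hypothetical helper mirroring the documented formula 1/sqrt(length),
// with length taken from the numTerms argument.
fun classicLengthNorm(numTerms: Int): Float =
    (1.0 / sqrt(numTerms.toDouble())).toFloat()
```

The effect is that longer fields receive a smaller norm, so a match in a short field counts for more than the same match in a long one.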

open override fun scorer(boost: Float, collectionStats: CollectionStatistics, vararg termStats: TermStatistics): Similarity.SimScorer

Computes any collection-level weight (e.g., IDF or average document length) needed for scoring a query.

open override fun tf(freq: Float): Float

Implemented as sqrt(freq).
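The term-frequency factor can be sketched the same way. The name classicTf is hypothetical; only the sqrt(freq) formula comes from the description above.

```kotlin
import kotlin.math.sqrt

// Hypothetical helper mirroring the documented formula sqrt(freq).
fun classicTf(freq: Float): Float = sqrt(freq)
```

Taking the square root dampens repeated occurrences: going from one to four occurrences only doubles the factor.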

open override fun toString(): String