TermStatistics

class TermStatistics(term: BytesRef, val docFreq: Long, val totalTermFreq: Long)

Contains statistics for a specific term.

This class holds statistics for this term across all documents for scoring purposes:

  • .docFreq: number of documents this term occurs in.

  • .totalTermFreq: number of tokens for this term.

The following conditions are always true:

  • All statistics are positive integers: never zero or negative.

  • docFreq <= totalTermFreq

  • docFreq <= sumDocFreq of the collection

  • totalTermFreq <= sumTotalTermFreq of the collection
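The conditions above can be expressed as a simple check. The following is a minimal sketch only: `TermStats` is a hypothetical stand-in for the two numeric statistics (not the library class), and the `sumDocFreq` and `sumTotalTermFreq` parameters are assumed to come from the collection-level statistics.

```kotlin
// Hypothetical stand-in holding the two numeric statistics; not the library class.
data class TermStats(val docFreq: Long, val totalTermFreq: Long)

// Returns true when the statistics satisfy the documented conditions.
// sumDocFreq and sumTotalTermFreq are assumed to be the collection-level sums.
fun satisfiesInvariants(
    stats: TermStats,
    sumDocFreq: Long,
    sumTotalTermFreq: Long
): Boolean =
    stats.docFreq > 0 &&                        // strictly positive
    stats.totalTermFreq > 0 &&                  // strictly positive
    stats.docFreq <= stats.totalTermFreq &&     // at least one occurrence per matching document
    stats.docFreq <= sumDocFreq &&              // bounded by the collection's sumDocFreq
    stats.totalTermFreq <= sumTotalTermFreq     // bounded by the collection's sumTotalTermFreq
```

A check like this may be useful in tests that build statistics by hand before passing them to scoring code.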

Values may include statistics on deleted documents that have not yet been merged away.

Be careful when performing calculations on these values: they are represented as 64-bit integers, so you may need to cast them to double before dividing or taking ratios.
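As an illustration of why the cast matters, the sketch below computes the average number of occurrences per matching document. The helper name `averageTermFreq` is hypothetical and not part of this API; it only shows the conversion to `Double` before the division.

```kotlin
// Hypothetical helper: average occurrences of the term per document that contains it.
// Converting to Double first avoids the truncation of Long division.
fun averageTermFreq(docFreq: Long, totalTermFreq: Long): Double =
    totalTermFreq.toDouble() / docFreq.toDouble()

// For comparison: plain Long division discards the fractional part.
fun truncatedAverageTermFreq(docFreq: Long, totalTermFreq: Long): Long =
    totalTermFreq / docFreq
```

With `docFreq = 4` and `totalTermFreq = 10`, the first function yields 2.5 while the second yields 2.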

Parameters

term

Term bytes.

This value is never null.

docFreq

Number of documents containing the term in the collection, in the range [1 .. .totalTermFreq].

This is the document-frequency for the term: the count of documents where the term appears at least one time.

This value is always a positive number and never exceeds .totalTermFreq. It also cannot exceed CollectionStatistics.sumDocFreq.

See also: TermsEnum#docFreq()

totalTermFreq

Number of occurrences of the term in the collection, in the range [.docFreq .. CollectionStatistics.sumTotalTermFreq].

This is the token count for the term: the number of times it appears in the field across all documents.

This value is always a positive number, is always at least .docFreq, and never exceeds CollectionStatistics.sumTotalTermFreq.

See also: TermsEnum#totalTermFreq()

Constructors

constructor(term: BytesRef, docFreq: Long, totalTermFreq: Long)

Properties
