CollectionStatistics

data class CollectionStatistics(val field: String?, val maxDoc: Long, val docCount: Long, val sumTotalTermFreq: Long, val sumDocFreq: Long)

Contains statistics for a collection (field).

This class holds statistics across all documents for scoring purposes:

  • .maxDoc: number of documents.

  • .docCount: number of documents that contain this field.

  • .sumDocFreq: number of postings-list entries.

  • .sumTotalTermFreq: number of tokens.

The following conditions are always true:

  • All statistics are positive integers: never zero or negative.

  • docCount <= maxDoc

  • docCount <= sumDocFreq <= sumTotalTermFreq
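
The conditions above can be expressed as a small check. The following is a minimal sketch, not part of the documented API: the data class is a local stand-in that mirrors the signature shown at the top of this page so the snippet compiles on its own, and `violatedConditions` is a hypothetical helper name.

```kotlin
// Local stand-in mirroring the documented signature (assumption: the real
// class lives in a package that this page does not name).
data class CollectionStatistics(
    val field: String?,
    val maxDoc: Long,
    val docCount: Long,
    val sumTotalTermFreq: Long,
    val sumDocFreq: Long,
)

// Hypothetical helper: returns a description of every documented condition
// that the given statistics violate; an empty list means all conditions hold.
fun violatedConditions(s: CollectionStatistics): List<String> = buildList {
    if (s.maxDoc <= 0L) add("maxDoc must be positive")
    if (s.docCount <= 0L) add("docCount must be positive")
    if (s.sumDocFreq <= 0L) add("sumDocFreq must be positive")
    if (s.sumTotalTermFreq <= 0L) add("sumTotalTermFreq must be positive")
    if (s.docCount > s.maxDoc) add("docCount must not exceed maxDoc")
    if (s.docCount > s.sumDocFreq) add("docCount must not exceed sumDocFreq")
    if (s.sumDocFreq > s.sumTotalTermFreq) add("sumDocFreq must not exceed sumTotalTermFreq")
}

fun main() {
    val ok = CollectionStatistics("body", maxDoc = 10L, docCount = 8L, sumTotalTermFreq = 120L, sumDocFreq = 40L)
    println(violatedConditions(ok).isEmpty())

    val bad = ok.copy(docCount = 11L) // more documents with the field than documents overall
    println(violatedConditions(bad))
}
```

A check like this may be useful in tests that build statistics by hand; whether the real constructor enforces the same conditions is not stated on this page.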

Values may include statistics on deleted documents that have not yet been merged away.

Be careful when performing calculations on these values: they are represented as 64-bit integers, so division truncates and you may need to convert to Double first.
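
As an illustration of why the conversion matters, here is a minimal sketch that computes an average field length from two of the statistics. The function name `averageFieldLength` is hypothetical and the formula is only an example of a ratio a scoring model might need.

```kotlin
// Hypothetical helper: average number of tokens per document that has the field.
// Converting to Double before dividing avoids integer truncation.
fun averageFieldLength(sumTotalTermFreq: Long, docCount: Long): Double =
    sumTotalTermFreq.toDouble() / docCount.toDouble()

fun main() {
    println(7L / 2L)                      // Long division truncates: prints 3
    println(averageFieldLength(7L, 2L))   // prints 3.5
}
```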

Parameters

field

Field's name.

This value is never null.

maxDoc

The total number of documents in the range [1 .. Long.MAX_VALUE], regardless of whether they all contain values for this field.

This value is always a positive number. See also: IndexReader#maxDoc()

docCount

The total number of documents that have at least one term for this field, in the range [1 .. .maxDoc].

This value is always a positive number, and never exceeds .maxDoc. See also: Terms#getDocCount()

sumTotalTermFreq

The total number of tokens for this field, in the range [.sumDocFreq .. Long.MAX_VALUE]. This is the "word count" for this field across all documents. It is the sum of TermStatistics.totalTermFreq across all terms. It is also the sum of each document's field length across all documents.

This value is always a positive number, and always at least .sumDocFreq. See also: Terms#getSumTotalTermFreq()

sumDocFreq

The total number of posting list entries for this field, in the range [.docCount .. .sumTotalTermFreq]. This is the sum of term-document pairs: the sum of TermStatistics.docFreq across all terms. It is also the sum of each document's unique term count for this field across all documents.

This value is always a positive number, always at least .docCount, and never exceeds .sumTotalTermFreq. See also: Terms#getSumDocFreq()
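
To make the four definitions concrete, the sketch below derives them from a toy in-memory corpus. It is illustrative only: `FieldStats` and `computeStats` are hypothetical names, each document is modelled as the list of tokens in one field, and an empty list stands for a document with no terms in that field. Real indexes compute these values from postings, not from raw token lists.

```kotlin
// Hypothetical container for the four statistics described above.
data class FieldStats(
    val maxDoc: Long,
    val docCount: Long,
    val sumTotalTermFreq: Long,
    val sumDocFreq: Long,
)

// Hypothetical helper: each inner list holds the tokens of one document's field.
fun computeStats(docs: List<List<String>>): FieldStats {
    val withField = docs.filter { it.isNotEmpty() }
    return FieldStats(
        maxDoc = docs.size.toLong(),                                  // all documents
        docCount = withField.size.toLong(),                           // documents with at least one term
        sumTotalTermFreq = withField.sumOf { it.size.toLong() },      // total tokens (field lengths)
        sumDocFreq = withField.sumOf { it.toSet().size.toLong() },    // unique terms per document
    )
}

fun main() {
    val docs = listOf(
        listOf("a", "b", "a"), // 3 tokens, 2 unique terms
        listOf("b", "c"),      // 2 tokens, 2 unique terms
        emptyList(),           // no terms for this field
    )
    println(computeStats(docs))
    // prints FieldStats(maxDoc=3, docCount=2, sumTotalTermFreq=5, sumDocFreq=4)
}
```

In this corpus the per-term view gives the same totals: term "a" has docFreq 1 and totalTermFreq 2, "b" has 2 and 2, and "c" has 1 and 1, which sum to 4 and 5 respectively.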

Constructors

constructor(field: String?, maxDoc: Long, docCount: Long, sumTotalTermFreq: Long, sumDocFreq: Long)

Properties
