LogDocMergePolicy

This is a LogMergePolicy that measures the size of a segment as its number of documents (not taking deletions into account).

Constructors

constructor()

Types

object Companion

Properties


If true, we pro-rate a segment's size by the percentage of non-deleted documents.


If a segment has more than this many documents then it will never be merged.


If the size of a segment exceeds this value then it will never be merged.


How many segments to merge at a time.


Any segments whose size is smaller than this value will be candidates for full-flush merges and merged more aggressively.

open var noCFSRatio: Double

Target search concurrency. This merge policy will avoid creating segments that have more than maxDoc / targetSearchConcurrency documents.
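
The ceiling described above is plain arithmetic. The sketch below is illustrative only: `maxDocsPerMergedSegment` is a hypothetical helper, not part of this class, and the use of integer division (rather than rounding up) is an assumption of the sketch.

```kotlin
// Hypothetical helper, not part of the library: the per-segment document
// ceiling implied by maxDoc / targetSearchConcurrency.
fun maxDocsPerMergedSegment(maxDoc: Int, targetSearchConcurrency: Int): Int {
    require(maxDoc >= 0) { "maxDoc must not be negative" }
    require(targetSearchConcurrency > 0) { "targetSearchConcurrency must be positive" }
    // Assumption: integer division; the real policy may round differently.
    return maxDoc / targetSearchConcurrency
}
```

For example, with 1,000,000 documents in the index and a target search concurrency of 4, this sketch yields a ceiling of 250,000 documents per merged segment.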

Functions


Finds merges necessary to force-merge all deletes from the index. We simply merge adjacent segments that have deletes, up to mergeFactor at a time.
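
The grouping rule described above (adjacent segments with deletes, at most mergeFactor per merge) might look roughly like the following. This is a simplified sketch, not the library's implementation: `SegmentStub` and `groupForcedDeletesMerges` are hypothetical names, and a segment is reduced to a name and a delete count.

```kotlin
// Hypothetical stand-in for a segment: only a name and a delete count.
data class SegmentStub(val name: String, val delCount: Int)

// Groups runs of adjacent segments that have deletes into merges of at most
// mergeFactor segments each. A segment without deletes ends the current run.
fun groupForcedDeletesMerges(
    segments: List<SegmentStub>,
    mergeFactor: Int
): List<List<SegmentStub>> {
    require(mergeFactor >= 2) { "mergeFactor must be at least 2" }
    val merges = mutableListOf<List<SegmentStub>>()
    val run = mutableListOf<SegmentStub>()

    // Emits the current run as one merge (if non-empty) and starts a new run.
    fun flush() {
        if (run.isNotEmpty()) merges.add(run.toList())
        run.clear()
    }

    for (segment in segments) {
        if (segment.delCount > 0) {
            run.add(segment)
            if (run.size == mergeFactor) flush() // run is full
        } else {
            flush() // adjacency is broken by a segment without deletes
        }
    }
    flush() // emit whatever is left at the end of the index
    return merges
}
```

In this sketch a lone segment with deletes still forms its own single-segment merge, since rewriting it is what removes its deleted documents.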

open override fun findForcedMerges(infos: SegmentInfos?, maxNumSegments: Int, segmentsToMerge: MutableMap<SegmentCommitInfo, Boolean>?, mergeContext: MergePolicy.MergeContext?): MergePolicy.MergeSpecification?

Returns the merges necessary to merge the index down to a specified number of segments. This respects the maxMergeSizeForForcedMerge setting. By default, and assuming maxNumSegments=1, only one segment will be left in the index; that segment has neither pending deletions nor separate norms, and it is in compound file format if the current useCompoundFile setting is true. This method returns multiple merges (mergeFactor at a time) so that the MergeScheduler in use can take advantage of concurrency.


Identifies merges that we want to execute (synchronously) on commit. By default, this returns the merges found by findMerges whose segments are all smaller than maxFullFlushMergeSize.

open override fun findMerges(mergeTrigger: MergeTrigger?, infos: SegmentInfos?, mergeContext: MergePolicy.MergeContext?): MergePolicy.MergeSpecification?

Checks if any merges are now necessary and returns a MergePolicy.MergeSpecification if so. A merge is necessary when a given level holds more segments than the merge factor (setMergeFactor). When multiple levels have too many segments, this method returns multiple merges, allowing the MergeScheduler to use concurrency.

Defines the set of merge operations to perform on the provided codec readers.


Returns true if the segment represented by the given CodecReader should be kept even if it is fully deleted. This is useful for testing or, for instance, if the merge policy implements retention policies for soft deletes.

open override fun maxFullFlushMergeSize(): Long

Returns the maximum size of segments to be included in full-flush merges by the default implementation of findFullFlushMerges.

open fun numDeletesToMerge(info: SegmentCommitInfo, delCount: Int, readerSupplier: IOSupplier<CodecReader>): Int

Returns the number of deletes that a merge would claim on the given segment. This method will by default return the sum of the del count on disk and the pending delete count. Yet, subclasses that wrap merge readers might modify this to reflect deletes that are carried over to the target segment in the case of soft deletes.

open override fun size(info: SegmentCommitInfo, mergeContext: MergePolicy.MergeContext): Long

Return the byte size of the provided SegmentCommitInfo, prorated by percentage of non-deleted documents.
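
As a rough illustration of what pro-rating by the percentage of non-deleted documents means, consider the sketch below. `proratedSize` and its parameters are hypothetical and only mirror the description above; the actual method takes a SegmentCommitInfo and a MergeContext, and its exact rounding is not specified here.

```kotlin
// Hypothetical helper: scales a segment's byte size by its share of live
// (non-deleted) documents.
fun proratedSize(byteSize: Long, maxDoc: Int, delCount: Int): Long {
    require(maxDoc >= 0) { "maxDoc must not be negative" }
    require(delCount in 0..maxDoc) { "delCount must be between 0 and maxDoc" }
    if (maxDoc == 0) return byteSize // empty segment: nothing to pro-rate
    val liveRatio = 1.0 - delCount.toDouble() / maxDoc
    // Assumption: truncate toward zero when converting back to Long.
    return (byteSize * liveRatio).toLong()
}
```

Under these assumptions, a 1,000-byte segment with 100 documents, 25 of them deleted, counts as 750 bytes.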

open override fun toString(): String
open fun useCompoundFile(infos: SegmentInfos, mergedInfo: SegmentCommitInfo, mergeContext: MergePolicy.MergeContext): Boolean

Returns true if a new segment (regardless of its origin) should use the compound file format. The default implementation returns true if and only if the size of the given mergedInfo is less than or equal to getMaxCFSSegmentSizeMB and also less than or equal to TotalIndexSize * getNoCFSRatio; otherwise it returns false.
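
The default rule stated above can be written as a two-part check. The following is a minimal sketch under stated assumptions, not the library's code: `shouldUseCompoundFile` is a hypothetical function, and all sizes are assumed to be in bytes already (the real setting is expressed in MB).

```kotlin
// Hypothetical helper mirroring the documented default rule. All sizes are
// assumed to be in the same unit (bytes).
fun shouldUseCompoundFile(
    mergedSizeBytes: Long,
    totalIndexSizeBytes: Long,
    maxCfsSegmentSizeBytes: Long,
    noCfsRatio: Double
): Boolean {
    // Condition 1: the merged segment must not exceed the maximum CFS segment size.
    if (mergedSizeBytes > maxCfsSegmentSizeBytes) return false
    // Condition 2: it must not exceed the allowed fraction of the total index size.
    return mergedSizeBytes <= totalIndexSizeBytes * noCfsRatio
}
```

With a 1,000-byte index, a 100-byte maximum CFS segment size, and a ratio of 0.1, a 10-byte merged segment passes both checks in this sketch, while a 200-byte segment fails the first one.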