MergePolicy

abstract class MergePolicy

Expert: a MergePolicy determines the sequence of primitive merge operations.

Whenever the segments in an index have been altered by IndexWriter, whether by the addition of a newly flushed segment, the addition of many segments from addIndexes* calls, or a previous merge that may now need to cascade, IndexWriter invokes findMerges to give the MergePolicy a chance to pick merges that are now required. This method returns a MergeSpecification instance describing the set of merges that should be done, or null if no merges are necessary. When IndexWriter.forceMerge is called, it calls findForcedMerges and the MergePolicy should then return the necessary merges.

Note that the policy can return more than one merge at a time. In this case, if the writer is using SerialMergeScheduler, the merges will be run sequentially, but if it is using ConcurrentMergeScheduler they will be run concurrently.

The default MergePolicy is TieredMergePolicy.
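The contract described above (return a specification holding zero or more merges, or null when nothing needs merging) can be illustrated with a minimal, self-contained sketch. The types SegmentSketch, OneMergeSketch and MergeSpecSketch, the parameters mergeFactor and smallLimit, and the grouping rule itself are hypothetical stand-ins chosen for illustration; they are not this library's classes and not the algorithm used by TieredMergePolicy.

```kotlin
// Hypothetical stand-ins for illustration only; not the library's classes.
data class SegmentSketch(val name: String, val sizeBytes: Long)
data class OneMergeSketch(val segments: List<SegmentSketch>)
data class MergeSpecSketch(val merges: List<OneMergeSketch>)

// Assumed toy rule: group segments smaller than smallLimit into merges of
// exactly mergeFactor segments each. Returns null when no merge is necessary,
// mirroring the findMerges contract described above.
fun findMergesSketch(
    segments: List<SegmentSketch>,
    mergeFactor: Int,
    smallLimit: Long
): MergeSpecSketch? {
    require(mergeFactor >= 2) { "mergeFactor must be at least 2" }
    val merges = segments
        .filter { it.sizeBytes < smallLimit }
        .chunked(mergeFactor)
        .filter { it.size == mergeFactor } // skip an incomplete trailing group
        .map { OneMergeSketch(it) }
    return if (merges.isEmpty()) null else MergeSpecSketch(merges)
}

// A serial scheduler would run the returned merges one after another;
// a null specification means there is nothing to do.
fun runSeriallySketch(spec: MergeSpecSketch?, runOne: (OneMergeSketch) -> Unit) {
    spec?.merges?.forEach(runOne)
}
```

Because a single call may return several merges, the scheduler, not the policy, decides whether they run one at a time or in parallel.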

Inheritors

Types

object Companion

class MergeAbortedException : IOException

Thrown when a merge was explicitly aborted because IndexWriter.abortMerges was called. Normally this exception is privately caught and suppressed by IndexWriter.

interface MergeContext

This interface represents the current context of the merge selection process. It gives access to real-time information such as the currently merging segments or how many deletes a segment would claim back if merged. This context might be stateful and change during the execution of a merge policy's selection process.


Exception thrown if there are any problems while executing a merge.


A MergeSpecification instance provides the information necessary to perform multiple merges. It simply contains a list of OneMerge instances.

open class OneMerge

OneMerge provides the information necessary to perform an individual primitive merge operation, resulting in a single new segment. The merge spec includes the subset of segments to be merged as well as whether the new segment should use the compound file format.


Progress and state for an executing merge. This class encapsulates the logic to pause and resume the merge thread or to abort the merge entirely.

Properties

open var noCFSRatio: Double

Functions


Determine what set of merge operations is necessary in order to expunge all deletes from the index.

abstract fun findForcedMerges(segmentInfos: SegmentInfos?, maxSegmentCount: Int, segmentsToMerge: MutableMap<SegmentCommitInfo, Boolean>?, mergeContext: MergePolicy.MergeContext?): MergePolicy.MergeSpecification?

Determine what set of merge operations is necessary in order to merge down to at most the specified segment count. IndexWriter calls this when its IndexWriter.forceMerge method is called. This call is always synchronized on the IndexWriter instance, so only one thread at a time will call this method.


Identifies merges that we want to execute (synchronously) on commit. By default, this returns the merges found by findMerges whose segments are all smaller than maxFullFlushMergeSize.
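The default filtering rule can be sketched as follows, assuming each candidate merge is reduced to the byte sizes of its segments. CandidateMergeSketch and fullFlushMergesSketch are hypothetical names used only for this illustration; the real implementation may apply further checks that this page does not describe.

```kotlin
// Hypothetical stand-in: a candidate merge described only by its segments' byte sizes.
data class CandidateMergeSketch(val segmentSizes: List<Long>)

// Keep only the candidates in which every segment is strictly smaller than
// maxFullFlushMergeSize (an assumed threshold in bytes).
fun fullFlushMergesSketch(
    candidates: List<CandidateMergeSketch>,
    maxFullFlushMergeSize: Long
): List<CandidateMergeSketch> =
    candidates.filter { merge -> merge.segmentSizes.all { it < maxFullFlushMergeSize } }
```

A candidate containing even one segment at or above the threshold is dropped as a whole, which keeps the merges that run on commit small.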


Define the set of merge operations to perform on provided codec readers in IndexWriter.addIndexes.

abstract fun findMerges(mergeTrigger: MergeTrigger?, segmentInfos: SegmentInfos?, mergeContext: MergePolicy.MergeContext?): MergePolicy.MergeSpecification?

Determine what set of merge operations is now necessary on the index. IndexWriter calls this whenever there is a change to the segments. This call is always synchronized on the IndexWriter instance, so only one thread at a time will call this method.


Returns true if the segment represented by the given CodecReader should be kept even if it is fully deleted. This is useful for testing or, for instance, if the merge policy implements retention policies for soft deletes.


Return the maximum size of segments to be included in full-flush merges by the default implementation of findFullFlushMerges.

open fun numDeletesToMerge(info: SegmentCommitInfo, delCount: Int, readerSupplier: IOSupplier<CodecReader>): Int

Returns the number of deletes that a merge would claim on the given segment. By default, this method returns the sum of the delete count on disk and the pending delete count. However, subclasses that wrap merge readers might modify this to reflect deletes that are carried over to the target segment in the case of soft deletes.

open fun size(info: SegmentCommitInfo, mergeContext: MergePolicy.MergeContext): Long

Return the byte size of the provided SegmentCommitInfo, prorated by percentage of non-deleted documents.
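As a rough illustration of what "prorated by percentage of non-deleted documents" means, the sketch below scales a raw byte size by the fraction of documents that are still live. The function name and its plain numeric parameters are assumptions made for this example; the real method reads these values from the SegmentCommitInfo and the MergeContext.

```kotlin
// Assumed inputs: the segment's raw byte size, its total document count (maxDoc),
// and the number of deleted documents a merge would claim (delCount).
fun proratedSizeSketch(byteSize: Long, maxDoc: Int, delCount: Int): Long {
    if (maxDoc <= 0) return byteSize // no documents: nothing to prorate
    val delRatio = delCount.toDouble() / maxDoc.toDouble()
    return (byteSize * (1.0 - delRatio)).toLong()
}
```

For example, a 1000-byte segment with 100 documents of which 25 are deleted is counted as 750 bytes under this sketch.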

open fun useCompoundFile(infos: SegmentInfos, mergedInfo: SegmentCommitInfo, mergeContext: MergePolicy.MergeContext): Boolean

Returns true if a new segment (regardless of its origin) should use the compound file format. The default implementation returns true if and only if the size of the given mergedInfo is less than or equal to getMaxCFSSegmentSizeMB and less than or equal to TotalIndexSize * getNoCFSRatio; otherwise it returns false.
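The rule stated above can be written out as a small sketch. It assumes all sizes are already expressed in the same unit (bytes here, whereas the documented limit is given in MB) and takes the segment sizes as plain numbers; useCompoundFileSketch and its parameters are hypothetical names, and the real implementation may handle edge cases that this description does not mention.

```kotlin
// Sketch of the documented rule only: use the compound file format when the merged
// segment is no larger than the absolute limit and no larger than
// noCFSRatio times the total index size.
fun useCompoundFileSketch(
    mergedSize: Long,            // size of the new segment (assumed bytes)
    allSegmentSizes: List<Long>, // sizes of all segments in the index (assumed bytes)
    maxCFSSegmentSize: Long,     // absolute size limit (assumed bytes)
    noCFSRatio: Double           // fraction of the total index size
): Boolean {
    val totalIndexSize = allSegmentSizes.sum()
    return mergedSize <= maxCFSSegmentSize && mergedSize <= totalIndexSize * noCFSRatio
}
```

With a total index size of 200 and a ratio of 0.1, a segment of size 10 qualifies for the compound format while a segment of size 50 does not.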