IndexWriterConfig
Holds all the configuration that is used to create an IndexWriter. Once an IndexWriter has been created with this object, changes to this object will not affect the IndexWriter instance. To change settings on a live writer, use the LiveIndexWriterConfig returned from IndexWriter.getConfig.
All setter methods return IndexWriterConfig to allow chaining settings conveniently, for example:
IndexWriterConfig conf = new IndexWriterConfig(analyzer); conf.setter1().setter2();
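A concrete version of the chaining pattern might look like the sketch below. It assumes Lucene 9 or later with lucene-core on the classpath. The class name ConfigChainingSketch, the in-memory ByteBuffersDirectory, and the specific values are illustrative choices, not part of the original documentation.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

// Hypothetical class name, used only for illustration.
public class ConfigChainingSketch {

  // Builds a config by chaining setters, opens a writer, then adjusts a live
  // setting through the LiveIndexWriterConfig returned by getConfig().
  static double run() throws Exception {
    try (Directory dir = new ByteBuffersDirectory();
         StandardAnalyzer analyzer = new StandardAnalyzer()) {
      // Each setter returns the IndexWriterConfig, so calls can be chained.
      IndexWriterConfig conf = new IndexWriterConfig(analyzer)
          .setOpenMode(OpenMode.CREATE)
          .setRAMBufferSizeMB(64.0)
          .setUseCompoundFile(true);

      try (IndexWriter writer = new IndexWriter(dir, conf)) {
        // After the writer exists, go through getConfig() for live changes.
        writer.getConfig().setRAMBufferSizeMB(32.0);
        return writer.getConfig().getRAMBufferSizeMB();
      }
    }
  }
}
```

Settings such as OpenMode are fixed once the writer is created; only the settings exposed as setters on LiveIndexWriterConfig can be adjusted afterwards.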
Since
3.1
See also
Types
Properties
True if an indexing thread should check for pending flushes on update in order to help out on a full flush.
True if calls to IndexWriter.close should first do a commit.
Compatibility version to use for this index.
FlushPolicy to control when segments are flushed.
The field names involved in the index sort
InfoStream for debugging messages.
The comparator for sorting leaf readers.
Returns the number of buffered added documents that will trigger a flush if enabled.
Amount of time to wait for merges returned by MergePolicy.findFullFlushMerges(...)
MergePolicy for selecting merges.
MergeScheduler to use for running merges.
OpenMode that IndexWriter is opened with.
parent document field
Expert: Sets the maximum memory consumption per thread that triggers a forced flush when exceeded. A DocumentsWriterPerThread is forcefully flushed once it exceeds this limit, even if the getRAMBufferSizeMB limit has not been exceeded. This is a safety limit to protect a DocumentsWriterPerThread from address space exhaustion due to its internal 32-bit signed integer based memory addressing. The given value must be less than 2 GB (2048 MB).
Returns the value set by setRAMBufferSizeMB if enabled.
Sets the hard upper bound on RAM usage for a single segment, after which the segment is forced to flush.
True if readers should be pooled.
Similarity to use when encoding norms.
soft deletes field
True if segment flushes should use compound file format
Functions
Returns the current merged segment warmer. See IndexReaderWarmer.
Expert: sets whether indexing threads check for pending flushes on update in order to help out flushing indexing buffers to disk. As a consequence, when this is disabled, threads calling DirectoryReader.openIfChanged or IndexWriter.flush will be the only threads writing segments to disk unless flushes are falling behind. If indexing is stalled due to too many pending flushes, indexing threads will help out writing pending segment flushes to disk.
Set the Codec.
Sets whether calls to IndexWriter.close should first commit before closing. Use true to match the behavior of Lucene 4.x.
Expert: Controls when segments are flushed to disk during indexing. The FlushPolicy is initialized during IndexWriter instantiation; once initialized, the given instance is bound to that IndexWriter and should not be used with another writer.
Expert: allows opening a specific commit point. The default is null, which opens the latest commit point. This can also be used to open an IndexWriter from a near-real-time reader, if you pass the reader's DirectoryReader.getIndexCommit.
Expert: set the compatibility version to use for this index. If the index is being created, it will use the given major version for compatibility. It is sometimes useful to set the previous major version for compatibility, because IndexWriter.addIndexes only accepts indices that have been written with the same major version as the current index. If the index already exists, this value is ignored. The default value is the major version of the latest Lucene version.
Expert: allows an optional IndexDeletionPolicy implementation to be specified. You can use this to control when prior commits are deleted from the index. The default policy is KeepOnlyLastCommitDeletionPolicy, which removes all prior commits as soon as a new commit is done (this matches behavior before 2.2). Creating your own policy lets you explicitly keep previous "point in time" commits alive in the index for some time, so that readers can refresh to the new commit without having the old commit deleted out from under them. This is necessary on filesystems like NFS that do not support "delete on last close" semantics, which Lucene's "point in time" search normally relies on.
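One way to keep an older commit alive is to wrap the default policy in Lucene's SnapshotDeletionPolicy. The following is a minimal sketch under the assumption of Lucene 9 or later; the class name KeepCommitSketch, the helper doc, and the field name "id" are hypothetical.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexCommit;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy;
import org.apache.lucene.index.SnapshotDeletionPolicy;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

// Hypothetical class name, used only for illustration.
public class KeepCommitSketch {

  // Illustrative helper: a document with a single stored id field.
  static Document doc(String id) {
    Document d = new Document();
    d.add(new StringField("id", id, Field.Store.YES));
    return d;
  }

  // Returns how many commit points exist while a snapshot of the first
  // commit is still held.
  static int commitsWhileSnapshotHeld() throws Exception {
    try (Directory dir = new ByteBuffersDirectory();
         StandardAnalyzer analyzer = new StandardAnalyzer()) {
      SnapshotDeletionPolicy policy =
          new SnapshotDeletionPolicy(new KeepOnlyLastCommitDeletionPolicy());
      IndexWriterConfig conf =
          new IndexWriterConfig(analyzer).setIndexDeletionPolicy(policy);

      try (IndexWriter writer = new IndexWriter(dir, conf)) {
        writer.addDocument(doc("1"));
        writer.commit();

        // Pin the first commit so the wrapped policy cannot delete it.
        IndexCommit held = policy.snapshot();
        try {
          writer.addDocument(doc("2"));
          writer.commit();
          return DirectoryReader.listCommits(dir).size();
        } finally {
          // Unpin the commit and let the writer clean up its files.
          policy.release(held);
          writer.deleteUnusedFiles();
        }
      }
    }
  }
}
```

While the snapshot is held, both the pinned commit and the newest commit remain in the directory; after release, the older one becomes eligible for deletion again.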
Set the Sort order to use for all (flushed and merged) segments.
Sets the IndexWriter this config is attached to.
Set event listener to record key events in IndexWriter
Convenience method that uses PrintStreamInfoStream. Must not be null.
Information about merges, deletes and a message when maxFieldLength is reached will be printed to this. Must not be null, but InfoStream.NO_OUTPUT may be used to suppress output.
Set the comparator for sorting leaf readers. A DirectoryReader opened from an IndexWriter with this configuration will have its leaf readers sorted with the provided leaf sorter.
Determines the minimal number of documents required before the buffered in-memory documents are flushed as a new Segment. Large values generally give faster indexing.
Expert: sets the amount of time to wait for merges (during IndexWriter.commit or IndexWriter.getReader) returned by MergePolicy.findFullFlushMerges(...). If this time is reached, we proceed with the commit based on segments merged up to that point. The merges are not aborted, and will still run to completion independent of the commit or getReader call, like natural segment merges. The default is IndexWriterConfig.DEFAULT_MAX_FULL_FLUSH_MERGE_WAIT_MILLIS.
Set the merged segment warmer. See IndexReaderWarmer.
Expert: MergePolicy is invoked whenever there are changes to the segments in the index. Its role is to select which merges to do, if any, and return a MergePolicy.MergeSpecification describing the merges. It also selects merges to do for forceMerge.
Expert: sets the merge scheduler used by this writer. The default is ConcurrentMergeScheduler.
Specifies OpenMode of the index.
Sets the parent document field. If this optional property is set, IndexWriter will add an internal field to every root document added to the index writer. A document is considered a parent document if it is the last document in a document block indexed via IndexWriter.addDocuments or IndexWriter.updateDocuments and their relatives. Additionally, all individual documents added via the single-document methods (IndexWriter.addDocument etc.) are also considered parent documents. This property is optional for all indices that don't use document blocks in combination with index sorting. A marker for parent documents is required in order to maintain the API guarantee that the document order of a block is not altered by the IndexWriter.
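The combination of index sorting and document blocks could be configured roughly as follows. This is a sketch that assumes a recent Lucene release in which IndexWriterConfig.setParentField is available; the class name ParentFieldSketch, the field names "ts", "id" and "_parent", and the sample values are all hypothetical.

```java
import java.util.List;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.NumericDocValuesField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

// Hypothetical class name, used only for illustration.
public class ParentFieldSketch {

  // Illustrative helper: a document with an id and a numeric sort value.
  static Document doc(String id, long ts) {
    Document d = new Document();
    d.add(new StringField("id", id, Field.Store.YES));
    d.add(new NumericDocValuesField("ts", ts));
    return d;
  }

  // Indexes one block (two children followed by their parent) into a sorted
  // index and returns the number of live documents.
  static int indexOneBlock() throws Exception {
    try (Directory dir = new ByteBuffersDirectory();
         StandardAnalyzer analyzer = new StandardAnalyzer()) {
      IndexWriterConfig conf = new IndexWriterConfig(analyzer)
          .setIndexSort(new Sort(new SortField("ts", SortField.Type.LONG)))
          // Assumed field name; it must not be used by regular documents.
          .setParentField("_parent");

      try (IndexWriter writer = new IndexWriter(dir, conf)) {
        // The last document of the block is treated as the parent.
        writer.addDocuments(List.of(
            doc("child-1", 10L), doc("child-2", 10L), doc("parent", 10L)));
        try (DirectoryReader reader = DirectoryReader.open(writer)) {
          return reader.numDocs();
        }
      }
    }
  }
}
```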
Determines the amount of RAM that may be used for buffering added documents and deletions before they are flushed to the Directory. Generally for faster indexing performance it's best to flush by RAM usage instead of document count and use as large a RAM buffer as you can.
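Flushing by RAM usage rather than by document count can be expressed as in the sketch below. The class name RamFlushSketch and the 256 MB value are illustrative assumptions; a suitable buffer size depends on the available heap.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriterConfig;

// Hypothetical class name, used only for illustration.
public class RamFlushSketch {

  // Builds a config that flushes on RAM usage only.
  static IndexWriterConfig build() {
    IndexWriterConfig conf = new IndexWriterConfig(new StandardAnalyzer());
    // Flush once buffered documents and deletions use about 256 MB.
    conf.setRAMBufferSizeMB(256.0);
    // Turn off the document-count trigger so only RAM usage causes flushes.
    conf.setMaxBufferedDocs(IndexWriterConfig.DISABLE_AUTO_FLUSH);
    return conf;
  }
}
```

The two triggers cannot both be disabled at the same time, so the RAM buffer is set before the document-count trigger is turned off.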
By default, IndexWriter does not pool the SegmentReaders it must open for deletions and merging, unless a near-real-time reader has been obtained by calling DirectoryReader.open(IndexWriter). This method lets you enable pooling without getting a near-real-time reader. NOTE: if you set this to false, IndexWriter will still pool readers once DirectoryReader.open is called.
Expert: set the Similarity implementation used by this IndexWriter.
Sets the soft-deletes field. A soft-deletes field in Lucene is a doc-values field that marks a document as soft-deleted if the document has at least one value in that field. If a document is marked as soft-deleted, it is treated as if it had been hard-deleted through the IndexWriter API (IndexWriter.deleteDocuments). Merges will reclaim soft-deleted as well as hard-deleted documents, and index readers obtained from the IndexWriter will reflect all deleted documents in their live docs. If soft deletes are used, documents must be indexed via IndexWriter.softUpdateDocument. Deletes are applied via IndexWriter.updateDocValues.
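A soft update might be wired up as in the following sketch, which assumes Lucene 9 or later. The class name SoftDeletesSketch, the field name "__soft_deletes", and the sample documents are hypothetical.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.NumericDocValuesField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

// Hypothetical class name, used only for illustration.
public class SoftDeletesSketch {

  // Assumed field name; any otherwise unused doc-values field name works.
  static final String SOFT_DELETES = "__soft_deletes";

  // Illustrative helper: a document with an id and a version label.
  static Document doc(String id, String version) {
    Document d = new Document();
    d.add(new StringField("id", id, Field.Store.YES));
    d.add(new StringField("version", version, Field.Store.YES));
    return d;
  }

  // Adds a document, soft-updates it, and returns the live document count
  // seen by a reader obtained from the writer.
  static int liveDocsAfterSoftUpdate() throws Exception {
    try (Directory dir = new ByteBuffersDirectory();
         StandardAnalyzer analyzer = new StandardAnalyzer()) {
      IndexWriterConfig conf =
          new IndexWriterConfig(analyzer).setSoftDeletesField(SOFT_DELETES);

      try (IndexWriter writer = new IndexWriter(dir, conf)) {
        writer.addDocument(doc("1", "v1"));
        // Marks the old version as soft-deleted and adds the new version.
        writer.softUpdateDocument(
            new Term("id", "1"),
            doc("1", "v2"),
            new NumericDocValuesField(SOFT_DELETES, 1));

        try (DirectoryReader reader = DirectoryReader.open(writer)) {
          // Only the new version counts as live.
          return reader.numDocs();
        }
      }
    }
  }
}
```

The old version stays physically in the segment until a merge reclaims it, which is what allows retention-style merge policies to keep it around for a while.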
Sets if the IndexWriter should pack newly written segments in a compound file. Default is true.