Lucene94FieldInfosFormat
Lucene 9.0 Field Infos format.
Field names are stored in the field info file, with suffix .fnm.
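FieldInfos (.fnm) --> Header, FieldsCount, <FieldName, FieldNumber, FieldBits, DocValuesBits, DocValuesGen, Attributes, DimensionCount, DimensionNumBytes>^FieldsCount, Footer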
Data types:
Header --> IndexHeader
FieldsCount --> DataOutput.writeVInt
FieldName --> DataOutput.writeString
FieldBits, IndexOptions, DocValuesBits --> DataOutput.writeByte
FieldNumber, DimensionCount, DimensionNumBytes --> DataOutput.writeInt
Attributes --> DataOutput.writeMapOfStrings
DocValuesGen --> DataOutput.writeLong
Footer --> CodecFooter
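As a rough illustration of the variable-length integer used for FieldsCount, the sketch below encodes an int with seven data bits per byte, low-order group first, and sets the high bit on every byte except the last. It is a standalone approximation of the VInt layout, not the library's code; the class and method names (VIntSketch, encodeVInt, decodeVInt) are hypothetical.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical standalone sketch of a VInt-style encoding; not the Lucene implementation.
final class VIntSketch {

    // Encodes a non-negative int: 7 data bits per byte, low-order group first.
    // The high bit (0x80) marks "more bytes follow".
    static byte[] encodeVInt(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
        return out.toByteArray();
    }

    // Decodes bytes produced by encodeVInt back into an int.
    static int decodeVInt(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break; // last byte of the value
            }
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] encoded = encodeVInt(300);
        System.out.println(encoded.length + " bytes, decoded = " + decodeVInt(encoded));
    }
}
```

Small field counts therefore occupy a single byte, which is the usual reason for choosing this encoding over a fixed four-byte int.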
Field Descriptions:
FieldsCount: the number of fields in this file.
FieldName: name of the field as a UTF-8 String.
FieldNumber: the field's number. Note that unlike previous versions of Lucene, fields are numbered explicitly rather than implicitly by their order in the file.
FieldBits: a byte containing field options.
The lowest-order bit (0x1) is one for fields that have term vectors stored, and zero for fields without term vectors.
If the second-lowest-order bit (0x2) is set, norms are omitted for the indexed field.
If the third-lowest-order bit (0x4) is set, payloads are stored for the indexed field.
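The three flags described for FieldBits can be tested with simple bit masks. The following is a minimal sketch that assumes only the bit positions listed above; the class, constant, and method names are hypothetical and are not taken from the library.

```java
// Hypothetical sketch: interprets the three FieldBits flags described above.
final class FieldBitsSketch {
    // Mask values come from the description; the constant names are assumptions.
    static final int TERM_VECTORS_BIT = 0x1;
    static final int OMIT_NORMS_BIT = 0x2;
    static final int PAYLOADS_BIT = 0x4;

    static boolean hasTermVectors(byte bits) {
        return (bits & TERM_VECTORS_BIT) != 0;
    }

    static boolean omitsNorms(byte bits) {
        return (bits & OMIT_NORMS_BIT) != 0;
    }

    static boolean storesPayloads(byte bits) {
        return (bits & PAYLOADS_BIT) != 0;
    }

    public static void main(String[] args) {
        byte bits = 0x5; // term vectors and payloads set, norms kept
        System.out.println("termVectors=" + hasTermVectors(bits)
                + " omitNorms=" + omitsNorms(bits)
                + " payloads=" + storesPayloads(bits));
    }
}
```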
IndexOptions: a byte containing index options.
0: not indexed
1: indexed as DOCS_ONLY
2: indexed as DOCS_AND_FREQS
3: indexed as DOCS_AND_FREQS_AND_POSITIONS
4: indexed as DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS
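One way to picture the IndexOptions byte is as an index into an ordered list of options. The sketch below mirrors the table above with a local enum; the enum and its NOT_INDEXED constant are illustrative stand-ins, not the library's own types, and the rejection of unknown values is an assumption about sensible reader behavior.

```java
// Hypothetical sketch: maps the IndexOptions byte to a local enum mirroring the table above.
final class IndexOptionsSketch {
    // Declaration order matches the byte values 0..4 from the description.
    enum Options {
        NOT_INDEXED,
        DOCS_ONLY,
        DOCS_AND_FREQS,
        DOCS_AND_FREQS_AND_POSITIONS,
        DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS
    }

    static Options fromByte(byte b) {
        Options[] values = Options.values();
        if (b < 0 || b >= values.length) {
            // Assumption: an unknown value indicates a corrupt or newer file.
            throw new IllegalArgumentException("invalid IndexOptions byte: " + b);
        }
        return values[b];
    }

    public static void main(String[] args) {
        System.out.println(fromByte((byte) 3));
    }
}
```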
DocValuesBits: a byte containing per-document value types. The type is recorded as two four-bit integers, with the high-order bits representing norms options, and the low-order bits representing DocValues options. Each four-bit integer can be decoded as such:
0: no DocValues for this field.
1: NumericDocValues. (DocValuesType.NUMERIC)
2: BinaryDocValues. (DocValuesType.BINARY)
3: SortedDocValues. (DocValuesType.SORTED)
DocValuesGen: the generation count of the field's DocValues. If this is -1, there are no DocValues updates to that field. Anything above zero means there are updates stored by DocValuesFormat.
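To make the DocValuesBits packing concrete, the sketch below splits a byte into its high and low four-bit integers and names the low one using the values listed above. It also shows the -1 check for DocValuesGen. All names are hypothetical, and only the four type codes given in this description are handled.

```java
// Hypothetical sketch: splits DocValuesBits into two four-bit integers.
final class DocValuesBitsSketch {

    // High-order four bits of the byte (mask first so the sign bit cannot leak in).
    static int highNibble(byte b) {
        return (b & 0xFF) >>> 4;
    }

    // Low-order four bits of the byte.
    static int lowNibble(byte b) {
        return b & 0x0F;
    }

    // Names a four-bit type code using only the values listed in the description.
    static String typeName(int code) {
        switch (code) {
            case 0: return "NONE";
            case 1: return "NUMERIC";
            case 2: return "BINARY";
            case 3: return "SORTED";
            default: return "UNLISTED(" + code + ")"; // not covered by this sketch
        }
    }

    // A generation of -1 means no DocValues updates exist for the field.
    static boolean hasUpdates(long docValuesGen) {
        return docValuesGen != -1L;
    }

    public static void main(String[] args) {
        byte bits = 0x12;
        System.out.println("high=" + highNibble(bits) + " low=" + typeName(lowNibble(bits)));
    }
}
```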
Attributes: a key-value map of codec-private attributes.
PointDimensionCount, PointNumBytes: these are non-zero only if the field is indexed as points, e.g. using org.gnit.lucenekmp.document.LongPoint.
VectorDimension: it is non-zero if the field is indexed as vectors.
VectorEncoding: a byte containing the encoding of vector values:
0: BYTE. Samples are stored as signed bytes.
1: FLOAT32. Samples are stored in IEEE 32-bit floating point format.
VectorSimilarityFunction: a byte containing the distance function used for similarity calculation.
0: EUCLIDEAN distance. (VectorSimilarityFunction.EUCLIDEAN)
1: DOT_PRODUCT similarity. (VectorSimilarityFunction.DOT_PRODUCT)
2: COSINE similarity. (VectorSimilarityFunction.COSINE)
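The vector-related bytes can be read the same way as the other enumerated bytes. The sketch below mirrors the two tables above with local enums and treats a zero VectorDimension as "no vectors"; the enum and method names are illustrative assumptions rather than the library's types.

```java
// Hypothetical sketch: decodes the VectorEncoding and VectorSimilarityFunction bytes.
final class VectorInfoSketch {
    // Declaration order matches the byte values in the description.
    enum Encoding { BYTE, FLOAT32 }
    enum Similarity { EUCLIDEAN, DOT_PRODUCT, COSINE }

    static Encoding encodingFromByte(byte b) {
        Encoding[] values = Encoding.values();
        if (b < 0 || b >= values.length) {
            throw new IllegalArgumentException("invalid vector encoding: " + b);
        }
        return values[b];
    }

    static Similarity similarityFromByte(byte b) {
        Similarity[] values = Similarity.values();
        if (b < 0 || b >= values.length) {
            throw new IllegalArgumentException("invalid similarity function: " + b);
        }
        return values[b];
    }

    // Per the description, a non-zero dimension means the field is indexed as vectors.
    static boolean hasVectors(int vectorDimension) {
        return vectorDimension != 0;
    }

    public static void main(String[] args) {
        System.out.println(encodingFromByte((byte) 1) + " / " + similarityFromByte((byte) 2));
    }
}
```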
Functions
Reads the FieldInfos previously written with write.
Writes the provided FieldInfos to the directory.