Lucene99FlatVectorsFormat
Lucene 9.9 flat vector format, which encodes numeric vector values.
.vec (vector data) file
For each field:
Vector data ordered by field, document ordinal, and vector dimension. When the vectorEncoding is BYTE, each sample is stored as a single byte. When it is FLOAT32, each sample is stored as an IEEE float in little-endian byte order.
DocIds, encoded by IndexedDISI.writeBitSet (sparse case only)
OrdToDoc, encoded by org.gnit.lucenekmp.util.packed.DirectMonotonicWriter (sparse case only)
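The vector data layout described above can be sketched in a few lines. This is a minimal illustration, not the codec's reader: the class and method names (FlatVecLayout, vectorOffset, readFloatVector) are hypothetical, offsets are relative to the start of one field's vector data, and any header, footer, or alignment padding in the real file is ignored.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical helper, not part of the library: models only the
// "ordered by document ordinal, then by dimension" layout of one field.
final class FlatVecLayout {
    /** Bytes per sample: 1 for BYTE encoding, 4 for FLOAT32. */
    static int bytesPerSample(boolean float32) {
        return float32 ? Float.BYTES : Byte.BYTES;
    }

    /** Byte offset of a vector, relative to the start of the field's vector data. */
    static long vectorOffset(int ordinal, int dimension, boolean float32) {
        return (long) ordinal * dimension * bytesPerSample(float32);
    }

    /** Decodes one FLOAT32 vector (little-endian IEEE floats) from a field's data. */
    static float[] readFloatVector(byte[] fieldData, int ordinal, int dimension) {
        ByteBuffer buf = ByteBuffer.wrap(fieldData).order(ByteOrder.LITTLE_ENDIAN);
        buf.position((int) vectorOffset(ordinal, dimension, true));
        float[] vector = new float[dimension];
        for (int i = 0; i < dimension; i++) {
            vector[i] = buf.getFloat();
        }
        return vector;
    }

    public static void main(String[] args) {
        // Two 3-dimensional vectors written back to back in little-endian order.
        ByteBuffer out = ByteBuffer.allocate(2 * 3 * Float.BYTES).order(ByteOrder.LITTLE_ENDIAN);
        for (float v : new float[] {1f, 2f, 3f, 4f, 5f, 6f}) {
            out.putFloat(v);
        }
        // Ordinal 1 is the second vector.
        System.out.println(Arrays.toString(readFloatVector(out.array(), 1, 3)));
    }
}
```

Because every vector of a field has the same dimension and sample width, a vector's position follows from its ordinal alone, which is why no per-vector offsets need to be stored.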
.vemf (vector metadata) file
For each field:
int32 field number
int32 vector similarity function ordinal
vlong offset to this field's vectors in the .vec file
vlong length of this field's vectors, in bytes
vint dimension of this field's vectors
int the number of documents having values for this field
int8 density marker: -2 means empty (no vector values); -1 means dense (all documents have a value for this field); 0 means sparse (some documents are missing values).
DocIds, encoded by IndexedDISI.writeBitSet
OrdToDoc, encoded by org.gnit.lucenekmp.util.packed.DirectMonotonicWriter (sparse case only)
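As a rough illustration of the per-field metadata entry listed above, the sketch below reads the fixed fields in the listed order from an in-memory buffer. It rests on several assumptions: the class VemfEntrySketch and its members are hypothetical, fixed-width integers are taken to be little-endian, vint and vlong are taken to use the usual 7-bits-per-byte variable-length encoding, and the file header, footer, DocIds, and OrdToDoc structures are left out entirely.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical model of one metadata entry; follows only the field list above.
final class VemfEntrySketch {
    int fieldNumber;
    int similarityOrdinal;
    long vectorDataOffset;
    long vectorDataLength;
    int dimension;
    int docCount;
    byte densityMarker;

    /** Reads a variable-length value: 7 data bits per byte, high bit set on all but the last byte. */
    static long readVLong(ByteBuffer in) {
        long value = 0;
        int shift = 0;
        while (true) {
            byte b = in.get();
            value |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
            shift += 7;
        }
    }

    /** Maps the int8 marker to the three cases named in the field list. */
    static String density(byte marker) {
        switch (marker) {
            case -2: return "empty";
            case -1: return "dense";
            case 0: return "sparse";
            default: throw new IllegalArgumentException("unexpected marker: " + marker);
        }
    }

    /** Reads the fields in the order they are listed (assumed little-endian fixed-width ints). */
    static VemfEntrySketch read(ByteBuffer in) {
        in.order(ByteOrder.LITTLE_ENDIAN);
        VemfEntrySketch e = new VemfEntrySketch();
        e.fieldNumber = in.getInt();
        e.similarityOrdinal = in.getInt();
        e.vectorDataOffset = readVLong(in);
        e.vectorDataLength = readVLong(in);
        e.dimension = (int) readVLong(in);
        e.docCount = in.getInt();
        e.densityMarker = in.get();
        return e;
    }
}
```

In the dense case a reader needs nothing beyond these fields, since document ids and vector ordinals coincide; the DocIds and OrdToDoc structures matter only when the marker indicates a sparse field.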
Functions
Returns a KnnVectorsReader to read the vectors from the index.
Returns a FlatVectorsWriter to write the vectors to the index.
Returns the maximum number of vector dimensions supported by this codec for the given field name.