VectorSimilarity
Raw vectors data layer in HNSW + move to base class
Describe the changes in the pull request
Use the new RawDataContainer interface in HNSW, currently with an explicit DataBlocksContainer implementation, and move the abstract vectors member to the base class.
This includes:
- Moving the serialization (save/restore) of the vectors in HNSW into the DataBlocksContainer's responsibility, since the blocks should no longer be accessed directly (the same should later be applied to the graph data blocks).
- Improving the HNSW parallel tests (both unit and flow) to handle resizing of the vector blocks. Running parallel insertions into HNSW without going through the tiered index is not possible in production today, but we do have functional tests that do so. Until now, these tests never hit an unsafe resize while other operations were running in parallel, because enough blocks were reserved in advance (which is no longer the case). To make these tests safe, they now use a global r/w lock to protect the index during resizing (similar to the mechanism in the tiered index).
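
For reviewers, the locking pattern used in the tests can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual test code: `ToyIndex`, `growByBlock`, `store`, and `runParallelInserts` are hypothetical names, and a plain `std::vector<int>` stands in for the vector blocks. It assumes that writes to distinct slots are safe in parallel as long as no resize is in progress, so inserts take the lock in shared mode and only a resize takes it exclusively.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for an index whose storage grows one block at a time.
class ToyIndex {
public:
    explicit ToyIndex(std::size_t block_size) : block_size_(block_size) {}

    std::size_t capacity() const { return slots_.size(); }

    // Unsafe while any other thread touches the storage: may reallocate.
    void growByBlock() { slots_.resize(slots_.size() + block_size_); }

    // Safe to call in parallel for distinct ids, provided no resize is running.
    void store(std::size_t id, int value) {
        assert(id < slots_.size());
        slots_[id] = value;
    }

    int load(std::size_t id) const { return slots_[id]; }

private:
    std::size_t block_size_;
    std::vector<int> slots_;
};

// Inserts n_threads * per_thread items in parallel and returns how many
// slots hold the expected value afterwards.
std::size_t runParallelInserts(std::size_t n_threads, std::size_t per_thread) {
    ToyIndex index(8);
    std::shared_mutex index_guard; // the "global r/w lock" owned by the test
    std::atomic<std::size_t> next_id{0};

    auto worker = [&]() {
        for (std::size_t i = 0; i < per_thread; ++i) {
            const std::size_t id = next_id.fetch_add(1);
            const int value = static_cast<int>(id) + 1;
            {
                // Common path: shared lock, no resize needed.
                std::shared_lock<std::shared_mutex> shared(index_guard);
                if (id < index.capacity()) {
                    index.store(id, value);
                    continue;
                }
            }
            // Slow path: exclusive lock, so no other thread reads or writes
            // the storage while it is being resized.
            std::unique_lock<std::shared_mutex> exclusive(index_guard);
            while (id >= index.capacity()) {
                index.growByBlock();
            }
            index.store(id, value);
        }
    };

    std::vector<std::thread> threads;
    for (std::size_t t = 0; t < n_threads; ++t) {
        threads.emplace_back(worker);
    }
    for (auto &th : threads) {
        th.join();
    }

    // All workers have joined, so the storage can be read without locking.
    std::size_t correct = 0;
    const std::size_t total = n_threads * per_thread;
    for (std::size_t id = 0; id < total; ++id) {
        if (index.load(id) == static_cast<int>(id) + 1) {
            ++correct;
        }
    }
    return correct;
}
```

The capacity check is repeated under the exclusive lock because another thread may already have grown the storage between releasing the shared lock and acquiring the exclusive one.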
Mark if applicable
- [ ] This PR introduces API changes
- [ ] This PR introduces serialization changes