solr icon indicating copy to clipboard operation
solr copied to clipboard

SOLR-17780: Add support for scalar quantized dense vectors

Open liangkaiwen opened this issue 6 months ago • 1 comments

https://issues.apache.org/jira/browse/SOLR-17780

Description

Lucene 9.9 codec includes an implementation of scalar quantized dense vectors. This is not yet supported in Solr.

Solution

  • Add a new schema type ScalarQuantizedDenseVectorField
  • Add usage of Lucene99ScalarQuantizedVectorsFormat / Lucene99HnswScalarQuantizedVectorsFormat in SchemaCodecFactory. This incorporates the respective index reader/writers

Tests

TODO

Checklist

Please review the following and check all that apply:

  • [X] I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • [X] I have created a Jira issue and added the issue ID to my pull request title.
  • [x] I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
  • [X] I have developed this patch against the main branch.
  • [ ] I have run ./gradlew check.
  • [ ] I have added tests for my changes.
  • [ ] I have added documentation for the Reference Guide

liangkaiwen avatar Jun 11 '25 21:06 liangkaiwen

I still need to add tests and update documentation

liangkaiwen avatar Jun 11 '25 21:06 liangkaiwen

Hi @liangkaiwen , I rebased, added CHANGES.txt, and now tests are broken. I don't have bandwidth now to investigate, can you? Let me know under the comments, I'm happy to finalise this and merge it soon!

alessandrobenedetti avatar Aug 29 '25 10:08 alessandrobenedetti

Just a quick comment: next time please don't rebase a PR; simply merge main into it. I've found rebasing PRs to be a bad practice as it resets the GitHub review state, making it impossible to be certain what incremental changes happened (if any).

dsmiley avatar Aug 29 '25 13:08 dsmiley

Just a quick comment: next time please don't rebase a PR; simply merge main into it. I've found rebasing PRs to be a bad practice as it resets the GitHub review state, making it impossible to be certain what incremental changes happened (if any).

Wasn't it mandatory to comply with Crave checks? That's the only reason I (and my colleagues at least) are doing it, has anything changed?

alessandrobenedetti avatar Sep 01 '25 20:09 alessandrobenedetti

I don't recall Crave being confused by a merge from "main". I do recall once in a while (rare-ish), GitHub PR would show changes from the merge that it shouldn't.

dsmiley avatar Sep 02 '25 01:09 dsmiley

@alessandrobenedetti updated + fixed tests

liangkaiwen avatar Sep 02 '25 16:09 liangkaiwen

Checks and tests are still failing @liangkaiwen , I'll give it a try now, If I fix them in 1 hour I'll merge, if not I'll ask you to give a double check, keep you posted!

alessandrobenedetti avatar Sep 03 '25 14:09 alessandrobenedetti

I fixed the tidy problem, but the tests (the ones expected to throw exceptions) are still failing. I gave it a very quick try, with debug, but it doesn't look obvious to me, as I'm not familiar with that type of test with an expected error string, so in case you can't manage to fix them, I'll give it a try, spending a bit more time. Let me know!

alessandrobenedetti avatar Sep 03 '25 14:09 alessandrobenedetti

Huh, I thought it had passed after I committed my last changes but I guess not. No problem, I will handle it

liangkaiwen avatar Sep 03 '25 18:09 liangkaiwen