CNDB-13952: Handle Chronicle Map entry overflow in vector index compaction
What is the issue
Fixes: https://github.com/riptano/cndb/issues/13952
What does this PR fix and why was it fixed
In some cases (entries with an excessive number of duplicate vectors) we were hitting Chronicle Map's per-entry size limit. This change handles that exception by reducing the space needed to store the duplicates: the duplicate ordinals are written as varints instead of plain 4-byte integers.
Note that most graphs have only a handful of duplicated vectors, so we do not optimize for the large-dupe case. Further, Chronicle Map allocates a minimum chunk per entry and we are usually well under that size, so there is no benefit to writing the ints as varints unconditionally; the varint encoding is used only as a fallback when the plain encoding overflows, as sketched below.
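To make the pattern concrete, here is a minimal sketch, not the actual patch: `writePlainInts`, `writeVarInts`, and `putWithFallback` are hypothetical names, and the assumption that an oversized Chronicle Map entry surfaces as an `IllegalArgumentException` is ours; the real code may detect the overflow differently.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Map;

public final class DuplicateOrdinalEncoder
{
    /** Plain encoding: a fixed 4 bytes per duplicate ordinal (big-endian). */
    static byte[] writePlainInts(int[] ordinals)
    {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int v : ordinals)
        {
            out.write(v >>> 24);
            out.write(v >>> 16);
            out.write(v >>> 8);
            out.write(v);
        }
        return out.toByteArray();
    }

    /** Unsigned LEB128-style varint: 1-5 bytes per ordinal; small values stay small. */
    static byte[] writeVarInts(int[] ordinals)
    {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int v : ordinals)
        {
            while ((v & ~0x7F) != 0)
            {
                out.write((v & 0x7F) | 0x80); // low 7 bits, continuation bit set
                v >>>= 7;
            }
            out.write(v); // final byte, continuation bit clear
        }
        return out.toByteArray();
    }

    /**
     * The fallback pattern described above: encode plainly first (the common,
     * small case that fits the minimum chunk anyway), and only re-encode as
     * varints when the map rejects the entry as too large.
     */
    static void putWithFallback(Map<float[], byte[]> map, float[] vector, int[] dupes)
        throws IOException
    {
        try
        {
            map.put(vector, writePlainInts(dupes));
        }
        catch (IllegalArgumentException tooLarge) // assumed oversized-entry failure mode
        {
            map.put(vector, writeVarInts(dupes));
        }
    }
}
```

A plain int always costs 4 bytes, while an unsigned varint costs 1 byte for values below 128 and at most 5 bytes, so the re-encoded entry shrinks whenever most duplicate ordinals are small.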
Checklist before you submit for review
- [ ] Make sure there is a PR in the CNDB project updating the Converged Cassandra version
- [ ] Use `NoSpamLogger` for log lines that may appear frequently in the logs
- [ ] Verify test results on Butler
- [ ] Test coverage for new/modified code is > 80%
- [ ] Proper code formatting
- [ ] Proper title for each commit starting with the project-issue number, like CNDB-1234
- [ ] Each commit has a meaningful description
- [ ] Each commit is not very long and contains related changes
- [ ] Renames, moves and reformatting are in distinct commits
- [ ] All new files should contain the DataStax copyright header instead of the Apache License one
@eolivelli - this is ready for another review, please take a look
Quality Gate passed
Issues
- 0 New issues
- 0 Accepted issues
Measures
- 0 Security Hotspots
- 83.5% Coverage on New Code
- 0.0% Duplication on New Code
:heavy_check_mark: Build ds-cassandra-pr-gate/PR-1731 approved by Butler