feat: Implement KVConfig multi-backend architecture
- Add LMDBConfig class with comprehensive configuration options
- Update kv_get, kv_batch_get, kv_exists functions to accept KVConfig parameter
- Add backend validation and error handling for multiple backends
- Update test cases to use new KVConfig architecture
- Maintain backward compatibility with legacy URI parameters
- Provide factory methods for easy configuration creation
This implements the multi-KV backend architecture discussed in PR feedback, enabling support for different storage backends (Lance for AI/ML workloads, LMDB for high-performance caching) through a unified configuration interface.
Changes Made
Related Issues
Checklist
- [ ] Documented in API Docs (if applicable)
- [ ] Documented in User Guide (if applicable)
- [ ] If adding a new documentation page, doc is added to
docs/mkdocs.ymlnavigation - [ ] Documentation builds and is formatted properly (tag @/ccmao1130 for docs review)
Codecov Report
:x: Patch coverage is 34.90832% with 923 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 71.80%. Comparing base (b9ab62b) to head (efedba4).
:warning: Report is 10 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #5170 +/- ##
==========================================
+ Coverage 71.57% 71.80% +0.22%
==========================================
Files 960 971 +11
Lines 124203 125179 +976
==========================================
+ Hits 88897 89881 +984
+ Misses 35306 35298 -8
| Files with missing lines | Coverage Δ | |
|---|---|---|
| daft/__init__.py | 86.95% <100.00%> (+0.28%) |
:arrow_up: |
| daft/functions/__init__.py | 100.00% <100.00%> (ø) |
|
| src/daft-dsl/src/functions/kv/registry.rs | 100.00% <100.00%> (ø) |
|
| src/daft-dsl/src/functions/mod.rs | 89.01% <ø> (ø) |
|
| src/daft-dsl/src/lib.rs | 100.00% <100.00%> (ø) |
|
| ...ft-local-execution/src/intermediate_ops/project.rs | 97.36% <ø> (+58.25%) |
:arrow_up: |
| src/daft-parquet/src/stream_reader.rs | 88.99% <100.00%> (+0.12%) |
:arrow_up: |
| src/daft-session/src/options.rs | 66.66% <ø> (ø) |
|
| src/lib.rs | 97.34% <100.00%> (+0.02%) |
:arrow_up: |
| src/daft-writers/src/parquet_writer.rs | 91.50% <90.00%> (-0.16%) |
:arrow_down: |
| ... and 13 more |
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
- :package: JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.
@rchowell I fixed your comment. Please take a look when you have time.
@Jay-ju, It looks like @rchowell approved 🎉. Unfortunately it appears there are now some merge conflicts. If you could get those resolved, we could get this merged and included into the next release.
@Jay-ju - Thanks for the work on this. Checking to see if you are able to push this through.
@universalmind303 @madvart
- Thank you very much for your patience. I have fixed the conflicts.
- However, some CI processes are currently failing. After checking, I think it's not an issue with my PR. I see that all of them are reporting errors like this, and I feel that it should be a PR that affects the stability of the ci which has been merged back into the main branch.
- Please review the content of the PR again.
@Jay-ju I'm seeing quite a few errors in the unit tests that seem to be from some unwrapping of pyresult's
E daft.exceptions.DaftCoreException: DaftError::External task 690926 panicked with message "called `Result::unwrap()` on an `Err` value: PyErr { type: <class 'daft.exceptions.DaftCoreException'>, value: DaftCoreException('DaftError::CatalogError KV Store with name embeddings_1765267090922382 not found!'), traceback: None }"
@universalmind303 Please help review the changes here when you have time?