chroma
[ENH]: (Rust client): add EF config
Description of changes
Main changes:
- There are now separate sparse and dense EF traits.
- EF traits have a config GAT which must implement TryInto<EmbeddingFunctionConfiguration>.
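For readers unfamiliar with the pattern, here is a rough sketch of what a dense EF trait with a config GAT could look like. It is not the code from this PR: only the names `DenseEmbeddingFunction`, `EmbeddingFunctionConfiguration`, `get_name`, `get_config` and `build_from_config` come from the description. The method signatures, the lifetime-parameterised `Config<'a>`, the fields of `EmbeddingFunctionConfiguration`, and the `Toy*` types are all assumptions made for illustration.

```rust
use std::collections::BTreeMap;
use std::convert::{TryFrom, TryInto};

/// Hypothetical stand-in for the persisted configuration type.
#[derive(Debug, Clone, PartialEq)]
pub struct EmbeddingFunctionConfiguration {
    pub name: String,
    pub params: BTreeMap<String, String>,
}

/// Sketch of a dense EF trait (signatures are assumptions). `Config` is a
/// generic associated type, so a config may borrow from the EF it describes.
pub trait DenseEmbeddingFunction: Sized {
    type Error: std::fmt::Debug;
    type Config<'a>: TryInto<EmbeddingFunctionConfiguration>
    where
        Self: 'a;

    fn get_name() -> &'static str;
    fn get_config(&self) -> Self::Config<'_>;
    fn build_from_config(config: Self::Config<'_>) -> Result<Self, Self::Error>;
    fn embed(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>, Self::Error>;
}

/// Toy EF (hypothetical): embeds each text as its scaled byte length.
pub struct ToyEmbeddingFunction {
    model: String,
    scale: f32,
}

/// Borrowed config for the toy EF; this is what makes the GAT useful.
pub struct ToyConfig<'a> {
    pub model: &'a str,
    pub scale: f32,
}

/// Bridge to the persisted form; `TryInto` comes for free via the blanket impl.
impl<'a> TryFrom<ToyConfig<'a>> for EmbeddingFunctionConfiguration {
    type Error = String;

    fn try_from(config: ToyConfig<'a>) -> Result<Self, Self::Error> {
        if !config.scale.is_finite() {
            return Err(format!("scale must be finite, got {}", config.scale));
        }
        let mut params = BTreeMap::new();
        params.insert("model".to_string(), config.model.to_string());
        params.insert("scale".to_string(), config.scale.to_string());
        Ok(EmbeddingFunctionConfiguration {
            name: ToyEmbeddingFunction::get_name().to_string(),
            params,
        })
    }
}

impl DenseEmbeddingFunction for ToyEmbeddingFunction {
    type Error = String;
    type Config<'a> = ToyConfig<'a> where Self: 'a;

    fn get_name() -> &'static str {
        "toy"
    }

    fn get_config(&self) -> Self::Config<'_> {
        ToyConfig { model: &self.model, scale: self.scale }
    }

    fn build_from_config(config: Self::Config<'_>) -> Result<Self, Self::Error> {
        if config.model.is_empty() {
            return Err("model must not be empty".to_string());
        }
        Ok(ToyEmbeddingFunction { model: config.model.to_string(), scale: config.scale })
    }

    fn embed(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>, Self::Error> {
        Ok(texts.iter().map(|t| vec![t.len() as f32 * self.scale]).collect())
    }
}

/// Manual round trip: EF -> config -> persisted form, and config -> EF.
pub fn round_trip_example(
) -> Result<(EmbeddingFunctionConfiguration, ToyEmbeddingFunction), String> {
    let ef = ToyEmbeddingFunction { model: "toy-v1".to_string(), scale: 0.5 };
    let persisted: EmbeddingFunctionConfiguration = ef.get_config().try_into()?;
    let rebuilt = ToyEmbeddingFunction::build_from_config(ef.get_config())?;
    Ok((persisted, rebuilt))
}
```

In this sketch the caller drives both directions by hand, which matches the manual get_config() / build_from_config() / .try_into() flow described in this PR.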
This does not introduce any machinery to automatically persist/hydrate EFs--currently, users must call get_config() / build_from_config() themselves (or .try_into() on the config). We can add that as a follow-up later but it's pretty non-trivial to build:
- must implement a registry system to map EF names to implementations, allowing third-party crates to register custom EFs
- need to remove GATs or type erase EFs so we can store refs to generic EFs during hydration
- signatures of methods like query() will have to change, which is a significant breaking change
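To make the follow-up concrete, one possible shape for the registry plus type erasure is sketched below. Nothing here exists in the PR: `ErasedEmbeddingFunction`, `EmbeddingFunctionRegistry`, `register`, `hydrate`, the `LengthEmbedding` toy EF, and the fields of `EmbeddingFunctionConfiguration` are hypothetical names chosen for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the persisted configuration type.
#[derive(Debug, Clone, PartialEq)]
pub struct EmbeddingFunctionConfiguration {
    pub name: String,
    pub params: HashMap<String, String>,
}

/// Object-safe view of an EF: no GATs and no generic methods, so it can be
/// stored behind `Box<dyn ...>` during hydration.
pub trait ErasedEmbeddingFunction {
    fn name(&self) -> &str;
    fn embed(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>, String>;
}

type BuildResult = Result<Box<dyn ErasedEmbeddingFunction>, String>;
type Builder = Box<dyn Fn(&EmbeddingFunctionConfiguration) -> BuildResult + Send + Sync>;

/// Maps EF names to builder functions; third-party crates would call
/// `register` with their own builders.
#[derive(Default)]
pub struct EmbeddingFunctionRegistry {
    builders: HashMap<String, Builder>,
}

impl EmbeddingFunctionRegistry {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn register<F>(&mut self, name: &str, builder: F)
    where
        F: Fn(&EmbeddingFunctionConfiguration) -> BuildResult + Send + Sync + 'static,
    {
        self.builders.insert(name.to_string(), Box::new(builder));
    }

    /// Looks up the builder by the persisted name and runs it.
    pub fn hydrate(&self, config: &EmbeddingFunctionConfiguration) -> BuildResult {
        let builder = self
            .builders
            .get(&config.name)
            .ok_or_else(|| format!("no embedding function registered as '{}'", config.name))?;
        builder(config)
    }
}

/// Toy EF (hypothetical): embeds each text as its scaled byte length.
struct LengthEmbedding {
    scale: f32,
}

impl ErasedEmbeddingFunction for LengthEmbedding {
    fn name(&self) -> &str {
        "length"
    }

    fn embed(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>, String> {
        Ok(texts.iter().map(|t| vec![t.len() as f32 * self.scale]).collect())
    }
}

fn build_length(config: &EmbeddingFunctionConfiguration) -> BuildResult {
    let scale = config
        .params
        .get("scale")
        .ok_or_else(|| "missing 'scale' parameter".to_string())?
        .parse::<f32>()
        .map_err(|e| e.to_string())?;
    Ok(Box::new(LengthEmbedding { scale }))
}

/// Registers the toy EF, then hydrates it from a persisted config.
pub fn hydrate_example() -> Result<Vec<Vec<f32>>, String> {
    let mut registry = EmbeddingFunctionRegistry::new();
    registry.register("length", build_length);

    let mut params = HashMap::new();
    params.insert("scale".to_string(), "2".to_string());
    let persisted = EmbeddingFunctionConfiguration { name: "length".to_string(), params };

    let ef = registry.hydrate(&persisted)?;
    ef.embed(&["abc", "de"])
}
```

The trade-off is visible in the sketch: the erased trait has to give up the typed config GAT, so each builder parses untyped parameters itself. That is part of why this was left out of the PR.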
Test plan
How are these changes tested?
- [x] Tests pass locally with pytest for python, yarn test for js, cargo test for rust
Migration plan
Are there any migrations, or any forwards/backwards compatibility changes needed in order to make sure this change deploys reliably?
Observability plan
What is the plan to instrument and monitor this change?
Documentation Changes
Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs section?
Reviewer Checklist
Please leverage this checklist to ensure your code review is thorough before approving
Testing, Bugs, Errors, Logs, Documentation
- [ ] Can you think of any use case in which the code does not behave as intended? Have they been tested?
- [ ] Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
- [ ] If appropriate, are there adequate property based tests?
- [ ] If appropriate, are there adequate unit tests?
- [ ] Should any logging, debugging, tracing information be added or removed?
- [ ] Are error messages user-friendly?
- [ ] Have all documentation changes needed been made?
- [ ] Have all non-obvious changes been commented?
System Compatibility
- [ ] Are there any potential impacts on other parts of the system or backward compatibility?
- [ ] Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?
Quality
- [ ] Is this code of an unexpectedly high quality (Readability, Modularity, Intuitiveness)?
Embed-Function Config GATs, Dense/Sparse Traits & Collection Helpers
Introduces explicit DenseEmbeddingFunction and SparseEmbeddingFunction traits with a config GAT, enabling round-trip serialise / deserialise of embedding-function parameters. Built-in BM25 and Ollama implementations are updated to comply, including config structs, TryFrom ⇄ TryInto<EmbeddingFunctionConfiguration> bridges, and helper builders. Supporting refactors touch collection/client helpers, schema validation and several utility impls (Key: AsRef<str>).
Key Changes
• Split generic EmbeddingFunction into DenseEmbeddingFunction and SparseEmbeddingFunction in rust/chroma/src/embed/mod.rs with new required fns build_from_config, get_config, get_name.
• Added config structs BM25Config and OllamaEmbeddingFunctionConfig + TryFrom → EmbeddingFunctionConfiguration implementations for persistence.
• Refactored BM25SparseEmbeddingFunction and OllamaEmbeddingFunction to GAT-based config, added error variants for (de)serialisation, simplified encode path (BM25 now infallible).
• Client/collection QoL: centralised ChromaCollection::new, replaced ad-hoc struct literal construction; ChromaHttpClient list/create now use that helper.
• Execution layer: added duplicate impl AsRef<str> for Key (and used in frontend key validation).
• Frontend fix: schema validation now calls key.as_ref() instead of allocating to_string().
• Misc clean-ups: hard-coded timeouts noted, TODOs added, minor doc / test updates.
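The Key: AsRef<str> change can be illustrated with a small sketch. The `Key` variants and their string forms below are assumptions about the real type in rust/types/src/execution/operator.rs, and `is_known_key` is a hypothetical helper, not the frontend's actual validation code. The point is only that borrowing through as_ref() avoids the per-key String that to_string() would allocate.

```rust
use std::collections::HashSet;

/// Assumed shape of `Key`; the real enum may differ.
pub enum Key {
    Document,
    Embedding,
    Metadata,
    Score,
    MetadataField(String),
}

impl AsRef<str> for Key {
    fn as_ref(&self) -> &str {
        match self {
            // The "#..." spellings are assumptions for illustration.
            Key::Document => "#document",
            Key::Embedding => "#embedding",
            Key::Metadata => "#metadata",
            Key::Score => "#score",
            Key::MetadataField(name) => name.as_str(),
        }
    }
}

/// Hypothetical validation helper: checks a key against known schema
/// fields by borrowing its name instead of allocating a new String.
pub fn is_known_key(key: &Key, schema_fields: &HashSet<String>) -> bool {
    let name: &str = key.as_ref();
    schema_fields.contains(name)
}
```

`HashSet<String>::contains` accepts a `&str` because `String: Borrow<str>`, so the lookup needs no temporary allocation.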
Affected Areas
• rust/chroma/src/embed/* (traits + implementations)
• rust/chroma/src/client/chroma_http_client.rs
• rust/chroma/src/collection.rs
• rust/types/src/execution/operator.rs
• rust/frontend/src/impls/service_based_frontend.rs
This summary was automatically generated by @propel-code-bot
Closing this out as it was not landed. Can pick up as needed.