Switch to ONNX model for default embedding model
Description of changes
- Improvements & Bug fixes
- None
- New functionality
- Adds an ONNX port of sentence-transformers all-MiniLM-L6-v2 in order to remove the dependencies on pytorch, sentence-transformers, sentencepiece, and other heavy dependencies. This reduces the on-disk environment size needed to run chroma from ~900MB to ~300MB
- The ONNX port and verification of its accuracy live in https://github.com/chroma-core/onnx-embedding
- The ONNX model is hosted on S3 after being generated in the above repo
- The implementation here runs the ONNX model and then applies mean pooling in numpy, since mean pooling is the model's final layer.
- The embedding function downloads the model with a tqdm progress bar, matching the previous download experience
- If the model is already cached, it is not downloaded again.
- In contrast to before, the model is now ONLY downloaded when it is first used!
- The net-new dependencies are onnxruntime and tokenizers; both are lightweight.
- Updated the default embedding model to be this one instead of sentence-transformers.
- We create a new DefaultEmbeddingFunction which aliases the ONNX embedding function
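The mean-pooling step described above is simple enough to do directly in numpy. A minimal sketch of that pooling (names and shapes are illustrative, not the actual Chroma implementation):

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings across the sequence, ignoring padding.

    token_embeddings: (batch, seq_len, dim) output of the ONNX model
    attention_mask:   (batch, seq_len), 1 for real tokens, 0 for padding
    """
    mask = attention_mask[..., np.newaxis].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)       # sum real tokens only
    counts = np.clip(mask.sum(axis=1), 1e-9, None)        # avoid divide-by-zero
    return summed / counts                                # (batch, dim)
```

This mirrors how sentence-transformers pools all-MiniLM-L6-v2 token outputs into a single sentence vector, but without the torch dependency.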
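The download-once-then-cache behavior can be sketched as below; the URL, cache path, and function name are placeholders, not the real S3 location or Chroma's actual code:

```python
import os
import urllib.request

from tqdm import tqdm  # third-party, used for the download progress bar

def ensure_model(url: str, cache_path: str) -> str:
    """Download the model to cache_path unless it is already cached."""
    if os.path.exists(cache_path):  # cached: skip the download entirely
        return cache_path
    os.makedirs(os.path.dirname(cache_path), exist_ok=True)
    with urllib.request.urlopen(url) as resp, open(cache_path, "wb") as out:
        total = int(resp.headers.get("Content-Length", 0))
        with tqdm(total=total, unit="B", unit_scale=True) as bar:
            while chunk := resp.read(8192):
                out.write(chunk)
                bar.update(len(chunk))
    return cache_path
```

Because the download happens inside the embedding function rather than at import time, the model is only fetched when embeddings are actually requested.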
Test plan
Added a test that exercises multiple batches with the new model
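The shape of such a multi-batch test can be sketched as follows; the real test runs the ONNX model, so a stub embedding function stands in here and all names are illustrative:

```python
class StubEmbeddingFunction:
    """Stands in for the ONNX embedding function; returns fixed-size vectors."""
    DIM = 384  # all-MiniLM-L6-v2 output dimensionality

    def __call__(self, texts):
        return [[0.0] * self.DIM for _ in texts]

def check_multiple_batches(embed, batch_size=10, n_docs=25):
    """Embed documents in fixed-size batches and verify the combined output."""
    docs = [f"document {i}" for i in range(n_docs)]
    embeddings = []
    for i in range(0, n_docs, batch_size):
        embeddings.extend(embed(docs[i:i + batch_size]))
    assert len(embeddings) == n_docs
    assert all(len(e) == StubEmbeddingFunction.DIM for e in embeddings)

check_multiple_batches(StubEmbeddingFunction())
```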
Documentation Changes
Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs repository?

We will need to update the documentation at https://docs.trychroma.com/embeddings#default-sentence-transformers to highlight this change. This PR will not merge until that documentation is live.
Chroma-core/onnx-embedding's "compare_onnx.py" runs two benchmarks to verify that the two models are equivalent. Would you prefer I turn it into a test? I didn't anticipate us ever touching this again, so I left it hacky.
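The equivalence check amounts to embedding the same texts with both models and comparing the vectors pairwise. A hedged sketch of that comparison (the threshold and array names are illustrative, not taken from compare_onnx.py):

```python
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Row-wise cosine similarity between two (n, dim) embedding matrices."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return (a * b).sum(axis=1)

# With st_emb and onnx_emb as (n, dim) arrays from each model, one would
# assert something like: cosine_sim(st_emb, onnx_emb).min() > 0.99
```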
That should be create_onnx - will fix thanks!
Docs added - https://github.com/chroma-core/docs/pull/65