chroma
Remove sentence-transformers as a hard requirement
Currently we use sentence-transformers as the default embedding model. However, this means that it and a lot of its deps are included in the project. Additionally, it downloads the model on startup, which hurts startup time. Furthermore, it makes Chroma not installable on certain envs, like Python 3.11.
Will close
- https://github.com/chroma-core/chroma/issues/163
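For illustration only, here is a minimal sketch of what "not a hard requirement" could look like in Python: the heavy package is imported on first use rather than at install or import time. The class name `LazyEmbeddingFunction`, its parameters, and the default model name are assumptions made for this example, not Chroma's actual implementation.

```python
import importlib


class LazyEmbeddingFunction:
    """Hypothetical wrapper that defers importing a heavy embedding backend.

    Nothing is imported or downloaded until the first call, so a project
    can list the backend as an optional dependency instead of a hard one.
    """

    def __init__(self, module_name="sentence_transformers",
                 model_name="all-MiniLM-L6-v2"):
        # Both defaults are assumptions chosen for illustration.
        self.module_name = module_name
        self.model_name = model_name
        self._model = None

    def _load(self):
        # Import only when needed; fail with an actionable message otherwise.
        try:
            module = importlib.import_module(self.module_name)
        except ImportError as exc:
            raise ImportError(
                f"Optional package '{self.module_name}' is not installed. "
                "Install it separately to use this embedding function."
            ) from exc
        return module.SentenceTransformer(self.model_name)

    def __call__(self, texts):
        # Load the model once, then reuse it for later calls.
        if self._model is None:
            self._model = self._load()
        return self._model.encode(list(texts)).tolist()
```

With this shape, constructing the object costs nothing, and users who supply their own embeddings never trigger the import or the model download.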
Is there any workaround for using chromadb with Python 3.11.x? I have a VS Code environment that is working well and I don't want to mess with it (still a newbie). I have been writing text-based AI code using chromadb in Colab, but there are local modes like the microphone and speaker that I need to use.
@Tanzengeist we are prioritizing this and will follow up later today
@jeffchuber Eagerly waiting for the solution. In the meantime, what alternative do you recommend so I can use chromadb in my codebase?
Jeff, I'm sure you're all working hard on this. When you have a workaround, please send up a flare.
#267 removes sentence-transformers, but unfortunately will still not unblock 3.11 as onnxruntime does not yet support it. With major packages like onnx and pytorch not supporting 3.11, it is hard for us to deliver models to users and support 3.11 until these dependencies do :(
Works fine with: ARCHFLAGS="-arch x86_64" pip install chromadb
See if that's of any use.
Reference: https://github.com/Yale-LILY/SummerTime/issues/116#issuecomment-984134322
Any updates on removing sentence-transformers as a hard requirement?
Hi! I'm interested in this solution. Do we have a workaround before this is released?
Hi, the project does not seem to depend heavily on sentence-transformers. Will this dependency be removed from the requirements?
@specter119 yes in two ways.
- the default bundling will be switched to the trimmed down ONNX model https://github.com/chroma-core/chroma/pull/267
- we will ship a client-only build of chroma as a separate pypi project
both very soon
@jeffchuber thx, sentence-transformers brings in a heavy dependency, which causes the Conda build to fail.
- https://github.com/conda-forge/chromadb-feedstock/pull/6
- Failed build
BTW, do the vector-storage features in LangChain depend on both the server and the client of chroma?
Good to know, I'm glad we are removing that.
LangChain by default uses the in-memory version of chroma, which is more of a library than a client or a server.
chroma-client fixed this, I think, for most users: https://pypi.org/project/chromadb-client/