
[bugfix] Fix persisted chromadb vectorstore

Open timothyasp opened this issue 2 years ago • 5 comments

If a persist_directory param was set, chromadb would emit the warning "No embedding_function provided, using default embedding function: SentenceTransformerEmbeddingFunction" and then crash with an "Illegal instruction: 4" error.

This is on a MBP M1 13.2.1, python 3.9.

I'm not entirely sure why that error happened, but when we use get_or_create_collection instead of list_collections on our end, the error and warning go away and chroma works as expected.

As an added bonus, this is cleaner and likely more efficient: list_collections builds a new Collection instance for each collection, and Chroma would then use only the name field to tell whether the collection existed.
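
To illustrate the difference, here is a minimal sketch of the two lookup patterns. `FakeClient` and `FakeCollection` are hypothetical in-memory stand-ins written for this example, not chromadb classes; only the method names mirror the client calls mentioned above, and the instance counter is an assumption used to make the cost visible.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional, Set


@dataclass
class FakeCollection:
    """Hypothetical stand-in for a collection object; not a chromadb class."""
    name: str
    embedding_function: Optional[Callable[..., Any]] = None


@dataclass
class FakeClient:
    """Hypothetical in-memory client that only mirrors the method names."""
    stored: Set[str] = field(default_factory=set)
    instances_built: int = 0  # how many collection objects were constructed

    def _build(self, name: str, embedding_function=None) -> FakeCollection:
        self.instances_built += 1
        return FakeCollection(name, embedding_function)

    def list_collections(self) -> List[FakeCollection]:
        # One object per stored collection, as described in this PR.
        return [self._build(n) for n in sorted(self.stored)]

    def get_collection(self, name: str, embedding_function=None) -> FakeCollection:
        if name not in self.stored:
            raise ValueError(f"collection {name!r} does not exist")
        return self._build(name, embedding_function)

    def create_collection(self, name: str, embedding_function=None) -> FakeCollection:
        self.stored.add(name)
        return self._build(name, embedding_function)

    def get_or_create_collection(self, name: str, embedding_function=None) -> FakeCollection:
        self.stored.add(name)
        return self._build(name, embedding_function)


def open_via_listing(client, name, embedding_function):
    """Old pattern: list everything, compare names, then fetch or create."""
    if name in [c.name for c in client.list_collections()]:
        return client.get_collection(name=name, embedding_function=embedding_function)
    return client.create_collection(name=name, embedding_function=embedding_function)


def open_directly(client, name, embedding_function):
    """Patched pattern: a single call, always given the embedding function."""
    return client.get_or_create_collection(name=name, embedding_function=embedding_function)


if __name__ == "__main__":
    old = FakeClient(stored={"a", "b", "langchain"})
    open_via_listing(old, "langchain", embedding_function=len)
    new = FakeClient(stored={"a", "b", "langchain"})
    open_directly(new, "langchain", embedding_function=len)
    print(old.instances_built, new.instances_built)
```

In this stub, the listing pattern constructs one object per stored collection plus the one actually returned, while the direct call constructs only the object it returns.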

timothyasp avatar Mar 05 '23 04:03 timothyasp

@timothyasp there's been a lot of development on the chroma side. Is this error still happening with recent versions?

hwchase17 avatar Mar 09 '23 04:03 hwchase17

I'm seeing this right now with chromadb==0.3.11 and langchain==0.0.105. So yes, I think it's still happening. And I'm using the persist_directory parameter.

Taytay avatar Mar 09 '23 06:03 Taytay

Yeah, I believe so. I've been running this patched code and pulling from both upstreams pretty much daily, and things have been working.

timothyasp avatar Mar 09 '23 17:03 timothyasp

@hwchase17 any chance we can get this merged into a patch release?

KMontag42 avatar Mar 09 '23 17:03 KMontag42

Tried the fix, looks good to me. Looking forward to a merge in a new release, please; this issue is annoying.

lambda-science avatar Mar 10 '23 00:03 lambda-science

@hwchase17 can we get this merged? It seems to be impacting a lot of people

timothyasp avatar Mar 10 '23 19:03 timothyasp

In the meantime, I forked the original repo and applied the patch (kept up to date with the latest commit). You can replace your langchain dependency with langchain@git+https://github.com/lambda-science/langchain.git in a simple requirements.txt. Not optimal at all, but it does the job for now.

lambda-science avatar Mar 10 '23 20:03 lambda-science

merging in! sorry for delay!

hwchase17 avatar Mar 10 '23 23:03 hwchase17

@hwchase17 I am still getting this error on version 0.0.146

jpiabrantes avatar Apr 21 '23 12:04 jpiabrantes

@hwchase17 Can confirm this is still an issue in 0.0.161

MelchiSalins avatar May 08 '23 09:05 MelchiSalins

@hwchase17 This issue is still there in 0.0.171

DungMinhDao avatar May 17 '23 16:05 DungMinhDao

Do we know whether this error has been resolved? I am getting the following error message: "You must provide embeddings or a function to compute them". How do I patch my current chromadb vectorstore file?

Rish111 avatar Jun 06 '23 10:06 Rish111