[Bug]: {"error":"RuntimeError('Label not found')"}
What happened?
I am retrieving data from my Chroma collection, including embeddings, metadatas, and documents, and the request fails with this error. I don't know what the error means. The collection was created with version 0.4.22 and lives on a Docker volume. I have now started a different Docker container (version 0.5.4) that uses the same Docker volume.
Versions
Chroma 0.5.4, Chroma 0.4.22
Relevant log output
No response
To clarify, there is only one Docker container (v0.5.4), and it uses a Docker volume created with Chroma 0.4.22. I have around 10-15 collections. I can fetch data from most of them; only some collections throw this error.
I need a workaround, as I no longer have access to the ingested documents or the embedding model.
@AlokRanjanSwain, that is an interesting bug. What the error tells me is that your binary index does not agree with the contents of its metadata. I would be curious to find out how that happened, but in the interest of time I'll share a way to work around this error, which involves rebuilding your collection from the WAL.
1. ⚠️ Start by backing up your Chroma persistent dir
2. Find out which subdir of the persistent dir holds the binary segment. Connect to your container and run:
apt update && apt install sqlite3
sqlite3 /chroma/chroma/chroma.sqlite3 "select s.id from collections c left join segments s on c.id = s.collection where c.name='<your_collection_name>' and s.scope='VECTOR';"
Pick the UUID that is printed out from the above and exit the shell
3. Stop Chroma
4. Navigate to your Chroma persistent dir (I assume you have it mounted with -v ./local_chromadata:/chroma/chroma)
5. Remove the dir named after the UUID from step 2
6. Restart Chroma
7. Access your collection so that Chroma can rebuild it (collection.get(limit=1,include=["embeddings"]))
Again, emphasizing the backup part!!
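If you would rather script the filesystem side of this (steps 1, 2 and 5), here is a minimal Python sketch. It assumes it runs on the host with Chroma stopped, that the persistent dir is the mounted ./local_chromadata from step 4, and that the backup location is outside that dir. The function names are made up for illustration; the query is the same one as in step 2.

```python
import shutil
import sqlite3
from pathlib import Path


def find_vector_segment_ids(sqlite_path, collection_name):
    """Return the ids of the VECTOR-scoped segments of one collection."""
    # Same query as in step 2, with the collection name bound as a parameter.
    query = (
        "select s.id from collections c "
        "left join segments s on c.id = s.collection "
        "where c.name = ? and s.scope = 'VECTOR'"
    )
    conn = sqlite3.connect(str(sqlite_path))
    try:
        return [row[0] for row in conn.execute(query, (collection_name,))]
    finally:
        conn.close()


def backup_and_remove_segment(persist_dir, segment_id, backup_dir):
    """Copy the whole persistent dir, then delete one segment subdir.

    Returns True if the segment dir existed and was removed.
    backup_dir must not exist yet and must live outside persist_dir.
    """
    persist_dir = Path(persist_dir)
    # Step 1: full backup first; copytree raises if backup_dir already exists,
    # so an earlier backup is never silently overwritten.
    shutil.copytree(persist_dir, Path(backup_dir))
    # Step 5: remove only the UUID-named subdir of the vector segment.
    segment_dir = persist_dir / segment_id
    if segment_dir.is_dir():
        shutil.rmtree(segment_dir)
        return True
    return False
```

For example, find_vector_segment_ids("./local_chromadata/chroma.sqlite3", "<your_collection_name>") should print the same UUID as the sqlite3 command, and that UUID can then be passed to backup_and_remove_segment together with a backup path of your choice.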
Regarding the root cause of the issue, do you have any sensitive data in the faulty collection and can you share it for further debugging? It doesn't have to be here on GH; you can DM me (@taz) in Discord (https://discord.com/invite/MMeYNTmh3x)
If I remove the dir in step 5, won't all my data (documents, embeddings, etc.) in that collection be lost?
The data is quite sensitive, so I am unable to share it, sorry about that.
@AlokRanjanSwain, this is why we have step 1: back up the whole persistent dir, which also includes the UUID-named subdirs. The UUID-named subdirs hold only the HNSW index, and Chroma <=0.5.4 has a WAL (write-ahead log) which allows all index updates to be replayed and thus the index to be reconstructed.
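To check that the collection is readable again after the restart, something along these lines could be run from a Python client. This is only a sketch: the host, port, and helper names are assumptions for illustration, and it relies on nothing beyond the standard count() and get() calls on a collection.

```python
def trigger_rebuild(collection):
    """Access the collection once, as in step 7, and report what came back.

    Returns (total_records, records_in_sample).
    """
    # Sanity check: how many records the collection reports.
    total = collection.count()
    # Same call as in step 7; requesting embeddings touches the vector segment.
    sample = collection.get(limit=1, include=["embeddings"])
    return total, len(sample["ids"])


def main():
    # Imported here so the helper above can be reused without chromadb installed.
    import chromadb

    # Assumption: Chroma is reachable on localhost:8000; adjust to your setup.
    client = chromadb.HttpClient(host="localhost", port=8000)
    collection = client.get_collection("<your_collection_name>")
    print(trigger_rebuild(collection))
```

Calling main() after step 6 should print the record count and a sample size of 1 if the collection is non-empty; if the 'Label not found' error shows up again, restore the backup from step 1.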
So do I need to copy the deleted dir back into the persistent volume after Chroma has recreated the index?
cc @tazarov
@AlokRanjanSwain did you ever resolve this?
Closing due to inactivity for some time. @AlokRanjanSwain if you still run into this with Chroma v0.6.0 or later, please feel free to open a new issue! We will need as much information as possible to reproduce the error and investigate it.