chroma
chroma copied to clipboard
[Bug]: metadata filter does not work over 20 millions chunk.
What happened?
I have more than 20 millions records on the chromadb. The chroma is deployed as a server on docker.
While executing below query
query_params = {"query_embeddings": query_embeddings, "n_results": 10} results = collection.query(**query_params) results
The results are returned quickly from chroma db. But as soon as i add where clause below where_clause = { "subscription_id": {"$in": ["690f2619-2a02-4a2f-a09a-69544239455e"]} } query_params = {"query_embeddings": query_embeddings, "n_results": 10} query_params["where"] = where_clause results = collection.query(**query_params) results
The code hangs and does not returned. The metadata does have this key and is able to retrieve on smaller dataset. But for dataset more than 20 millions it does not return.
Versions
Aws linux - t2.2xlarge Docker chroma version - 0.5.20
Relevant log output