chroma
[Feature Request]: Fallback to `get` if no `query`
Describe the problem
from a DM with @vincentteyssier
An auto-retriever basically calls ChatGPT and gets back a structured output that helps form the query sent to Chroma:
{"query": "", "filters": [{"key": "location", "value": "Singapore"}], "top_k": 2}
However, there are instances where the query string returned by ChatGPT is empty.
Out of the box this produces an error, because the chroma package validates that query_texts and query_embeddings cannot both be null for the query method.
To get the desired result, one must use the get method instead of query.
It's not difficult to test whether the query is empty and select the right method, but in frameworks like langchain or llamaindex, given how deeply the classes are nested, it becomes quite tricky.
If one could instead use the query method with a wildcard query_texts value, that would be much simpler than determining which method to use, especially since the get method has no n_results limiter. Something like this would be much better to use:
results = self._collection.query(
    query_texts='*',
    n_results=query.similarity_top_k,
    where=where,
    **kwargs,
)
Describe the proposed solution
It's an interesting idea: if no query vectors are passed, the call would fall back to get(). I don't think we should do this anytime soon, but it could be an interesting addition to a query pre/post-processing pipeline.
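Until something like that exists, the fallback can be approximated on the caller side. The following is only a rough sketch: `query_or_get` is a hypothetical helper, not part of chroma, and it assumes that `get()` accepts a `limit` argument in the installed version (worth checking before relying on it).

```python
from typing import Any, Optional


def query_or_get(
    collection: Any,
    query_text: Optional[str],
    top_k: int,
    where: Optional[dict] = None,
) -> Any:
    """Hypothetical helper: call query() when there is text, else fall back to get()."""
    if query_text and query_text.strip():
        # Normal path: similarity search capped at top_k results.
        return collection.query(
            query_texts=[query_text],
            n_results=top_k,
            where=where,
        )
    # Fallback path: no query text, so only the metadata filter applies.
    # Assumption: get() supports `limit` in the installed chroma version.
    return collection.get(where=where, limit=top_k)
```

With the structured output from the description, this would be called as `query_or_get(collection, "", 2, {"location": "Singapore"})` and would take the `get()` branch. Note that the two methods return differently shaped results (`query()` nests results per query text, `get()` returns flat lists), so the caller still has to normalize the output.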
Alternatives considered
No response
Importance
nice to have
Additional Information
No response