[Feature Request]: Fallback to `get` if no `query`

Open jeffchuber opened this issue 1 year ago • 0 comments

Describe the problem

From a DM with @vincentteyssier:

An auto-retriever essentially calls ChatGPT and gets back a structured output that helps form the query to Chroma:
{"query": "", "filters": [{"key": "location", "value": "Singapore"}], "top_k": 2}
However, there are cases where the query string returned by ChatGPT is empty.
Out of the box this produces an error, because the chroma package validates that `query_texts` and `query_embeddings` cannot both be null for the `query` method.
To get the desired result, one must use the `get` method instead of `query`.
It's not difficult to test for an empty query and select the right method, but in frameworks like LangChain or LlamaIndex, given the nesting of classes, it becomes quite tricky.
If one could instead use the `query` method with a wildcard `query_texts`, that would be much simpler than determining which method to use, especially since the `get` method has no `n_results` limiter. Something like this would be much better to use:
results = self._collection.query(
    query_texts='*',
    n_results=query.similarity_top_k,
    where=where,
    **kwargs,
)
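
For context, here is a minimal sketch of how the structured output quoted above might be turned into the `where` and `n_results` values used in that call. The helper names `build_where` and `parse_retriever_output` and the default `top_k` of 10 are assumptions made for illustration; they are not part of Chroma or of any framework. Combining several filters with `$and` follows Chroma's metadata filter syntax.

```python
import json
from typing import Any, Optional


def build_where(filters: list) -> Optional[dict]:
    """Turn [{"key": ..., "value": ...}, ...] into a Chroma-style `where` dict."""
    clauses = [{f["key"]: f["value"]} for f in filters]
    if not clauses:
        return None  # no metadata filter at all
    if len(clauses) == 1:
        return clauses[0]  # e.g. {"location": "Singapore"}
    return {"$and": clauses}  # several filters are combined with $and


def parse_retriever_output(raw: str) -> dict:
    """Parse the LLM's JSON and normalise the fields used for retrieval."""
    data: dict = json.loads(raw)
    query = (data.get("query") or "").strip()
    return {
        "query": query or None,  # None marks "no usable query text"
        "where": build_where(data.get("filters") or []),
        "top_k": int(data.get("top_k") or 10),  # default of 10 is an assumption
    }


raw = '{"query": "", "filters": [{"key": "location", "value": "Singapore"}], "top_k": 2}'
parsed = parse_retriever_output(raw)
print(parsed)
# {'query': None, 'where': {'location': 'Singapore'}, 'top_k': 2}
```

With the example payload the query comes back as `None`, which is exactly the case where calling `query` fails validation.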

Describe the proposed solution

It's an interesting idea: if no query texts or query vectors are passed, `query` would fall back to `get()`. I don't think we should do this soon, but it could be an interesting addition to a query pre/post-processing pipeline.
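
Until something like this exists in Chroma itself, the fallback could be approximated on the caller side. The sketch below is not Chroma's implementation: `QueryWithGetFallback` is a hypothetical proxy class, and it assumes the wrapped collection's `get` accepts `where` and `limit` and returns flat `ids`, `documents`, and `metadatas` lists. Because the proxy exposes `query`, it can be handed to a framework in place of the collection, so the framework's own code does not need to choose between the two methods.

```python
from typing import Any, Optional


class QueryWithGetFallback:
    """Hypothetical proxy: acts like a collection, but `query` falls back to `get`."""

    def __init__(self, collection: Any) -> None:
        self._collection = collection

    def __getattr__(self, name: str) -> Any:
        # Every attribute not defined here is delegated to the wrapped collection.
        return getattr(self._collection, name)

    def query(self, query_texts=None, query_embeddings=None, n_results: int = 10,
              where: Optional[dict] = None, **kwargs: Any) -> dict:
        if query_embeddings is not None:
            return self._collection.query(
                query_embeddings=query_embeddings, n_results=n_results,
                where=where, **kwargs)

        # Accept a single string or a list, and drop empty or blank entries.
        texts = [query_texts] if isinstance(query_texts, str) else list(query_texts or [])
        texts = [t for t in texts if t and t.strip()]
        if texts:
            return self._collection.query(
                query_texts=texts, n_results=n_results, where=where, **kwargs)

        # Nothing to rank by: plain metadata lookup, capped with `limit`.
        # Extra kwargs are ignored here, and no distances are available.
        got = self._collection.get(where=where, limit=n_results)
        # Wrap each field in a list to mimic query()'s one-list-per-query shape.
        return {key: [got[key]]
                for key in ("ids", "documents", "metadatas")
                if got.get(key) is not None}
```

Results from the fallback path are unranked, so callers that rely on similarity order or on `distances` would still need to handle that case explicitly.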

Alternatives considered

No response

Importance

nice to have

Additional Information

No response

jeffchuber, Nov 08 '23 16:11