langchain
SelfQueryRetriever: invalid operators/comparators
Hi,
I've been playing with the SelfQueryRetriever examples but am having a few issues with allowed operators and valid Comparator/s.
Example 2: This example only specifies a filter
retriever.get_relevant_documents("I want to watch a movie rated higher than 8.5")
This builds the following query:
query=' ' filter=Comparison(comparator=<Comparator.GT: 'gt'>, attribute='rating', value=8.5)
But it fails with:
HTTP response body: {"code":3,"message":"$Comparator.GT is not a valid operator","details":[]}
For other examples, I see:
HTTP response body: {"code":3,"message":"only logical operators as $or and $and are allowed at top level, got $Operator.AND","details":[]}
I'm using the following on MacOS:
- Python 3.11.3
- langchain 0.0.152
- lark 1.1.5
With a Pinecone index:
- Environment: us-central1-gcp
- Metric: cosine
- Pod Type: p1.x1
- Dimensions: 1536
This will be a killer search feature, so I'd be very grateful if anybody is able to shed some light on this.
Thanks.
hm seems like an enum isn't properly being converted to a string. i can't reproduce locally with python 3.9, wonder if enum str conversion has changed in newer versions. taking a closer look now
able to recreate with python 3.11
#3892 should fix but let me know if you still see issues (after next release)
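The enum-to-string behaviour described above can be reproduced without LangChain. The sketch below uses a stand-in `Comparator` class (an assumption modelled on the `<Comparator.GT: 'gt'>` repr in the error output, not the library's actual source) and a hypothetical helper `to_filter_key`. It shows why formatting an enum member directly depends on the Python version, while reading `.value` does not.

```python
from enum import Enum


class Comparator(str, Enum):
    """Stand-in for the library's enum; for illustration only."""

    GT = "gt"
    LT = "lt"


def to_filter_key(comparator: Comparator) -> str:
    # .value is the underlying string on every Python version.
    return f"${comparator.value}"


# Formatting the member itself is version-dependent for mixed-in enums:
# "$gt" on Python 3.9, "$Comparator.GT" on Python 3.11 (as reported above).
print(f"${Comparator.GT}")

# Always "$gt".
print(to_filter_key(Comparator.GT))
```

If a filter key is built by interpolating the enum member, the second form is the safer one to rely on across interpreter versions.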
I am using [email protected]. I am still seeing the same issue.
Following up as well using langchain 0.0.320. Since I do not have access to openai I am using meta-llama/Llama-2-13b-chat-hf
Hi, I'm seeing a similar problem with the Chroma vector store. Please help with this.
ValueError: Received disallowed comparator Comparator.IN. Allowed comparators are [<Comparator.EQ: 'eq'>, <Comparator.NE: 'ne'>, <Comparator.GT: 'gt'>, <Comparator.GTE: 'gte'>, <Comparator.LT: 'lt'>, <Comparator.LTE: 'lte'>]
Have you managed to solve this issue? How did you overcome it?
I'm having the same issue in ChromaDB, anyone managed to fix this?
ValueError: Received disallowed comparator Comparator.IN. Allowed comparators are [<Comparator.EQ: 'eq'>, <Comparator.NE: 'ne'>, <Comparator.GT: 'gt'>, <Comparator.GTE: 'gte'>, <Comparator.LT: 'lt'>, <Comparator.LTE: 'lte'>]
I started using elasticsearch that supports Comparator.IN
Right, saw that ChromaDB doesn't allow the IN operator. For those who want to use ChromaDB regardless, you can still get the self-query to work by manually constructing your query constructor prompt. Instead of using SelfQueryRetriever.from_llm to construct your retriever, use the following code (you can find similar info here).
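from langchain.chains.query_constructor.base import (
    StructuredQueryOutputParser,
    get_query_constructor_prompt,
)
from langchain.chains.query_constructor.ir import Comparator
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain.retrievers.self_query.chroma import ChromaTranslator
# In older releases: from langchain.chat_models import ChatOpenAI
from langchain_openai import ChatOpenAI

# metadata_field_info and vectorstore (a Chroma instance) are assumed to be
# defined already, as in the SelfQueryRetriever examples.
document_content_description = "Brief summary of a movie"

# Define allowed comparators list. Enum members are used instead of raw
# "$eq"-style strings because get_query_constructor_prompt expects
# Comparator values. IN is left out so the prompt never offers it.
allowed_comparators = [
    Comparator.EQ,   # Equal to (number, string, boolean)
    Comparator.NE,   # Not equal to (number, string, boolean)
    Comparator.GT,   # Greater than (number)
    Comparator.GTE,  # Greater than or equal to (number)
    Comparator.LT,   # Less than (number)
    Comparator.LTE,  # Less than or equal to (number)
]

constructor_prompt = get_query_constructor_prompt(
    document_content_description,
    metadata_field_info,
    allowed_comparators=allowed_comparators,
)

query_model = ChatOpenAI(
    # model='gpt-3.5-turbo-0125',
    model='gpt-4-0125-preview',
    temperature=0,
    streaming=True,
)

output_parser = StructuredQueryOutputParser.from_components()

# Prompt -> LLM -> parser, producing a structured query
query_constructor = constructor_prompt | query_model | output_parser

retriever = SelfQueryRetriever(
    query_constructor=query_constructor,
    vectorstore=vectorstore,
    structured_query_translator=ChromaTranslator(),
)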
Seems to me that the documentation should be updated to reflect the importance of the allowed comparators for different vector stores.
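To illustrate why the allowed comparators matter per vector store, here is a minimal sketch of an allow-list check that raises the kind of error quoted in this thread. The `Comparator` class, `CHROMA_ALLOWED` tuple, and `validate_comparator` function are all hypothetical stand-ins written for this example. The allow-list is copied from the Chroma error message above, not from the library source.

```python
from enum import Enum
from typing import Sequence


class Comparator(str, Enum):
    """Stand-in enum; for illustration only."""

    EQ = "eq"
    NE = "ne"
    GT = "gt"
    GTE = "gte"
    LT = "lt"
    LTE = "lte"
    IN = "in"


# Allow-list taken from the Chroma error message in this thread.
CHROMA_ALLOWED = (
    Comparator.EQ,
    Comparator.NE,
    Comparator.GT,
    Comparator.GTE,
    Comparator.LT,
    Comparator.LTE,
)


def validate_comparator(
    comparator: Comparator, allowed: Sequence[Comparator]
) -> Comparator:
    # Reject anything the target store cannot translate into a filter.
    if comparator not in allowed:
        names = [c.name for c in allowed]
        raise ValueError(
            f"Received disallowed comparator {comparator.name}. "
            f"Allowed comparators are {names}"
        )
    return comparator


validate_comparator(Comparator.GT, CHROMA_ALLOWED)  # passes
try:
    validate_comparator(Comparator.IN, CHROMA_ALLOWED)
except ValueError as err:
    print(err)
```

The point of passing the same allow-list to the prompt is that the LLM is never told about a comparator that a check like this would later reject.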
Just add this
"Do not use the word 'contains' or 'contain' as filters."
in the document_content_description
In my case, just add this
"Do not use the words 'IN' or 'in' or 'Not IN' as filters."
in the document_content_description. It works like a charm!
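The prompt-hint workaround from the two comments above could be assembled as follows. This is only a sketch: `with_filter_hint` is a hypothetical helper, and the wording of the hint is taken from the comments. A hint in the prompt only steers the model; it does not guarantee that a disallowed comparator is never generated.

```python
from typing import List


def with_filter_hint(description: str, disallowed_words: List[str]) -> str:
    # Quote each word and join with "or", matching the phrasing used above.
    quoted = " or ".join(f"'{word}'" for word in disallowed_words)
    return f"{description.rstrip('.')}. Do not use the words {quoted} as filters."


document_content_description = with_filter_hint(
    "Brief summary of a movie", ["IN", "in", "Not IN"]
)
print(document_content_description)
```

The resulting string is then passed wherever `document_content_description` was used before.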
contain also doesn't work
ValueError: Received disallowed comparator contain. Allowed comparators are [<Comparator.EQ: 'eq'>, <Comparator.NE: 'ne'>, <Comparator.GT: 'gt'>, <Comparator.GTE: 'gte'>, <Comparator.LT: 'lt'>, <Comparator.LTE: 'lte'>]