
SelfQueryRetriever: invalid operators/comparators

Open peter-brady opened this issue 2 years ago • 2 comments

Hi,

I've been playing with the SelfQueryRetriever examples but am running into a few issues with allowed operators and valid comparators.

Example 2: This example only specifies a filter: retriever.get_relevant_documents("I want to watch a movie rated higher than 8.5")

This builds the following query: query=' ' filter=Comparison(comparator=<Comparator.GT: 'gt'>, attribute='rating', value=8.5)

But it fails with: HTTP response body: {"code":3,"message":"$Comparator.GT is not a valid operator","details":[]}

For other examples, I see: HTTP response body: {"code":3,"message":"only logical operators as $or and $and are allowed at top level, got $Operator.AND","details":[]}

I'm using the following on MacOS:

  • Python 3.11.3
  • langchain 0.0.152
  • lark 1.1.5

With a Pinecone index:

  • Environment: us-central1-gcp
  • Metric: cosine
  • Pod Type: p1.x1
  • Dimensions: 1536

This will be a killer search feature, so I'd be very grateful if anybody is able to shed some light on this.

Thanks.

peter-brady avatar Apr 29 '23 14:04 peter-brady

hm seems like an enum isn't properly being converted to a string. i can't reproduce locally with python 3.9, wonder if enum str conversion has changed in newer versions. taking a closer look now

dev2049 avatar May 01 '23 16:05 dev2049

able to recreate with python 3.11

dev2049 avatar May 01 '23 16:05 dev2049
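The error text ("$Comparator.GT") is consistent with a change in Python 3.11: format() and f-strings on an enum that mixes in str now produce the same result as str(), i.e. the member name, where earlier versions produced the value. A minimal sketch of a version-independent way to build the operator key follows. The Comparator class here is a stand-in, not langchain's own code, and to_filter_key is a hypothetical helper introduced only for illustration.

```python
from enum import Enum


class Comparator(str, Enum):
    """Stand-in for a str-mixin enum like langchain's Comparator (illustrative only)."""

    GT = "gt"
    LT = "lt"


def to_filter_key(member: Enum) -> str:
    """Build a '$'-prefixed operator key from the enum member's value.

    Reading .value explicitly avoids relying on str()/format(), whose
    output for enums with a mixed-in str type differs across Python versions.
    """
    return f"${member.value}"


print(to_filter_key(Comparator.GT))
```

Interpolating the member directly (f"${Comparator.GT}") is the pattern that behaves differently between 3.9 and 3.11, so it is avoided here.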

#3892 should fix but let me know if you still see issues (after next release)

dev2049 avatar May 01 '23 22:05 dev2049

I am using [email protected]. I am still seeing the same issue.

sachins-eng avatar Sep 26 '23 03:09 sachins-eng

Following up as well using langchain 0.0.320. Since I do not have access to openai I am using meta-llama/Llama-2-13b-chat-hf image

analyticanna avatar Oct 23 '23 17:10 analyticanna

Hi, I'm seeing a similar problem with the Chroma vectorstore. Please help with this.

ValueError: Received disallowed comparator Comparator.IN. Allowed comparators are [<Comparator.EQ: 'eq'>, <Comparator.NE: 'ne'>, <Comparator.GT: 'gt'>, <Comparator.GTE: 'gte'>, <Comparator.LT: 'lt'>, <Comparator.LTE: 'lte'>]

kuabhish avatar Feb 05 '24 12:02 kuabhish

> hi i m seeing similar problem for chroma vectorstore. please help. with this.
>
> ValueError: Received disallowed comparator Comparator.IN. Allowed comparators are [<Comparator.EQ: 'eq'>, <Comparator.NE: 'ne'>, <Comparator.GT: 'gt'>, <Comparator.GTE: 'gte'>, <Comparator.LT: 'lt'>, <Comparator.LTE: 'lte'>]

Have you managed to solve this issue? How did you overcome it?

ikitaev avatar Feb 20 '24 12:02 ikitaev

I'm having the same issue in ChromaDB, anyone managed to fix this?

ValueError: Received disallowed comparator Comparator.IN. Allowed comparators are [<Comparator.EQ: 'eq'>, <Comparator.NE: 'ne'>, <Comparator.GT: 'gt'>, <Comparator.GTE: 'gte'>, <Comparator.LT: 'lt'>, <Comparator.LTE: 'lte'>]

EdIzaguirre avatar Mar 21 '24 23:03 EdIzaguirre

> I'm having the same issue in ChromaDB, anyone managed to fix this?
>
> ValueError: Received disallowed comparator Comparator.IN. Allowed comparators are [<Comparator.EQ: 'eq'>, <Comparator.NE: 'ne'>, <Comparator.GT: 'gt'>, <Comparator.GTE: 'gte'>, <Comparator.LT: 'lt'>, <Comparator.LTE: 'lte'>]

I started using elasticsearch that supports Comparator.IN

ikitaev avatar Mar 22 '24 02:03 ikitaev
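For anyone who wants to stay on Chroma: an `in` comparison is logically equivalent to an `or` of `eq` comparisons, and both `or` and `eq` are accepted there. The sketch below shows that rewrite on plain stand-in dataclasses. Comparison, Operation, and expand_in are hypothetical names used for illustration and are not langchain's classes; wiring this into a real retriever is left as an assumption.

```python
from dataclasses import dataclass
from typing import Any, List, Union


@dataclass
class Comparison:
    """Stand-in for a single filter comparison, e.g. in(genre, [...])."""

    comparator: str
    attribute: str
    value: Any


@dataclass
class Operation:
    """Stand-in for a logical operation over several filters."""

    operator: str
    arguments: List["Filter"]


Filter = Union[Comparison, Operation]


def expand_in(node: Filter) -> Filter:
    """Rewrite every non-empty `in` comparison as an `or` of `eq` comparisons."""
    if isinstance(node, Operation):
        # Recurse so nested filters are rewritten as well.
        return Operation(node.operator, [expand_in(arg) for arg in node.arguments])
    if node.comparator == "in" and isinstance(node.value, (list, tuple)) and node.value:
        parts = [Comparison("eq", node.attribute, v) for v in node.value]
        # A single-element list needs no `or` wrapper.
        return parts[0] if len(parts) == 1 else Operation("or", parts)
    # Any other comparison (and an empty `in`) is returned unchanged.
    return node


print(expand_in(Comparison("in", "genre", ["comedy", "drama"])))
```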

Right, saw that ChromaDB doesn't allow the IN operator. For those who want to use ChromaDB regardless, you can still get the self-query to work by manually constructing your query constructor prompt. Instead of using SelfQueryRetriever.from_llm to construct your retriever, use the following code (you can find similar info here).

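from langchain.chains.query_constructor.base import (
    StructuredQueryOutputParser,
    get_query_constructor_prompt,
)
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain_openai import ChatOpenAI  # or langchain.chat_models, depending on your version

# Assumes metadata_field_info (your list of AttributeInfo) and vectorstore
# (your Chroma instance) are already defined, as in the self-query examples.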

document_content_description = "Brief summary of a movie"

# Define allowed comparators list
allowed_comparators = [
    "$eq",  # Equal to (number, string, boolean)
    "$ne",  # Not equal to (number, string, boolean)
    "$gt",  # Greater than (number)
    "$gte",  # Greater than or equal to (number)
    "$lt",  # Less than (number)
    "$lte",  # Less than or equal to (number)
]

constructor_prompt = get_query_constructor_prompt(
    document_content_description,
    metadata_field_info,
    allowed_comparators=allowed_comparators,
)


query_model = ChatOpenAI(
    # model='gpt-3.5-turbo-0125',
    model='gpt-4-0125-preview',
    temperature=0,
    streaming=True,
)

output_parser = StructuredQueryOutputParser.from_components()
query_constructor = constructor_prompt | query_model | output_parser

from langchain.retrievers.self_query.chroma import ChromaTranslator

retriever = SelfQueryRetriever(
    query_constructor=query_constructor,
    vectorstore=vectorstore,
    structured_query_translator=ChromaTranslator(),
)

Seems to me that the documentation should be updated to reflect the importance of the allowed comparators for different vector stores.

EdIzaguirre avatar Mar 22 '24 18:03 EdIzaguirre

Just add this

"Do not use the word 'contains' or 'contain' as filters." 

in the document_content_description

bilaliba avatar May 07 '24 04:05 bilaliba

In my case, just add this "Do not use the words 'IN' or 'in' or 'Not IN' as filters." in the document_content_description. It works like a charm!

swiftwind16 avatar Jul 23 '24 03:07 swiftwind16
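The prompt hint from the two comments above can be appended programmatically so it stays in sync with whichever comparator words you want the model to avoid. This is a minimal sketch; with_comparator_hint is a hypothetical helper, and the hint only nudges the LLM rather than enforcing anything.

```python
from typing import Iterable


def with_comparator_hint(description: str, disallowed: Iterable[str]) -> str:
    """Append a 'do not use ...' sentence to a document content description."""
    words = " or ".join(f"'{w}'" for w in disallowed)
    # Strip any trailing period or space so the result has exactly one separator.
    return f"{description.rstrip('. ')}. Do not use the words {words} as filters."


document_content_description = with_comparator_hint(
    "Brief summary of a movie", ["in", "contain"]
)
print(document_content_description)
```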

contain also doesn't work

ValueError: Received disallowed comparator contain. Allowed comparators are [<Comparator.EQ: 'eq'>, <Comparator.NE: 'ne'>, <Comparator.GT: 'gt'>, <Comparator.GTE: 'gte'>, <Comparator.LT: 'lt'>, <Comparator.LTE: 'lte'>]

codenamics avatar Aug 05 '24 21:08 codenamics