RetrievalQA.from_chain_type's retriever parameter cannot use ContextualCompressionRetriever
System Info
Based on the official docs, I created two types of retrievers:
- `faiss_retriever = FAISS.from_documents(texts, OpenAIEmbeddings()).as_retriever()`, serving as a `VectorStoreRetriever` (referenced from the API doc)
- `compression_retriever = ContextualCompressionRetriever(base_compressor=relevant_filter, base_retriever=retriever)`, functioning as a `ContextualCompressionRetriever` (also referenced from the API doc)
Then I ran RetrievalQA to retrieve the relevant content through the chain with the code below:
```python
qa = RetrievalQA.from_chain_type(llm=OpenAI(verbose=True), chain_type="stuff", retriever=compression_retriever, return_source_documents=True, chain_type_kwargs=chain_type_kwargs)
# or
qa = RetrievalQA.from_chain_type(llm=OpenAI(verbose=True), chain_type="stuff", retriever=faiss_retriever, return_source_documents=True, chain_type_kwargs=chain_type_kwargs)
```
The result: the qa built with compression_retriever failed to return any context for the prompt (it returned an empty array), whereas the qa built with faiss_retriever returned the context successfully.
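One way to narrow this down is to call both retrievers directly, outside the chain, and compare the results. A minimal sketch, assuming the objects defined above and a test question of your own:

```python
# Sketch: compare the two retrievers outside of RetrievalQA.
# Assumes faiss_retriever and compression_retriever from above.
query = "your test question"
print(len(faiss_retriever.get_relevant_documents(query)))        # base retriever hits
print(len(compression_retriever.get_relevant_documents(query)))  # hits surviving the filter
```

If the second call already returns `[]`, the filter is dropping everything; if it returns documents, the problem lies in how the chain invokes the retriever.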
Who can help?
@hwchase17 @eyurtsev
Information
- [ ] The official example notebooks/scripts
- [X] My own modified scripts
Related Components
- [ ] LLMs/Chat Models
- [ ] Embedding Models
- [ ] Prompts / Prompt Templates / Prompt Selectors
- [ ] Output Parsers
- [ ] Document Loaders
- [X] Vector Stores / Retrievers
- [ ] Memory
- [ ] Agents / Agent Executors
- [ ] Tools / Toolkits
- [ ] Chains
- [ ] Callbacks/Tracing
- [ ] Async
Reproduction
```python
from langchain.chains import RetrievalQA
from langchain.document_loaders import UnstructuredFileLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import EmbeddingsFilter
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS

loader = UnstructuredFileLoader("NER.txt")
document = loader.load()
separators = ["。", " "]
text_splitter = RecursiveCharacterTextSplitter(separators=separators, chunk_size=500, chunk_overlap=0)
texts = text_splitter.split_documents(document)

embeddings = OpenAIEmbeddings()
relevant_filter = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.81)
retriever = FAISS.from_documents(texts, OpenAIEmbeddings()).as_retriever()  # base retriever
compression_retriever = ContextualCompressionRetriever(base_compressor=relevant_filter, base_retriever=retriever)  # document compression retriever

prompt_template1 = """Please use the context to answer the question.
{context}
question: {question}
answer:"""
PROMPT = PromptTemplate(
    template=prompt_template1, input_variables=["context", "question"]
)
chain_type_kwargs = {"prompt": PROMPT, "verbose": True}

qa = RetrievalQA.from_chain_type(llm=OpenAI(verbose=True), chain_type="stuff", retriever=compression_retriever, return_source_documents=True, chain_type_kwargs=chain_type_kwargs)
query = "balabalabala"  # replace it with your question
result = qa({"query": query})
print(result)
```
Expected behavior
When using `ContextualCompressionRetriever` with `RetrievalQA`, the chain should return a non-empty context, i.e. not `[]`.
Answer generated by a 🤖
Answer
I understand that you're experiencing an issue where the `ContextualCompressionRetriever` is returning an empty array when used with the `RetrievalQA.from_chain_type` method. This is happening despite the fact that when you use `FAISS` as a `VectorStoreRetriever`, you're able to retrieve the context successfully.
Based on the information you've provided, it seems that the `ContextualCompressionRetriever` is not able to find any documents that meet the `similarity_threshold` of 0.81 you've set in the `EmbeddingsFilter`. This could be due to the specific texts and query in your dataset.
I suggest experimenting with different `similarity_threshold` values and different base retrievers. For instance, you could try using a lower `similarity_threshold` value, like 0.75, and see if that returns any results. Here's how you can modify your `EmbeddingsFilter` and base retriever:
```python
# Using FAISS as base retriever
base_retriever = FAISS.from_documents(texts, OpenAIEmbeddings()).as_retriever()

# Using a lower similarity_threshold
relevant_filter = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.75)
compression_retriever = ContextualCompressionRetriever(base_compressor=relevant_filter, base_retriever=base_retriever)
```
Please note that the `similarity_threshold` and the choice of base retriever are hyperparameters that you might need to tune based on your specific use case and dataset.
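For instance, a quick sweep over candidate thresholds (a hypothetical tuning loop reusing the names above; `query` stands in for your own test question) might look like:

```python
# Sketch: sweep similarity_threshold and count surviving documents.
for threshold in (0.60, 0.65, 0.70, 0.75, 0.80, 0.81):
    f = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=threshold)
    r = ContextualCompressionRetriever(base_compressor=f, base_retriever=base_retriever)
    print(threshold, len(r.get_relevant_documents(query)))
```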
If you find that this solution works for you and you believe it's a bug that could impact other users, you're welcome to make a pull request with your findings. This could be a valuable contribution to the LangChain community.
Sources
- docs/snippets/modules/data_connection/retrievers/contextual_compression/get_started.mdx
- tests/integration_tests/retrievers/test_contextual_compression.py
This response is meant to be useful, save you time, and share context. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
Thanks, bot, but this does not work.
I have tried using `similarity_threshold = 0.81` to retrieve content directly with the code below, and got a non-empty list as a result.
```python
retriever = FAISS.from_documents(texts, OpenAIEmbeddings()).as_retriever()
compression_retriever = ContextualCompressionRetriever(base_compressor=relevant_filter, base_retriever=retriever)
compressed_docs = compression_retriever.get_relevant_documents("replace question with your own")
pretty_print_docs(compressed_docs)
```
So, any ideas?
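One thing worth checking is whether the chain passes the same query string to the retriever that was used in the direct call. A debugging sketch, assuming a pydantic-style `BaseRetriever` (newer LangChain versions; older ones override `get_relevant_documents` directly, and the `LoggingRetriever` name is hypothetical):

```python
from typing import List

from langchain.callbacks.manager import CallbackManagerForRetrieverRun
from langchain.schema import BaseRetriever, Document


class LoggingRetriever(BaseRetriever):
    """Hypothetical wrapper that logs what RetrievalQA actually sends."""
    inner: BaseRetriever

    def _get_relevant_documents(
        self, query: str, *, run_manager: CallbackManagerForRetrieverRun
    ) -> List[Document]:
        print(f"query passed to retriever: {query!r}")
        docs = self.inner.get_relevant_documents(query)
        print(f"retriever returned {len(docs)} documents")
        return docs


qa = RetrievalQA.from_chain_type(
    llm=OpenAI(verbose=True),
    chain_type="stuff",
    retriever=LoggingRetriever(inner=compression_retriever),
    return_source_documents=True,
    chain_type_kwargs=chain_type_kwargs,
)
```

If the logged query matches the direct call but the count is zero, the discrepancy is inside the compression step; if the count is non-zero, the documents are being lost later in the chain.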
Has anyone else faced this issue? I tried forking the code and re-implementing parts of it, but it failed. I need some help here.
Hey, I haven't faced this issue yet, haha, but my goal was to reduce the number of documents retrieved by using compression. I tried one of the first examples, which used `LLMChainExtractor`, and it works well: it returned the exact document out of 6. But adding that step to the whole process increased the number of tokens used, so I might as well not have used compression in the first place. I think it depends on your documents or your index, though.
Here's an overview of my code:
```python
existing_index_1 = Pinecone.from_existing_index(index_name, embeddings, namespace='random-namespace')
retriever_1 = existing_index_1.as_retriever(search_kwargs={"k": 6})
compressor_1 = LLMChainExtractor.from_llm(llm_standard)
compression_retriever_1 = ContextualCompressionRetriever(base_compressor=compressor_1, base_retriever=retriever_1)
ss_general_info = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=compression_retriever_1, return_source_documents=True
)
```
I also tried using the `EmbeddingsFilter`:
```python
embeddings_filter_1 = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.76)
compression_retriever_1 = ContextualCompressionRetriever(base_compressor=embeddings_filter_1, base_retriever=retriever_1)
```
It gave me an error: `InvalidRequestError: This model's maximum context length is 16385 tokens. However, your messages resulted in 65086 tokens. Please reduce the length of the messages.` 🤣
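One way to avoid blowing the context window with `EmbeddingsFilter` is to split the retrieved documents into smaller chunks before filtering, along the lines of the `DocumentCompressorPipeline` example in the contextual compression docs (a sketch reusing the names above; the chunk size and threshold are assumptions to tune):

```python
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import (
    DocumentCompressorPipeline,
    EmbeddingsFilter,
)
from langchain.text_splitter import CharacterTextSplitter

# Split each retrieved document into small chunks, then keep only the
# chunks that clear the similarity threshold.
splitter = CharacterTextSplitter(chunk_size=300, chunk_overlap=0, separator=". ")
embeddings_filter_1 = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.76)
pipeline_compressor = DocumentCompressorPipeline(transformers=[splitter, embeddings_filter_1])
compression_retriever_1 = ContextualCompressionRetriever(
    base_compressor=pipeline_compressor, base_retriever=retriever_1
)
```

Lowering `k` in `search_kwargs` would also shrink the number of tokens reaching the model.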
I'll be trying out the solution you've suggested, and I appreciate your willingness to share it. In software development it's commonplace to use multiple approaches to achieve a single goal, but within a tool library that kind of inconsistency ideally shouldn't occur. Confusing 🤣.
Hi, @sawyerbutton! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.
Based on my understanding of the issue, the problem is that the `ContextualCompressionRetriever` is not returning any context when used with the `RetrievalQA.from_chain_type()` method, while the `FAISS` retriever successfully returns the context. There have been some suggestions in the comments, such as experimenting with different `similarity_threshold` values and different base retrievers. Another user shared their code using `LLMChainExtractor` and mentioned that adding compression increased the number of tokens used.
Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days.
Thank you for your understanding and contribution to the LangChain project!