RetrievalQA.from_chain_type's retriever parameter cannot use ContextualCompressionRetriever
System Info
Based on the official docs, I created two types of retrievers:
- `faiss_retriever = FAISS.from_documents(texts, OpenAIEmbeddings()).as_retriever()`, serving as a `VectorStoreRetriever` (referenced from the API doc)
- `compression_retriever = ContextualCompressionRetriever(base_compressor=relevant_filter, base_retriever=retriever)`, functioning as a `ContextualCompressionRetriever` (also referenced from the API doc)
Then I ran RetrievalQA to retrieve the relevant content through the chain with the code below:
```python
qa = RetrievalQA.from_chain_type(llm=OpenAI(verbose=True), chain_type="stuff", retriever=compression_retriever, return_source_documents=True, chain_type_kwargs=chain_type_kwargs)
# or
qa = RetrievalQA.from_chain_type(llm=OpenAI(verbose=True), chain_type="stuff", retriever=faiss_retriever, return_source_documents=True, chain_type_kwargs=chain_type_kwargs)
```
The result: the qa built with compression_retriever failed to return any context for the prompt (it returned an empty array), whereas the qa built with faiss_retriever returned the context successfully.
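One way to narrow this down is to call both retrievers directly, outside the chain, and compare the results. A minimal sketch, assuming the objects defined above and a test question of your own:

```python
# Sketch: compare the two retrievers outside of RetrievalQA.
# Assumes faiss_retriever and compression_retriever from above.
query = "your test question"
print(len(faiss_retriever.get_relevant_documents(query)))        # base retriever hits
print(len(compression_retriever.get_relevant_documents(query)))  # hits surviving the filter
```

If the second call already returns `[]`, the filter is dropping everything; if it returns documents, the problem lies in how the chain invokes the retriever.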
Who can help?
@hwchase17 @eyurtsev
Information
- [ ] The official example notebooks/scripts
- [X] My own modified scripts
Related Components
- [ ] LLMs/Chat Models
- [ ] Embedding Models
- [ ] Prompts / Prompt Templates / Prompt Selectors
- [ ] Output Parsers
- [ ] Document Loaders
- [X] Vector Stores / Retrievers
- [ ] Memory
- [ ] Agents / Agent Executors
- [ ] Tools / Toolkits
- [ ] Chains
- [ ] Callbacks/Tracing
- [ ] Async
Reproduction
```python
from langchain.chains import RetrievalQA
from langchain.document_loaders import UnstructuredFileLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import EmbeddingsFilter
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS

loader = UnstructuredFileLoader("NER.txt")
document = loader.load()
separators = ["。", " "]
text_splitter = RecursiveCharacterTextSplitter(separators=separators, chunk_size=500, chunk_overlap=0)
texts = text_splitter.split_documents(document)

embeddings = OpenAIEmbeddings()
relevant_filter = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.81)
retriever = FAISS.from_documents(texts, OpenAIEmbeddings()).as_retriever()  # base retriever
compression_retriever = ContextualCompressionRetriever(base_compressor=relevant_filter, base_retriever=retriever)  # document compression retriever

prompt_template1 = """Please use the context to answer the question.
{context}
question: {question}
answer:"""
PROMPT = PromptTemplate(
    template=prompt_template1, input_variables=["context", "question"]
)
chain_type_kwargs = {"prompt": PROMPT, "verbose": True}

qa = RetrievalQA.from_chain_type(llm=OpenAI(verbose=True), chain_type="stuff", retriever=compression_retriever, return_source_documents=True, chain_type_kwargs=chain_type_kwargs)
query = "balabalabala"  # replace it with your question
result = qa({"query": query})
print(result)
```
Expected behavior
When using `ContextualCompressionRetriever` with `RetrievalQA`, the chain should return a non-empty context, i.e. not `[]`.
Answer generated by a 🤖
Answer
I understand that you're experiencing an issue where the `ContextualCompressionRetriever` is returning an empty array when used with the `RetrievalQA.from_chain_type` method. This is happening despite the fact that when you use `FAISS` as a `VectorStoreRetriever`, you're able to retrieve the context successfully.
Based on the information you've provided, it seems that the `ContextualCompressionRetriever` is not able to find any documents that meet the `similarity_threshold` of 0.81 you've set in the `EmbeddingsFilter`. This could be due to the specific texts and query in your dataset.
I suggest experimenting with different `similarity_threshold` values and different base retrievers. For instance, you could try using a lower `similarity_threshold` value, like 0.75, and see if that returns any results. Here's how you can modify your `EmbeddingsFilter` and base retriever:
```python
# Using FAISS as base retriever
base_retriever = FAISS.from_documents(texts, OpenAIEmbeddings()).as_retriever()

# Using a lower similarity_threshold
relevant_filter = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.75)
compression_retriever = ContextualCompressionRetriever(base_compressor=relevant_filter, base_retriever=base_retriever)
```
Please note that the `similarity_threshold` and the choice of base retriever are hyperparameters that you might need to tune based on your specific use case and dataset.
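For instance, a quick sweep over candidate thresholds (a hypothetical tuning loop reusing the names above; `query` stands in for your own test question) might look like:

```python
# Sketch: sweep similarity_threshold and count surviving documents.
for threshold in (0.60, 0.65, 0.70, 0.75, 0.80, 0.81):
    f = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=threshold)
    r = ContextualCompressionRetriever(base_compressor=f, base_retriever=base_retriever)
    print(threshold, len(r.get_relevant_documents(query)))
```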
If you find that this solution works for you and you believe it's a bug that could impact other users, you're welcome to make a pull request with your findings. This could be a valuable contribution to the LangChain community.
Sources
- docs/snippets/modules/data_connection/retrievers/contextual_compression/get_started.mdx
- tests/integration_tests/retrievers/test_contextual_compression.py
This response is meant to be useful, save you time, and share context. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
Thanks, bot, but this does not work.
I have tried using `similarity_threshold = 0.81` to retrieve content directly with the code below, and got a non-empty list as a result.
```python
retriever = FAISS.from_documents(texts, OpenAIEmbeddings()).as_retriever()
compression_retriever = ContextualCompressionRetriever(base_compressor=relevant_filter, base_retriever=retriever)
compressed_docs = compression_retriever.get_relevant_documents("replace question with your own")
pretty_print_docs(compressed_docs)
```
So, any ideas?
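One thing worth checking is whether the chain passes the same query string to the retriever that was used in the direct call. A debugging sketch, assuming a pydantic-style `BaseRetriever` (newer LangChain versions; older ones override `get_relevant_documents` directly, and the `LoggingRetriever` name is hypothetical):

```python
from typing import List

from langchain.callbacks.manager import CallbackManagerForRetrieverRun
from langchain.schema import BaseRetriever, Document


class LoggingRetriever(BaseRetriever):
    """Hypothetical wrapper that logs what RetrievalQA actually sends."""
    inner: BaseRetriever

    def _get_relevant_documents(
        self, query: str, *, run_manager: CallbackManagerForRetrieverRun
    ) -> List[Document]:
        print(f"query passed to retriever: {query!r}")
        docs = self.inner.get_relevant_documents(query)
        print(f"retriever returned {len(docs)} documents")
        return docs


qa = RetrievalQA.from_chain_type(
    llm=OpenAI(verbose=True),
    chain_type="stuff",
    retriever=LoggingRetriever(inner=compression_retriever),
    return_source_documents=True,
    chain_type_kwargs=chain_type_kwargs,
)
```

If the logged query matches the direct call but the count is zero, the discrepancy is inside the compression step; if the count is non-zero, the documents are being lost later in the chain.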
Has anyone else faced this issue? I tried forking the code and re-implementing parts of it, but it failed. I need some help here.
Hey, I haven't faced this issue yet, haha, but my goal was to reduce the number of documents retrieved by using compression. I tried one of the first examples, which used `LLMChainExtractor`, and it works well: it returned the exact document out of 6. But adding that step to the whole process increased the number of tokens used, so I might as well not have used compression in the first place. I think it depends on your documents or your index, though.
Here's an overview of my code:
```python
existing_index_1 = Pinecone.from_existing_index(index_name, embeddings, namespace='random-namespace')
retriever_1 = existing_index_1.as_retriever(search_kwargs={"k": 6})
compressor_1 = LLMChainExtractor.from_llm(llm_standard)
compression_retriever_1 = ContextualCompressionRetriever(base_compressor=compressor_1, base_retriever=retriever_1)
ss_general_info = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=compression_retriever_1, return_source_documents=True
)
```
I also tried using the `EmbeddingsFilter`:
```python
embeddings_filter_1 = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.76)
compression_retriever_1 = ContextualCompressionRetriever(base_compressor=embeddings_filter_1, base_retriever=retriever_1)
```
It gave me an error: `InvalidRequestError: This model's maximum context length is 16385 tokens. However, your messages resulted in 65086 tokens. Please reduce the length of the messages.` 🤣
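One way to avoid blowing the context window with `EmbeddingsFilter` is to split the retrieved documents into smaller chunks before filtering, along the lines of the `DocumentCompressorPipeline` example in the contextual compression docs (a sketch reusing the names above; the chunk size and threshold are assumptions to tune):

```python
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import (
    DocumentCompressorPipeline,
    EmbeddingsFilter,
)
from langchain.text_splitter import CharacterTextSplitter

# Split each retrieved document into small chunks, then keep only the
# chunks that clear the similarity threshold.
splitter = CharacterTextSplitter(chunk_size=300, chunk_overlap=0, separator=". ")
embeddings_filter_1 = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.76)
pipeline_compressor = DocumentCompressorPipeline(transformers=[splitter, embeddings_filter_1])
compression_retriever_1 = ContextualCompressionRetriever(
    base_compressor=pipeline_compressor, base_retriever=retriever_1
)
```

Lowering `k` in `search_kwargs` would also shrink the number of tokens reaching the model.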
I'll be trying out the solution you've suggested, and I appreciate your willingness to share it. In software development it's commonplace to use multiple approaches to achieve a single goal, but within a tool library that kind of inconsistency ideally shouldn't occur. Confusing 🤣.
Hi, @sawyerbutton! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.
Based on my understanding of the issue, the problem is that the `ContextualCompressionRetriever` is not returning any context when used with the `RetrievalQA.from_chain_type()` method, while the `FAISS` retriever successfully returns the context. There have been some suggestions in the comments, such as experimenting with different `similarity_threshold` values and different base retrievers. Another user shared their code using `LLMChainExtractor` and mentioned that adding compression increased the number of tokens used.
Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days.
Thank you for your understanding and contribution to the LangChain project!