langchain
Weaviate Hybrid Search doesn't return source
I'm trying to use WeaviateHybridSearchRetriever in ConversationalRetrievalChain with return_source_documents=True. However, the returned documents don't include the source in their metadata, and I get KeyError: 'source'.
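import os

import weaviate
from langchain.chains import ConversationalRetrievalChain
from langchain.retrievers import WeaviateHybridSearchRetriever

WEAVIATE_URL = "http://localhost:8080"
client = weaviate.Client(url=WEAVIATE_URL)

retriever = WeaviateHybridSearchRetriever(client, index_name="langchain", text_key="text")

# combine_docs_chain, question_generator_chain, async_callback_manager,
# question and chat_history are defined elsewhere (not shown).
qa = ConversationalRetrievalChain(
    retriever=retriever,
    combine_docs_chain=combine_docs_chain,
    question_generator=question_generator_chain,
    callback_manager=async_callback_manager,
    verbose=True,
    return_source_documents=True,
    max_tokens_limit=4096,
)

result = qa({"question": question, "chat_history": chat_history})
# Raises KeyError: 'source' because the returned metadata has no "source" key.
source_file = os.path.basename(result["source_documents"][0].metadata["source"])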
I don't like this about the Weaviate SDK.
For now you can specify source here:
WeaviateHybridSearchRetriever(..., attributes=['source'])
However, this is not very reliable and won't even work when you have different kinds of sources in the same vector store, e.g. CSV and PDF data sources together.
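Until the retriever returns the source reliably, one way to avoid the KeyError is to read the key defensively instead of indexing it directly. The following is only a minimal sketch, not part of LangChain or the Weaviate SDK: `Doc` is a hypothetical stand-in for a LangChain document (anything with a `metadata` dict), and `source_basename` is an illustrative helper name.

```python
import os
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document (page_content + metadata)."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def source_basename(doc: Any, default: Optional[str] = None) -> Optional[str]:
    """Return the file name from doc.metadata["source"], or `default` if it is absent."""
    metadata = getattr(doc, "metadata", None) or {}
    source = metadata.get("source")  # .get() avoids KeyError when the key is missing
    if not source:
        return default
    return os.path.basename(str(source))


# Example: one document that carries a source and one that does not.
docs = [
    Doc("from a PDF", {"source": "/data/reports/q1.pdf"}),
    Doc("no source returned", {}),
]
names = [source_basename(d, default="unknown") for d in docs]
```

In the snippet from the original report, this would replace the direct `metadata["source"]` lookup, e.g. `source_basename(result["source_documents"][0], default="unknown")`. It only hides the missing key; it does not make the retriever return the source.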
Hi, @chrischjh! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.
Based on my understanding, the issue you reported was regarding the WeaviateHybridSearchRetriever in ConversationalRetrievalChain not returning the source in the metadata when return_source_documents=True, and a KeyError being raised. rohitgr7 commented on this issue and suggested a workaround by specifying the source in the attributes parameter. However, they mentioned that this workaround may not be reliable when different kinds of sources are used together.
Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on this issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.
Thank you for your contribution to the LangChain repository, and please don't hesitate to reach out if you have any further questions or concerns.