langflow
agent returns random sources
I am using a vector store agent to retrieve answers from a knowledge base containing PDFs. The response I am getting contains 'answer' and 'source' keys, but the 'source' key contains random data rather than the PDFs loaded in the flow.
Hey @sampratha28! I'm Dosu, your friendly bot assistant. I'm here to help you with bugs, answer your questions, and guide you on your path to becoming a contributor. While we wait for a human maintainer to join us, let's see what we can do together!
Based on the information available in the langflow repository, it seems that the 'source' key in the response of the VectaraSelfQueryRetrieverComponent is not directly populated within the langflow codebase. The VectaraSelfQueryRetrieverComponent uses the SelfQueryRetriever class from the langchain.retrievers.self_query.base module to retrieve answers from the knowledge base, but it's not clear how the 'source' key is populated within this class.
Here is the relevant code from the VectaraSelfQueryRetrieverComponent:
def build(
    self,
    vectorstore: VectorStore,
    document_content_description: str,
    llm: BaseLanguageModel,
    metadata_field_info: List[str],
) -> BaseRetriever:
    metadata_field_obj = []
    for meta in metadata_field_info:
        meta_obj = json.loads(meta)
        if 'name' not in meta_obj or 'description' not in meta_obj or 'type' not in meta_obj:
            raise Exception('Incorrect metadata field info format.')
        attribute_info = AttributeInfo(
            name=meta_obj['name'],
            description=meta_obj['description'],
            type=meta_obj['type'],
        )
        metadata_field_obj.append(attribute_info)
    return SelfQueryRetriever.from_llm(
        llm,
        vectorstore,
        document_content_description,
        metadata_field_obj,
        verbose=True
    )
Source: src/backend/langflow/components/retrievers/VectaraSelfQueryRetriver.py
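To make the expected input concrete: the loop above parses each entry of metadata_field_info as a JSON string and requires the keys 'name', 'description', and 'type'. The sketch below mirrors only that validation step in plain Python, without langchain. The helper name parse_metadata_field_info and the two example entries are hypothetical and not taken from the langflow codebase.

```python
import json

# Keys that the build() method above requires in every entry.
REQUIRED_KEYS = ("name", "description", "type")


def parse_metadata_field_info(entries):
    """Parse JSON strings into dicts, mirroring the key checks in build()."""
    parsed = []
    for raw in entries:
        meta = json.loads(raw)
        missing = [key for key in REQUIRED_KEYS if key not in meta]
        if missing:
            raise ValueError(
                f"Incorrect metadata field info format; missing {missing}"
            )
        parsed.append({key: meta[key] for key in REQUIRED_KEYS})
    return parsed


# Hypothetical field descriptions for a PDF knowledge base.
example_entries = [
    '{"name": "source", "description": "Path of the PDF the chunk came from", "type": "string"}',
    '{"name": "page", "description": "Page number within the PDF", "type": "integer"}',
]

fields = parse_metadata_field_info(example_entries)
print([f["name"] for f in fields])
```

Whether describing a 'source' field this way changes what the agent reports is not something the code above confirms; it only shows the shape of input the component accepts.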
Unfortunately, without more information about the SelfQueryRetriever class and how it populates the 'source' key, it's difficult to provide a more definitive answer. It's possible that the 'source' key is populated elsewhere in the langchain codebase, or that it's populated based on the data in the VectorStore or the BaseLanguageModel.
If you have more information about the SelfQueryRetriever class or the data you're using, it might help to clarify this issue. For example, if you could provide a sample of the 'random data' you're seeing in the 'source' key, it might give some clues as to where this data is coming from.
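One way to gather such a sample is to look at the metadata attached to the retrieved chunks before they reach the agent. The sketch below is only an illustration under assumptions: it uses a stand-in Doc class rather than langchain's own document type, and it assumes the file path is stored under a 'source' metadata key, which you would need to verify for the loader used in your flow. The sample data is made up.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a retrieved document: text plus a metadata dict."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def count_sources(docs, key="source"):
    """Count how many retrieved chunks report each value under `key`."""
    return Counter(doc.metadata.get(key, "<missing>") for doc in docs)


# Hypothetical retrieval result; replace with what your retriever returns.
retrieved = [
    Doc("chunk A", {"source": "docs/manual.pdf", "page": 3}),
    Doc("chunk B", {"source": "docs/manual.pdf", "page": 7}),
    Doc("chunk C", {}),
]

source_counts = count_sources(retrieved)
for source, n in source_counts.items():
    print(f"{source}: {n}")
```

If the stored metadata already shows the expected PDF paths, the unexpected values are more likely produced later in the chain, for example by the language model when it writes the final answer. If the metadata is missing or wrong, the ingestion step is the place to look.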
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.