
Enable Metadata Inclusion in Agent Input for R2R Use Case

Open crelocks opened this issue 9 months ago • 0 comments

Description:
Currently, the Agent endpoint does not include metadata in the input passed to the LLM during processing. This limitation makes it challenging to implement certain use cases, such as generating summaries of news articles while including the original URLs in the output. While a workaround exists using the RAG endpoint and the Completion API, it would be significantly more efficient to handle this in a single step using the Agent endpoint.

Current Approach for My Use Case:

  1. Ingested documents with the following structure:

    • Content: Title and text of the news article.
    • Metadata: URL of the news article.
  2. Approaches tested:

    • RAG Endpoint: Successfully retrieves metadata but does not pass it to the LLM.
    • Agent Endpoint: Provides references but does not include metadata (e.g., URLs) in the output. Modifying the prompts did not help, because metadata is not passed to the LLM by default.
  3. Workaround:

    • Use RAG to retrieve relevant documents and their metadata.
    • Pass parsed search results (including metadata) into the Completion API for final processing.
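
To make the workaround concrete, here is a minimal sketch of the second step: turning parsed search results into a single prompt that carries each article's URL. The result shape (`text` plus a `metadata` dict with a `url` key) and the function name `build_prompt` are assumptions for illustration, not R2R's actual response format or API; the retrieval and completion calls themselves are left out.

```python
from typing import Any


def build_prompt(query: str, results: list[dict[str, Any]]) -> str:
    """Combine retrieved chunks and their source URLs into one prompt.

    Each result is assumed (hypothetically) to look like:
    {"text": "...", "metadata": {"url": "..."}}
    """
    blocks = []
    for i, result in enumerate(results, start=1):
        # Fall back to "unknown" when a chunk has no URL in its metadata.
        url = result.get("metadata", {}).get("url", "unknown")
        blocks.append(f"[{i}] {result['text']}\nSource URL: {url}")
    context = "\n\n".join(blocks)
    return (
        "Answer using only the sources below and cite each source URL "
        "as a Markdown link.\n\n"
        f"{context}\n\nQuestion: {query}"
    )


# Example with made-up data in the assumed shape.
example_results = [
    {
        "text": "AI research is progressing rapidly with new models.",
        "metadata": {"url": "http://example.com/ai-news"},
    }
]
example_prompt = build_prompt(
    "Summarize the latest news about AI advancements.", example_results
)
```

The resulting string would then be sent to the Completion API as the user message, which is the extra round trip this issue asks to avoid.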

Examples of Expected Behavior:

  • Query: "Summarize the latest news about AI advancements."
    • Desired Output:
      • "AI research is progressing rapidly with new models introduced in 2025. See more details [here](http://example.com/ai-news)."

Feature Request:
Enable the Agent endpoint to include specific metadata in the input passed to the LLM, allowing the model to reference metadata in its response. This should be configurable to fit diverse use cases.
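
One possible shape for the configurable part, sketched under assumptions: an allowlist of metadata keys that decides which fields are appended to each chunk before it reaches the LLM. `MetadataConfig`, `include_keys`, and `render_chunk` are hypothetical names and do not exist in R2R today.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class MetadataConfig:
    # Hypothetical setting: metadata keys to expose to the LLM, in order.
    include_keys: list[str] = field(default_factory=list)


def render_chunk(text: str, metadata: dict[str, Any], config: MetadataConfig) -> str:
    """Append only the allowlisted metadata fields to a chunk's text."""
    lines = [text]
    for key in config.include_keys:
        # Skip keys the document does not carry instead of failing.
        if key in metadata:
            lines.append(f"{key}: {metadata[key]}")
    return "\n".join(lines)
```

With `MetadataConfig(include_keys=["url"])`, a chunk with metadata `{"url": "http://example.com/ai-news", "author": "x"}` would be rendered with its URL but not its author, and an empty config would keep today's behavior of passing text only.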

crelocks avatar Jan 25 '25 19:01 crelocks