When generating a test set for evaluating RAG, how do I add the name of the source file from which each QnA pair was generated?

Open · zaid1212 opened this issue 7 months ago · 1 comment

  • I checked the documentation and related resources and couldn't find an answer to my question.

Question

  • Previously, there was a feature that added the name of the source file beside each question. I used it to compute another metric, "Source Relevancy", which shows whether the RAG application picked the right file. That parameter no longer exists. How can I get the source file name in the generated test set?
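For context, here is a minimal sketch of what a "Source Relevancy" score could look like. The issue does not define the metric, so this is an assumed interpretation: the fraction of questions for which the expected source file appears among the files the RAG application retrieved. The function name `source_relevancy` and the choice to compare by file name only are hypothetical.

```python
from pathlib import Path


def source_relevancy(expected_sources, retrieved_sources):
    """Fraction of questions whose expected source file was retrieved.

    expected_sources: one source path per question (from the test set).
    retrieved_sources: one list of source paths per question (from the RAG app).
    Assumption: two paths refer to the same file if their file names match.
    """
    if len(expected_sources) != len(retrieved_sources):
        raise ValueError("expected_sources and retrieved_sources must align")
    if not expected_sources:
        return 0.0

    hits = 0
    for expected, retrieved in zip(expected_sources, retrieved_sources):
        expected_name = Path(expected).name
        # Count a hit if any retrieved path has the same file name.
        if any(Path(r).name == expected_name for r in retrieved):
            hits += 1
    return hits / len(expected_sources)


score = source_relevancy(
    ["docs/a.pdf", "docs/b.pdf"],
    [["a.pdf", "c.pdf"], ["c.pdf"]],
)
```

Computing this requires exactly the per-question source file name that the question asks about.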

Code Example

from langchain_community.document_loaders import DirectoryLoader
from ragas.testset import TestsetGenerator

loader = DirectoryLoader(path, glob="**/*", silent_errors=True, show_progress=True)
docs = loader.load()

generator = TestsetGenerator(llm=generator_llm, embedding_model=generator_embeddings)
dataset = generator.generate_with_langchain_docs(docs, testset_size=10)

Additional context

  • How can I make sure the metadata or the source file name is always generated as a column in my final dataset?
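One possible workaround, sketched under assumptions rather than taken from the ragas API: LangChain loaders such as `DirectoryLoader` record the file path in each document's `metadata["source"]`, so the source can be recovered after generation by matching each row's reference contexts back to the loaded documents. The helper `attach_sources` and the `source_files` key are hypothetical, and the sketch assumes each test set row exposes a `reference_contexts` list of strings. If the generator rewrites or prefixes the context text, the substring match below will miss.

```python
def attach_sources(rows, docs):
    """Add a 'source_files' list to each row by substring-matching contexts.

    rows: list of dicts, each with a 'reference_contexts' list of strings
          (assumed column name).
    docs: list of (source_path, full_text) pairs from the loaded documents.
    """

    def normalize(text):
        # Collapse whitespace so line wrapping does not break the match.
        return " ".join(text.split())

    normalized_docs = [(source, normalize(text)) for source, text in docs]

    result = []
    for row in rows:
        sources = []
        for context in row.get("reference_contexts", []):
            needle = normalize(context)
            if not needle:
                continue
            for source, haystack in normalized_docs:
                if needle in haystack and source not in sources:
                    sources.append(source)
        # Copy the row so the input is left unmodified.
        result.append({**row, "source_files": sources})
    return result


example_rows = [{"user_input": "q1", "reference_contexts": ["hello   world"]}]
example_docs = [("a.txt", "say hello\nworld now"), ("b.txt", "unrelated text")]
example_result = attach_sources(example_rows, example_docs)
```

With the code from the issue, the inputs could plausibly be built as `[(d.metadata["source"], d.page_content) for d in docs]` and `dataset.to_pandas().to_dict("records")`, assuming the installed ragas version returns a `reference_contexts` column.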


zaid1212 · May 23 '25 04:05