giskard
RAGET: Possible miscalculation of all Ragas metrics, in particular Precision and Recall
Issue Type
Bug
Source
source
Giskard Library Version
2.11
Giskard Hub Version
OS Platform and Distribution
No response
Python version
No response
Installed python packages
No response
Current Behaviour?
Giskard RAGET passes the reference context as "contexts" when calling Ragas.
https://github.com/Giskard-AI/giskard/blob/main/giskard/rag/metrics/ragas_metrics.py
    ragas_sample = {
        "question": question_sample["question"],
        "answer": answer,
        "contexts": question_sample["reference_context"].split("\n\n"),
        "ground_truth": question_sample["reference_answer"],
    }
According to the Ragas documentation, "contexts" should hold the retrieved context, i.e. the one actually used to generate the answer.
For example, Precision and Recall both use {"question", "contexts", "ground_truth"}. If the reference context is passed as "contexts", these metrics evaluate the test set generation pipeline and not the RAG pipeline.
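To make the suggestion concrete, here is a minimal sketch of how the sample could be built so that the retrieved context takes priority. The helper `build_ragas_sample` and its `retrieved_contexts` parameter are hypothetical names of mine, not part of Giskard or Ragas, and the sample data is made up. I am assuming the RAG agent can return the documents it retrieved alongside its answer.

```python
from typing import Optional, Sequence


def build_ragas_sample(
    question_sample: dict,
    answer: str,
    retrieved_contexts: Optional[Sequence[str]] = None,
) -> dict:
    """Hypothetical helper: build the dict handed to Ragas for one question."""
    if retrieved_contexts is not None:
        # Proposed behaviour: score what the RAG pipeline actually retrieved.
        contexts = list(retrieved_contexts)
    else:
        # Current behaviour: fall back to the test set's reference context.
        contexts = question_sample["reference_context"].split("\n\n")
    return {
        "question": question_sample["question"],
        "answer": answer,
        "contexts": contexts,
        "ground_truth": question_sample["reference_answer"],
    }


# Made-up test question, for illustration only.
question_sample = {
    "question": "What is the capital of France?",
    "reference_context": "Paris is the capital of France.\n\nFrance is in Europe.",
    "reference_answer": "Paris",
}

# Current behaviour: "contexts" comes from the test set, whatever the retriever did.
current = build_ragas_sample(question_sample, "Paris")

# Proposed behaviour: "contexts" reflects the (here deliberately poor) retrieval.
proposed = build_ragas_sample(
    question_sample, "Paris", retrieved_contexts=["Lyon is a city in France."]
)
```

With the current behaviour, "contexts" is identical for any retriever, so Precision and Recall cannot tell a good retrieval from a bad one. Whether to keep a fallback to the reference context when no retrieved documents are available is a design choice I leave to the maintainers.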
Standalone code OR list down the steps to reproduce the issue
.
Relevant log output
No response