
Test data generation Ground Truth column is missing

Open • kaan9700 opened this issue 5 months ago • 1 comment

Your Question

I am using the following code to evaluate my dataset. I recently upgraded ragas from 0.1.13 to 0.1.18 to use the new metrics (noise_sensitivity_relevant, noise_sensitivity_irrelevant). Now I get an error saying that the noise_sensitivity_relevant metric requires the ground_truth column in the dataset, even though this column is present in the dataset.

import pandas as pd
from datasets import Dataset
from ragas import evaluate, RunConfig
from ragas.metrics import (
    answer_relevancy,
    faithfulness,
    context_recall,
    context_precision,
    context_entity_recall,
    answer_correctness,
    answer_similarity,
    noise_sensitivity_relevant,
    noise_sensitivity_irrelevant
)

from dotenv import load_dotenv

load_dotenv()

df = pd.read_excel('testset_full-100-ES.xlsx')

# Create the 'data_samples' dictionary structure
data_samples = {
    'question': df['question'].tolist(),
    'answer': df['answer'].tolist(),
    'contexts': df['contexts'].apply(lambda x: [x] if pd.notna(x) else []).tolist(),
    'ground_truth': df['ground_truth'].tolist()
}
dataset = Dataset.from_dict(data_samples)
print(dataset)
run_config = RunConfig(timeout=120)

result = evaluate(
    dataset,
    metrics=[
        context_precision,
        faithfulness,
        answer_relevancy,
        context_recall,
        context_entity_recall,
        answer_correctness,
        answer_similarity,
        noise_sensitivity_relevant,
        noise_sensitivity_irrelevant,
    ],
    run_config=run_config
)

df = result.to_pandas()
print(result)
# save evaluation results to csv
df.to_csv('results-100-ES.csv', index=False)

Traceback (most recent call last):
  File "C:\Users\Kaan9\bitbucket\rag\Evaluation\scripts\eval.py", line 33, in <module>
    result = evaluate(
  File "C:\Users\Kaan9\miniconda3\envs\rag\lib\site-packages\ragas\_analytics.py", line 129, in wrapper
    result = func(*args, **kwargs)
  File "C:\Users\Kaan9\miniconda3\envs\rag\lib\site-packages\ragas\evaluation.py", line 177, in evaluate
    validate_required_columns(dataset, metrics)
  File "C:\Users\Kaan9\miniconda3\envs\rag\lib\site-packages\ragas\validation.py", line 62, in validate_required_columns
    raise ValueError(
ValueError: The metric [noise_sensitivity_relevant] that that is used requires the following additional columns ['ground_truth'] to be present in the dataset.
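
Before looking for the cause inside ragas, it can help to rule out a data problem: pandas reads empty Excel cells as NaN, so a column can exist and still contain empty rows. The sketch below checks a dictionary shaped like data_samples for missing columns and empty values. The helper name find_column_problems and the sample rows are hypothetical and are not part of ragas; the check does not reproduce ragas' own validation, so a clean result only means the input data is not the obvious culprit.

```python
import math


def find_column_problems(data_samples, required_columns):
    """Map each problematic required column to a short description.

    Hypothetical helper for illustration; it is not a ragas function.
    """
    problems = {}
    for column in required_columns:
        if column not in data_samples:
            problems[column] = "missing"
            continue
        # Collect row indices whose value is None, NaN, or a blank string.
        empty_rows = [
            i
            for i, value in enumerate(data_samples[column])
            if value is None
            or (isinstance(value, float) and math.isnan(value))
            or (isinstance(value, str) and not value.strip())
        ]
        if empty_rows:
            problems[column] = f"empty values at rows {empty_rows}"
    return problems


# Made-up rows that mimic the structure of data_samples above.
sample = {
    "question": ["What is RAG?", "What is ragas?"],
    "answer": ["A retrieval-based approach.", "An evaluation library."],
    "contexts": [["some retrieved text"], []],
    "ground_truth": ["Retrieval-augmented generation", float("nan")],
}

problems = find_column_problems(
    sample, ["question", "answer", "contexts", "ground_truth"]
)
print(problems)
```

With the script above, the same call could be made on data_samples just before Dataset.from_dict. An empty dictionary would suggest that the ground_truth column is present and populated, which points toward the column validation in the installed ragas version and away from the spreadsheet.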

kaan9700 • Sep 13 '24 20:09