Adding custom question and ground_truth to the Ragas Testset.
How can we add custom questions and ground truth to the testset generated with the Ragas TestsetGenerator, as in the code below?
from ragas.testset.generator import TestsetGenerator
from ragas.testset.evolutions import simple, reasoning, multi_context

generator = TestsetGenerator.from_llama_index(
    generator_llm=llm,
    critic_llm=llm,
    embeddings=embed_model,
)

testset = generator.generate_with_llamaindex_docs(
    nodes,
    test_size=5,
    distributions={
        "simple": 0.5,
        "multi_context": 0.4,
        "reasoning": 0.1,
    },
)
To add custom questions and ground truth to the test set generated by the Ragas TestsetGenerator, you can define new prompts or modify existing ones in the src/ragas/testset/prompts.py file:
- Define a new prompt or modify an existing one to include your custom question and ground truth.
- Add the new prompt to the testset_prompts list.
Here is an example of adding a custom question and ground truth:
from ragas.llms.prompt import Prompt
# Assumes testset_prompts is the list defined in src/ragas/testset/prompts.py
from ragas.testset.prompts import testset_prompts

custom_question_prompt = Prompt(
    name="custom_question",
    instruction="Generate a question that can be fully answered from the given context. The question should be formed using the topic provided.",
    examples=[
        {
            "context": "The process of photosynthesis in plants involves converting light energy into chemical energy, using chlorophyll and other pigments to absorb light. This process is crucial for plant growth and the production of oxygen.",
            "keyphrase": "Photosynthesis",
            "question": "What is the role of photosynthesis in plant growth?",
        },
        {
            "context": "The Industrial Revolution, starting in the 18th century, marked a major turning point in history as it led to the development of factories and urbanization.",
            "keyphrase": "Industrial Revolution",
            "question": "How did the Industrial Revolution mark a major turning point in history?",
        },
        {
            "context": "The process of evaporation plays a crucial role in the water cycle, converting water from liquid to vapor and allowing it to rise into the atmosphere.",
            "keyphrase": "Evaporation",
            "question": "Why is evaporation important in the water cycle?",
        },
    ],
    input_keys=["context", "keyphrase"],
    output_key="question",
    output_type="str",
)

# Add the custom prompt to the testset_prompts list
testset_prompts.append(custom_question_prompt)
This code snippet defines a new prompt named custom_question and adds it to the testset_prompts list. You can modify the examples list to include your custom questions and ground truth.
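Before registering a prompt like this, it can help to confirm that every few-shot example carries all of the declared input keys plus the output key. The following is a minimal plain-Python sketch of such a check. The helper check_examples is hypothetical and not part of Ragas; it only mirrors the examples, input_keys, and output_key arguments shown above.

```python
def check_examples(examples, input_keys, output_key):
    """Return (index, missing_keys) pairs for examples that lack a required key.

    Hypothetical helper for illustration; it does not call Ragas.
    """
    required = set(input_keys) | {output_key}
    problems = []
    for index, example in enumerate(examples):
        missing = sorted(required - set(example))
        if missing:
            problems.append((index, missing))
    return problems


# Stand-in examples: the second one has no "question" entry.
examples = [
    {
        "context": "Evaporation converts liquid water to vapor.",
        "keyphrase": "Evaporation",
        "question": "Why is evaporation important in the water cycle?",
    },
    {
        "context": "Photosynthesis converts light energy into chemical energy.",
        "keyphrase": "Photosynthesis",
    },
]

problems = check_examples(examples, ["context", "keyphrase"], "question")
print(problems)  # [(1, ['question'])]
```

An empty result means every example matches the declared keys.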
Additionally, you can prepare your test dataset by ensuring it contains the required attributes for evaluation: question, contexts, answer, and ground_truth. You can use the following steps to load and prepare your dataset:
from datasets import load_dataset

# Load your dataset
eval_dataset = load_dataset("explodinggradients/prompt-engineering-guide-papers")
eval_dataset = eval_dataset["test"].to_pandas()

# Extract the required attributes
test_questions = eval_dataset["question"].values.tolist()
test_answers = eval_dataset["ground_truth"].values.tolist()

# Build your RAG pipeline and generate responses.
# build_query_engine, generate_responses, and documents are not defined in
# this snippet; you need to supply them for your own pipeline.
query_engine1 = build_query_engine(documents)
result_ds = generate_responses(query_engine1, test_questions, test_answers)

# Convert the result to a pandas DataFrame
result_df = result_ds.to_pandas()
print(result_df.head())
This ensures that your dataset is properly prepared with the necessary attributes for evaluating your RAG system [1][2].
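One way to get hand-written questions and ground truths into this flow is to append them to the test_questions and test_answers lists before the responses are generated. The sketch below uses plain Python lists only. The helper extend_with_custom and the sample data are hypothetical stand-ins, not part of Ragas.

```python
def extend_with_custom(test_questions, test_answers, custom_pairs):
    """Return new question and ground-truth lists with custom pairs appended.

    Hypothetical helper for illustration; the input lists are left unchanged.
    """
    if len(test_questions) != len(test_answers):
        raise ValueError("questions and ground truths must have the same length")
    questions = list(test_questions)
    answers = list(test_answers)
    for question, ground_truth in custom_pairs:
        if not question.strip() or not ground_truth.strip():
            raise ValueError("custom question and ground truth must be non-empty")
        questions.append(question)
        answers.append(ground_truth)
    return questions, answers


# Stand-in data in place of the lists extracted from the dataset.
generated_questions = ["What is few-shot prompting?"]
generated_answers = ["Prompting a model with a handful of worked examples."]

custom_pairs = [
    ("Why is evaporation important in the water cycle?",
     "It converts liquid water to vapor so it can rise into the atmosphere."),
]

questions, answers = extend_with_custom(
    generated_questions, generated_answers, custom_pairs
)
print(len(questions), len(answers))  # 2 2
```

The extended lists can then be passed on in place of test_questions and test_answers, so the custom rows pick up contexts and an answer from the same pipeline as the generated ones.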
@adityamity did this work for you?