Getting always 0 in agent evaluation with Agent Goal accuracy
Hi team, I am using ragas==0.2.15. I have built an Investment Research Assistant, a LangChain-based conversational AI system designed to guide users in making informed investment decisions. It supports real-time financial data access (via yFinance and NewsAPI), performs retrieval-augmented generation (RAG) with OpenAI embeddings, and includes multi-turn evaluation using RAGAS.
Input query: here are 3 stocks - NVIDIA,Sofi technologies and Tesla. customer doesn't care much about sustainability. however customer is risk averse can you please recommend which stock he should buy?
Response: I recommend NVIDIA for the risk-averse customer.
I have attached my code here for reference.
✅ RAGAS multi-turn
if tool_usage:
    tool_calls = [ToolCall(name=tool, args={"company": company}) for tool in set(tool_usage)]
else:
    tool_calls = []

ragas_sample = MultiTurnSample(
    user_input=[
        RHMessage(content=message),
        RAMessage(content=reasoning_text, tool_calls=tool_calls),
    ],
    reference_tool_calls=tool_calls,
    reference_topics=["Stock price analysis and investment recommendations"],
    reference=answer,
)

tool_score = ToolCallAccuracy().multi_turn_score(ragas_sample)

evaluator_llm = llm_factory(model="gpt-3.5-turbo")
goal_scorer = AgentGoalAccuracyWithReference(llm=evaluator_llm)

# 3️⃣ Run scoring in async context
async def run_goal_accuracy():
    agent_score = await goal_scorer.multi_turn_ascore(ragas_sample)
    print("Agent Goal Accuracy:", agent_score)  # score.score is a float from 0 to 1
    return agent_score

# 4️⃣ Run the async function
goal_score = asyncio.run(run_goal_accuracy())
print("====goalscore=====")
print(goal_score)

scorer = TopicAdherenceScore(llm=evaluator_llm, mode="recall")

async def evaluate_topic_adherence():
    score = await scorer.multi_turn_ascore(ragas_sample)
    return score

# Run and store the result
topic_score = asyncio.run(evaluate_topic_adherence())

print("Tool usage:", tool_usage)
print("ans usage:", answer)

ragas_scores = {
    "single_turn": single_scores,
    "multi_turn": {
        "ToolCallAccuracy": tool_score,
        "AgentGoalAccuracy": goal_score,
        "TopicAdherenceScore": float(topic_score) if topic_score is not None else None,
    },
}
return answer, ragas_scores
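As a side note on the snippet above, the two separate asyncio.run calls could be folded into a single event loop. The following is only a minimal sketch of that pattern: StubScorer and score_all are hypothetical names introduced for illustration and are not part of ragas. The stub returns a fixed value so the example runs without an LLM or API key. This is not expected to change the goal accuracy score by itself.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class StubScorer:
    """Hypothetical stand-in for an async metric; returns a fixed score."""
    name: str
    value: float

    async def multi_turn_ascore(self, sample: object) -> float:
        # Yield to the event loop once, as a real network call would.
        await asyncio.sleep(0)
        return self.value


async def score_all(sample: object, scorers: list[StubScorer]) -> dict[str, float]:
    """Run every scorer concurrently and map scorer name -> score."""
    results = await asyncio.gather(*(s.multi_turn_ascore(sample) for s in scorers))
    return {s.name: r for s, r in zip(scorers, results)}


if __name__ == "__main__":
    scorers = [
        StubScorer("AgentGoalAccuracy", 0.0),
        StubScorer("TopicAdherenceScore", 0.5),
    ]
    # One event loop for all metrics instead of one asyncio.run per metric.
    print(asyncio.run(score_all(sample=None, scorers=scorers)))
```

In the real code, the stub objects would be replaced by the existing goal_scorer and scorer instances, assuming both expose multi_turn_ascore as used in the snippet.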
Please help me get a correct Agent Goal Accuracy score. Whether I use the metric with a reference or without one, the score value does not change.
Thanks, Divya
Can someone please help with the above question?
Hi team, I’d like to work on this issue (#2122) regarding “Getting always 0 in agent evaluation with Agent Goal accuracy.”
From my understanding, the problem seems related to how AgentGoalAccuracyWithReference computes scores in multi-turn evaluation, possibly not handling certain MultiTurnSample structures correctly. I’d like to investigate this and propose a fix. Could you please assign this issue to me?
Thanks!
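A possible starting point for that investigation is to check whether the sample looks like a complete conversation. The sample in the issue contains a human message followed by a single AI message that carries tool calls, with no tool output and no final answer turn. Whether that shape is what drives the score to 0 is an assumption to be verified, not a confirmed cause. The sketch below uses a simplified, hypothetical Turn type (not the ragas message classes) and a hypothetical find_structure_issues helper to flag such gaps.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    """Hypothetical, simplified message: role is 'human', 'ai', or 'tool'."""
    role: str
    content: str
    tool_calls: list[str] = field(default_factory=list)


def find_structure_issues(turns: list[Turn]) -> list[str]:
    """Return human-readable warnings about an incomplete conversation."""
    if not turns:
        return ["conversation is empty"]

    issues = []
    last = turns[-1]
    # A goal-oriented judge needs a final answer to compare with the reference.
    if last.role != "ai" or last.tool_calls:
        issues.append("conversation does not end with a plain AI answer")

    # Every AI turn that calls tools should be followed by a tool result.
    for i, turn in enumerate(turns):
        if turn.role == "ai" and turn.tool_calls:
            nxt = turns[i + 1] if i + 1 < len(turns) else None
            if nxt is None or nxt.role != "tool":
                issues.append(f"turn {i}: tool call has no tool result after it")
    return issues


if __name__ == "__main__":
    # Same shape as the sample in the issue: human turn, then AI turn with tool calls.
    sample = [
        Turn("human", "Which stock should a risk-averse customer buy?"),
        Turn("ai", "reasoning text", tool_calls=["get_stock_price"]),
    ]
    for issue in find_structure_issues(sample):
        print(issue)
```

If both warnings appear for the real sample, one experiment would be to append the tool output and the final recommendation as additional turns and compare the resulting score.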