Hi team, I am using ragas==0.2.15. I have built an Investment Research Assistant, a LangChain-based conversational AI system designed to guide users in making informed investment decisions. It supports real-time financial data access (via yFinance and NewsAPI), performs retrieval-augmented generation (RAG) with OpenAI embeddings, and includes multi-turn evaluation using RAGAS.

input query : here are 3 stocks - NVIDIA,Sofi technologies and Tesla. customer doesn't care much about sustainability. however customer is risk averse can you please recommend which stock he should buy?

response : I recommend NVIDIA for the risk-averse customer.

I have attached my code here for reference.

✅ RAGAS multi-turn

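# Imports assumed from the names used below (RHMessage / RAMessage look like aliases).
import asyncio
from ragas.dataset_schema import MultiTurnSample
from ragas.llms import llm_factory
from ragas.messages import AIMessage as RAMessage, HumanMessage as RHMessage, ToolCall
from ragas.metrics import AgentGoalAccuracyWithReference, ToolCallAccuracy, TopicAdherenceScore

# One ToolCall per distinct tool used; an empty list when no tool was called.
if tool_usage:
    tool_calls = [ToolCall(name=tool, args={"company": company}) for tool in set(tool_usage)]
else:
    tool_calls = []

# Single-exchange conversation: the user message, then the agent's reply with its tool calls.
ragas_sample = MultiTurnSample(
    user_input=[
        RHMessage(content=message),
        RAMessage(content=reasoning_text, tool_calls=tool_calls),
    ],
    reference_tool_calls=tool_calls,
    reference_topics=["Stock price analysis and investment recommendations"],
    reference=answer,
)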


tool_score = ToolCallAccuracy().multi_turn_score(ragas_sample)
evaluator_llm = llm_factory(model="gpt-3.5-turbo")
goal_scorer = AgentGoalAccuracyWithReference(llm=evaluator_llm)

# 3️⃣ Run scoring in async context
async def run_goal_accuracy():
    agent_score = await goal_scorer.multi_turn_ascore(ragas_sample)
    print("Agent Goal Accuracy:", agent_score)  # multi_turn_ascore returns the score as a plain float
    return agent_score

# 4️⃣ Run the async function
goal_score = asyncio.run(run_goal_accuracy())
print("====goalscore=====")
print(goal_score)
scorer = TopicAdherenceScore(llm=evaluator_llm, mode="recall")
async def evaluate_topic_adherence():
    score = await scorer.multi_turn_ascore(ragas_sample)
    return score

# Run and store the result
topic_score = asyncio.run(evaluate_topic_adherence())
print("Tool usage:", tool_usage)
print("ans usage:", answer)

ragas_scores = {
    "single_turn": single_scores,
    "multi_turn": {
        "ToolCallAccuracy": tool_score,
        "AgentGoalAccuracy": goal_score,
        "TopicAdherenceScore": float(topic_score) if topic_score is not None else None,
    }
}

return answer, ragas_scores

Please help me get a meaningful goal accuracy score. Whether I use the metric with a reference or without one, the score value does not change.

Thanks, Divya

Jul 18 '25 07:07 divyamuralidharan-ai

Can someone please help with the above question?

Jul 29 '25 06:07 divyamuralidharan-ai

Hi team, I’d like to work on this issue (#2122) regarding “Getting always 0 in agent evaluation with Agent Goal accuracy.”

From my understanding, the problem seems related to how AgentGoalAccuracyWithReference computes scores in multi-turn evaluation, possibly not handling certain MultiTurnSample structures correctly. I’d like to investigate this and propose a fix. Could you please assign this issue to me?

Thanks!

Sep 04 '25 04:09 Rahul2512Chauhan
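
One way to narrow this down is to score the same samples with both scorer variants side by side. The ragas documentation describes agent goal accuracy as a binary metric, so a single sample can only yield 0 or 1, and comparing over several samples is more informative. Below is a minimal sketch of such a comparison. `compare_scorers` and `ConstantScorer` are hypothetical helpers, not part of ragas. The only thing assumed of a scorer is the async `multi_turn_ascore(sample)` method already used in the post.

```python
import asyncio
from statistics import mean


async def compare_scorers(samples, scorers):
    """Score every sample with every scorer and collect the results.

    `scorers` maps a label to any object exposing the async method
    `multi_turn_ascore(sample)`. This helper is illustrative only.
    """
    results = {}
    for label, scorer in scorers.items():
        scores = [float(await scorer.multi_turn_ascore(s)) for s in samples]
        results[label] = {"scores": scores, "mean": mean(scores)}
    return results


class ConstantScorer:
    """Offline stand-in for a real metric (hypothetical, not a ragas class)."""

    def __init__(self, value):
        self.value = value

    async def multi_turn_ascore(self, sample):
        # Ignores the sample and returns a fixed score, so the sketch runs without an LLM.
        return self.value


demo = asyncio.run(
    compare_scorers(
        samples=["sample-1", "sample-2"],
        scorers={"with_reference": ConstantScorer(0.0),
                 "without_reference": ConstantScorer(1.0)},
    )
)
print(demo)
```

In a real run, the stand-ins would be replaced by the scorers from the post, for example `goal_scorer` for the with-reference case and, assuming it is available in the installed version, `AgentGoalAccuracyWithoutReference(llm=evaluator_llm)` for the other, with a list of `MultiTurnSample` objects as `samples`. If every sample scores 0 under both variants, the sample structure itself (for instance, what the final agent message contains) is the next thing to inspect.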