[AI Evaluation] Microsoft.Extensions.AI.Evaluation evaluators don't handle responses that used tool calls well.
Description
As an example, the CoherenceEvaluator makes use of the TryGetUserRequest() extension method, which doesn't "try" very hard: it only looks at the last message to see if its Role is ChatRole.User, and if it is not, it returns false and sets the userRequest object to null. When tools are available to the model and are used, the last message in the conversation will likely be a tool message with a Role of ChatRole.Tool.
I think the evaluators should look back to the last message that has Role set to ChatRole.User, or at least as far back as the previous message with ChatRole.Assistant. The evaluation prompt should also be updated to take the tool messages into account, as the coherence of the response may depend on the data they return.
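To make the suggestion concrete, here is a minimal sketch of the kind of lookup I have in mind. It is not the library's implementation: Role, Message, and TryGetLastUserRequest are hypothetical stand-ins for ChatRole, ChatMessage, and the existing extension method, used only so the snippet compiles without any package references.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for ChatRole and ChatMessage so the sketch compiles on its own.
public enum Role { System, User, Assistant, Tool }

public sealed record Message(Role Role, string Text);

public static class UserRequestLookup
{
    // Hypothetical helper, not part of Microsoft.Extensions.AI.Evaluation.
    // Walks backwards to the most recent user message and also returns every
    // message after it (assistant tool calls, tool results), so that an
    // evaluation prompt could include them.
    public static bool TryGetLastUserRequest(
        IReadOnlyList<Message> messages,
        out Message? userRequest,
        out IReadOnlyList<Message> followUps)
    {
        for (int i = messages.Count - 1; i >= 0; i--)
        {
            if (messages[i].Role == Role.User)
            {
                userRequest = messages[i];

                // Collect everything that came after the user request.
                var tail = new List<Message>();
                for (int j = i + 1; j < messages.Count; j++)
                {
                    tail.Add(messages[j]);
                }

                followUps = tail;
                return true;
            }
        }

        // No user message anywhere in the conversation.
        userRequest = null;
        followUps = Array.Empty<Message>();
        return false;
    }
}
```

With a conversation of system, user, assistant (tool call), and tool messages, this returns true, yields the user message as userRequest, and returns the two trailing messages as followUps, whereas a check of only the last message would fail because its role is Tool.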
Reproduction Steps
- Have an IChatClient instance.
- Call .GetResponseAsync(chatMessages, callOptions, cancellationToken) with a callOptions object that has tools configured and a chatMessage with content that would cause the model to use at least one of the configured tools.
- Create a CoherenceEvaluator instance and call .EvaluateAsync, passing in the messages from the response.
Expected behavior
The evaluator is able to judge the coherence of the response and includes the most recent user request, along with any tool messages since then, in its reasoning.
Actual behavior
The evaluation includes a chain of thought containing something like:
First, I need to identify what the QUERY is - but I notice the QUERY field is completely empty. There is no question or prompt provided
Regression?
No response
Known Workarounds
No response
Configuration
.NET 9; Microsoft.Extensions.AI, Microsoft.Extensions.AI.Evaluation, and Microsoft.Extensions.AI.Evaluation.Quality, all version 9.9.0
Other information
https://github.com/dotnet/extensions/blob/53ef1158f9f42632e111d6873a8cd72b803b4ae6/src/Libraries/Microsoft.Extensions.AI.Evaluation.Quality/CoherenceEvaluator.cs#L89-L91
https://github.com/dotnet/extensions/blob/53ef1158f9f42632e111d6873a8cd72b803b4ae6/src/Libraries/Microsoft.Extensions.AI.Evaluation/ChatMessageExtensions.cs#L17-L44
https://github.com/dotnet/extensions/blob/53ef1158f9f42632e111d6873a8cd72b803b4ae6/src/Libraries/Microsoft.Extensions.AI.Evaluation.Quality/CoherenceEvaluator.cs#L182-L184