Create an eval from traces

Open therealjohn opened this issue 2 months ago • 3 comments

Add new features that make it simpler to create evals by using traces as the dataset. Example:

Image

Why this matters

  • Evals are important for getting the most out of AI systems, but they are incredibly time-consuming and frustrating to create
  • It can be unclear what shape the data needs to be in for different eval types

Key scenarios this enables

  1. Detecting prompt regressions in agents
  2. Bulk model and prompt experimentation

MVP

  • The Input and Output can be automatically mapped to eval variables {{query}} and {{response}}. This is an assumption to streamline the experience. The same may be true for mapping to tool_definitions and tool_calls.
  • Start with options that map easily to trace data, such as Relevance, Task Adherence, Coherence, Similarity, Intent Resolution, and Tool Call Accuracy.
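
To make the proposed mapping concrete, here is a minimal sketch of how a trace could be turned into an eval row. The trace field names (`input`, `output`) and the helpers `trace_to_eval_row`, `render`, and `traces_to_jsonl` are hypothetical, chosen for illustration only. The issue does not specify a trace schema or an AI Toolkit API, so treat this as one possible shape rather than the planned implementation.

```python
import json
import re


def trace_to_eval_row(trace: dict) -> dict:
    """Map one trace record to eval variables.

    Assumes (hypothetically) that a trace stores its text under
    "input" and "output"; real trace schemas may differ.
    """
    row = {
        "query": trace["input"],
        "response": trace["output"],
    }
    # Tool fields are optional: copy them only when the trace has them.
    for key in ("tool_definitions", "tool_calls"):
        if key in trace:
            row[key] = trace[key]
    return row


def render(template: str, row: dict) -> str:
    """Fill {{name}} placeholders from a row; unknown names are left as-is."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        return str(row[name]) if name in row else match.group(0)

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


def traces_to_jsonl(traces: list) -> str:
    """Serialize mapped rows as JSON Lines, one eval row per trace."""
    return "\n".join(json.dumps(trace_to_eval_row(t)) for t in traces)
```

For example, `render("Q: {{query}} A: {{response}}", trace_to_eval_row(trace))` fills an evaluator prompt from a single trace, and `traces_to_jsonl(traces)` produces a dataset with one row per trace. Keeping the tool fields optional means the same mapping works for traces with and without tool use.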

therealjohn, Oct 22 '25 15:10

Thank you for contacting us! Any issue or feedback from you is important to us, and we will do our best to respond fully as soon as possible. Sometimes additional investigation may be needed; we will usually get back to you within 2 days by adding comments to this issue. Please stay tuned.

Prototype:

https://github.com/user-attachments/assets/0e3a55cc-ba1a-4881-99b3-1bc8ebc03d96

therealjohn, Oct 23 '25 20:10

Added to the AI Toolkit project backlog. We will plan this after Ignite.

MuyangAmigo, Oct 27 '25 06:10