opik
[FR]: UI: See Traces and LLM Calls of Evaluations when using LLM as Judges metrics
Proposal summary
Show in the UI the traces that are produced when using LLM-as-a-Judge metrics, including both general and LLM spans.
Motivation
When using LLM-as-a-Judge metrics in your Evaluations, it is useful to track the Evaluation itself because:
- Sometimes you want to iterate on the creation of the actual metric, and you need to analyze / compare the prompts that were used to evaluate the LLM-App.
- It allows tracking the full usage and cost of the Evaluation.
@jverre I like this idea. However, we need to decide how to enable/disable it. We'll likely need some flags to enable/disable Opik tracking for score functions.
This has been implemented and released.