[Feature]: expose running evaluators via API to playground

Open aybruhm opened this issue 1 year ago • 4 comments

Description

This PR exposes the ability to run evaluators via an API.

Evaluators that have been tested

The following evaluators have been tested by the backend tests (and from the UI):

  • exact match
  • similarity match
  • regex test
  • webhook test
  • AI critique
  • starts with
  • contains
  • contains any
  • contains all
  • contains JSON
  • JSON diff
  • Levenshtein distance
  • RAG faithfulness
  • RAG context relevancy

The following evaluators have only been tested from the UI:

  • field match
  • custom code
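
For reviewers unfamiliar with the simpler string-based evaluators listed above, the sketch below shows what a few of them conventionally compute. It is not taken from this PR or from Agenta's codebase: every function name and signature here is hypothetical, and the real evaluators may differ in settings (for example case sensitivity) and in return types.

```python
from typing import Iterable


def exact_match(output: str, expected: str) -> bool:
    """Hypothetical sketch: True only if the output equals the expected answer."""
    return output == expected


def starts_with(output: str, prefix: str, case_sensitive: bool = True) -> bool:
    """Hypothetical sketch: True if the output begins with the given prefix."""
    if not case_sensitive:
        output, prefix = output.lower(), prefix.lower()
    return output.startswith(prefix)


def contains_any(output: str, substrings: Iterable[str]) -> bool:
    """Hypothetical sketch: True if at least one substring occurs in the output."""
    return any(s in output for s in substrings)


def contains_all(output: str, substrings: Iterable[str]) -> bool:
    """Hypothetical sketch: True if every substring occurs in the output."""
    return all(s in output for s in substrings)


def levenshtein_distance(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]
```

Under these assumptions, `levenshtein_distance("kitten", "sitting")` evaluates to 3 and `contains_all("a b c", ["a", "d"])` to `False`, which may serve as quick sanity inputs during QA.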

What to QA

The QA process should involve running the evaluators mentioned above from the UI.

Related Issue

Closes AGE-491

aybruhm Aug 01 '24 20:08

The latest updates on your projects.

Name                   Status     Updated (UTC)
agenta                 ✅ Ready   Aug 29, 2024 8:25am
agenta-documentation   ✅ Ready   Aug 29, 2024 8:25am

vercel[bot] Aug 01 '24 20:08

What is the QA status for this?

mmabrouk Aug 14 '24 19:08

The PR description has been updated with acceptance tests, and the PR has been handed over to @zenUnicorn for QA.

aybruhm Aug 15 '24 20:08

Thanks @aybruhm

QA Report (Testing Evaluator from the UI)

  • exact match ✅
  • similarity match ✅
  • regex test ✅
  • field match ✅
  • AI critique ✅
  • contains JSON ✅
  • JSON diff ✅
  • starts with ✅
  • contains ✅
  • contains any ✅
  • contains all ✅
  • Levenshtein distance ✅
  • RAG faithfulness ✅
  • RAG context relevancy ✅

zenUnicorn Aug 16 '24 11:08