[Feature]: expose running evaluators via API to the playground
Description
This PR makes it possible to run evaluators via an API, so that they can be invoked from the playground.
Evaluators that have been tested
The following evaluators are covered by the backend tests and have also been tested from the UI:
- exact match
- similarity match
- regex test
- webhook test
- AI critique
- starts with
- contains
- contains any
- contains all
- contains JSON
- JSON diff
- Levenshtein distance
- RAG faithfulness
- RAG context relevancy
The following evaluators have only been tested from the UI:
- field match
- custom code
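For reviewers unfamiliar with the simpler string-based evaluators listed above, here is a minimal sketch of the checks they are commonly understood to perform. It is illustrative only: the function names, signatures, and the JSON-extraction heuristic are assumptions made for this example, not the implementations added in this PR.

```python
import json


def exact_match(output: str, expected: str) -> bool:
    """Hypothetical helper: True when the output equals the expected answer."""
    return output == expected


def starts_with(output: str, prefix: str, case_sensitive: bool = True) -> bool:
    """Hypothetical helper: True when the output begins with the prefix."""
    if not case_sensitive:
        output, prefix = output.lower(), prefix.lower()
    return output.startswith(prefix)


def contains_any(output: str, substrings: list[str]) -> bool:
    """Hypothetical helper: True when at least one substring occurs."""
    return any(s in output for s in substrings)


def contains_all(output: str, substrings: list[str]) -> bool:
    """Hypothetical helper: True when every substring occurs."""
    return all(s in output for s in substrings)


def contains_json(output: str) -> bool:
    """Assumed heuristic: the text from the first '{' to the last '}' parses as JSON."""
    start, end = output.find("{"), output.rfind("}")
    if start == -1 or end < start:
        return False
    try:
        json.loads(output[start : end + 1])
    except json.JSONDecodeError:
        return False
    return True


def levenshtein_distance(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) via dynamic programming."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete char_a
                    current[j - 1] + 1,      # insert char_b
                    previous[j - 1] + cost,  # substitute or keep
                )
            )
        previous = current
    return previous[-1]
```

As a quick sanity check, `levenshtein_distance("kitten", "sitting")` returns 3. The actual evaluators may normalize, threshold, or score results differently, so treat this only as a reference for what each check is meant to verify during QA.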
What to QA
The QA process should involve running the evaluators mentioned above from the UI.
Related Issue
Closes AGE-491
What is the QA status for this?
The PR description has been updated with acceptance tests, and the PR has been handed over to @zenUnicorn for QA.
Thanks @aybruhm
QA Report (testing evaluators from the UI)
- exact match ✅
- similarity match ✅
- regex test ✅
- field match ✅
- AI critique ✅
- contains JSON ✅
- JSON diff ✅
- starts with ✅
- contains ✅
- contains any ✅
- contains all ✅
- Levenshtein distance ✅
- RAG faithfulness ✅
- RAG context relevancy ✅