
Attach expected SQL queries to evals

Open evanameyer1 opened this issue 8 months ago • 2 comments

What is it?

Add an expected SQL query to each eval. This will be crucial for building tools that evaluate and track the model's accuracy.

I'm thinking I could also create a sandbox testing environment, so we can easily hook up any model to our evals and view the results. For example, I could run Gemini's new DS agent through it to evaluate it: https://github.com/opensource-observer/oso/issues/3673
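
To make the idea concrete, here is a minimal sketch of an eval with an attached expected query and a harness that any model can be plugged into. Everything in it is an assumption for illustration: `EvalCase`, `rows_match`, and `run_evals` are hypothetical names rather than existing oso code, SQLite stands in for the real query engine, and "correct" is taken to mean that the generated query returns the same rows as the expected one, ignoring row order.

```python
import sqlite3
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class EvalCase:
    """Hypothetical eval record: a question plus the SQL we expect for it."""
    question: str
    expected_sql: str


def rows_match(conn: sqlite3.Connection, expected_sql: str, actual_sql: str) -> bool:
    """Return True if both queries yield the same multiset of rows."""
    expected = Counter(conn.execute(expected_sql).fetchall())
    try:
        actual = Counter(conn.execute(actual_sql).fetchall())
    except (sqlite3.Error, sqlite3.Warning):
        # A generated query that fails to run counts as a miss.
        return False
    return expected == actual


def run_evals(
    model: Callable[[str], str],
    cases: Iterable[EvalCase],
    conn: sqlite3.Connection,
) -> float:
    """Run every case through `model` (question -> SQL) and return accuracy."""
    cases = list(cases)
    if not cases:
        return 0.0
    passed = sum(
        rows_match(conn, case.expected_sql, model(case.question))
        for case in cases
    )
    return passed / len(cases)
```

Because `model` is just a function from a question to a SQL string, swapping in a different agent only means wrapping it in a callable. Comparing result rows instead of query text lets differently written but equivalent queries pass.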

evanameyer1 avatar May 02 '25 02:05 evanameyer1

I suggest we find a way to build this into Phoenix rather than creating something bespoke. I'd sync with @ravenac95 on this.

ryscheng avatar May 04 '25 19:05 ryscheng

Closed here: https://github.com/opensource-observer/oso/pull/4013/

evanameyer1 avatar Jun 03 '25 21:06 evanameyer1