
Document how we organize datasets for automatic evals

Open ravenac95 opened this issue 8 months ago • 1 comment

What is it?

We should have a document describing how datasets are discovered for automatic eval execution.

Current plan:

  • Each eval should have a frequency value, something like cron or on-deployments.
  • If the frequency is set to cron, a cron value should also be set.
  • Each dataset should have tags of the form eval:NAME_OF_EVAL, where the value is a boolean indicating whether that specific eval is enabled for that dataset.
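
To make the plan concrete, here is a minimal Python sketch of how discovery could work under it. Nothing here is existing code: `EvalConfig`, `Dataset`, and `datasets_for_eval` are hypothetical names, and the sketch assumes tags are stored as a mapping from tag name to boolean.

```python
from dataclasses import dataclass, field
from typing import Optional

# Frequency values taken from the plan above; the final set may differ.
ALLOWED_FREQUENCIES = ("cron", "on-deployments")


@dataclass
class EvalConfig:
    """Hypothetical eval definition: a name, a frequency, and an optional cron value."""
    name: str
    frequency: str
    cron: Optional[str] = None

    def __post_init__(self):
        if self.frequency not in ALLOWED_FREQUENCIES:
            raise ValueError(f"unknown frequency: {self.frequency!r}")
        # Plan rule: a cron frequency requires a cron value.
        if self.frequency == "cron" and not self.cron:
            raise ValueError("frequency 'cron' requires a cron value")


@dataclass
class Dataset:
    """Hypothetical dataset record; tags map 'eval:NAME_OF_EVAL' to a boolean."""
    name: str
    tags: dict = field(default_factory=dict)


def datasets_for_eval(eval_config, datasets):
    """Return the datasets whose 'eval:<name>' tag is explicitly True."""
    tag = f"eval:{eval_config.name}"
    return [d for d in datasets if d.tags.get(tag) is True]
```

In this sketch a missing tag is treated the same as a tag set to false, so datasets are opt-in per eval. Whether that is the right default is one of the things the document should settle.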

ravenac95 avatar May 08 '25 16:05 ravenac95

At the very least, we want to be able to specify metadata filters, for example:

!run_eval text2sql where the priority is high or something like that
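
One way such a filter could look, as a rough sketch only: the `key=value` syntax below is an assumption made for illustration (the comment above does not fix a syntax), and `parse_run_eval` and `matches_filters` are hypothetical helpers.

```python
import shlex


def parse_run_eval(command):
    """Parse '!run_eval NAME key=value ...' into (eval_name, filters).

    The key=value filter syntax is assumed for this sketch.
    """
    parts = shlex.split(command)
    if len(parts) < 2 or parts[0] != "!run_eval":
        raise ValueError("expected '!run_eval NAME [key=value ...]'")
    eval_name = parts[1]
    filters = {}
    for token in parts[2:]:
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError(f"bad filter: {token!r}")
        filters[key] = value
    return eval_name, filters


def matches_filters(metadata, filters):
    """True when every requested filter equals the dataset's metadata value."""
    return all(metadata.get(key) == value for key, value in filters.items())


# Example: run text2sql only on datasets whose priority is high.
eval_name, filters = parse_run_eval("!run_eval text2sql priority=high")
selected = matches_filters({"priority": "high", "owner": "data"}, filters)
```

Exact-match filters keep the sketch small; richer operators (ranges, negation) would need a real grammar.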

ryscheng avatar May 28 '25 20:05 ryscheng