Devise a scalable workflow for adding new evals

Open evanameyer1 opened this issue 8 months ago • 1 comments

What is it?

Per this "MVP" version here, we need to revisit the categorization and design a workflow that allows this system to:

  1. scale easily to hundreds of evals
  2. offer a way to define new categories/edge cases (as we uncover new ones) with minimal overlap
  3. plug into other evals infrastructure (Arize Phoenix)
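
To make the three requirements concrete, here is a minimal sketch of what an in-code registry might look like. Everything in it is an assumption for illustration: `EvalCase`, `EvalRegistry`, and their fields are hypothetical names, not the existing oso_agent code. It only enforces structural non-overlap (each eval sits in exactly one named category); it does not detect categories that overlap in meaning. It deliberately makes no Arize Phoenix calls and just exports plain rows that a separate uploader could consume.

```python
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class EvalCase:
    """One eval example. All field names are hypothetical."""
    id: str
    question: str
    expected: str
    category: str


@dataclass
class EvalRegistry:
    """In-code registry so evals and categories are versioned together."""
    categories: dict = field(default_factory=dict)  # name -> description
    cases: dict = field(default_factory=dict)       # id -> EvalCase

    def add_category(self, name: str, description: str) -> None:
        # Each category is defined exactly once.
        if name in self.categories:
            raise ValueError(f"category already defined: {name}")
        self.categories[name] = description

    def add_case(self, case: EvalCase) -> None:
        # Each case must belong to one known category and have a unique id.
        if case.category not in self.categories:
            raise ValueError(f"unknown category: {case.category}")
        if case.id in self.cases:
            raise ValueError(f"duplicate eval id: {case.id}")
        self.cases[case.id] = case

    def by_category(self) -> dict:
        # Group cases per category; empty categories are kept visible.
        grouped = {name: [] for name in self.categories}
        for case in self.cases.values():
            grouped[case.category].append(case)
        return grouped

    def to_rows(self) -> list:
        # Plain dict rows as a neutral export format for an external uploader.
        return [asdict(case) for case in self.cases.values()]
```

Adding a new edge case would then be one `add_category` call plus the `add_case` calls for its examples, which keeps the change reviewable in a normal pull request.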

evanameyer1 avatar May 08 '25 16:05 evanameyer1

@ryscheng do you have feedback on this?

ccerv1 avatar May 27 '25 15:05 ccerv1

Ya, I like the idea of moving this into code so that we can version and abstract as necessary. Let me circle back when I have an MVP dataset uploader together.

ryscheng avatar May 27 '25 22:05 ryscheng

@evanameyer1 lmk if you need more feedback. Feel free to dump the canonical workflow into this issue and close it out when you're ready.

ryscheng avatar Jun 03 '25 21:06 ryscheng

@ryscheng Sounds good, I'll get to this tomorrow!

evanameyer1 avatar Jun 03 '25 21:06 evanameyer1

Readme documenting this here: https://github.com/opensource-observer/oso/tree/main/warehouse/oso_agent/oso_agent/datasets/readme.md

evanameyer1 avatar Jun 05 '25 23:06 evanameyer1