Support multiple completions for ModelBasedClassify
Describe the feature or improvement you're requesting
It would be nice to be able to score multiple sample completions using ModelBasedClassify. Even when n>1 is passed to a completion function and multiple samples are returned, only the first one is graded, because of this line:
https://github.com/openai/evals/blob/main/evals/elsuite/utils.py#L193
Additional context
I would like to be able to raise the temperature, ask a model to produce N completions, and have each completion graded separately using a rubric. This appears to work fine for non-model-based scoring.
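To make the request concrete, here is a rough sketch of the behavior I have in mind. It is standalone Python, not the evals API: `grade_each`, `summarize`, and `toy_grader` are hypothetical names, and the toy grader only stands in for the model-graded rubric call.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence


def grade_each(
    completions: Sequence[str],
    grade_one: Callable[[str], str],
) -> List[str]:
    """Grade every sampled completion instead of only completions[0]."""
    return [grade_one(text) for text in completions]


def summarize(choices: Sequence[str]) -> Dict[str, float]:
    """Return the fraction of samples that received each rubric choice."""
    total = len(choices)
    if total == 0:
        return {}
    counts = Counter(choices)
    return {choice: n / total for choice, n in counts.items()}


# Hypothetical stand-in for the rubric grader; a real one would prompt
# the grading model with each completion and parse its choice.
def toy_grader(text: str) -> str:
    return "Y" if "42" in text else "N"


samples = ["The answer is 42.", "I think it is 41.", "42"]
choices = grade_each(samples, toy_grader)
print(choices)
print(summarize(choices))
```

Reporting the per-choice fractions per sample set is only one option; keeping the raw list of choices would also let each completion be recorded as its own event.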