verifiers

feat(eval): add --no-interleave-scoring flag

Open ob1-s opened this issue 2 months ago • 0 comments

Description

This PR adds a --no-interleave-scoring flag to vf-eval to control whether evaluate performs interleaved scoring.
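
For reviewers, here is a minimal sketch of how a negative boolean flag like this one is commonly wired up with argparse. It is not the code in this PR: the parser below is a stand-in, and the destination name interleave_scoring and the helper parse_interleave_scoring are hypothetical names chosen for illustration. The sketch assumes interleaved scoring stays on unless the flag is passed.

```python
import argparse
from typing import Sequence


def build_parser() -> argparse.ArgumentParser:
    # Stand-in parser; the real vf-eval parser has many more options.
    parser = argparse.ArgumentParser(prog="vf-eval")
    parser.add_argument(
        "--no-interleave-scoring",
        dest="interleave_scoring",  # hypothetical destination name
        action="store_false",       # passing the flag stores False; default is True
        help="Disable interleaved scoring.",
    )
    return parser


def parse_interleave_scoring(argv: Sequence[str]) -> bool:
    # Hypothetical helper: returns the boolean that would be forwarded to evaluate.
    return build_parser().parse_args(list(argv)).interleave_scoring
```

With action="store_false", omitting the flag leaves the value at True, so existing invocations would keep their current behavior under this assumption.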

Type of Change

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • [ ] Documentation update
  • [ ] Test improvement

Testing

  • [x] All existing tests pass when running uv run pytest locally.
  • [x] New tests have been added to cover the changes

Checklist

  • [x] My code follows the style guidelines of this project as outlined in AGENTS.md
  • [x] I have performed a self-review of my own code
  • [ ] I have commented my code, particularly in hard-to-understand areas
  • [x] I have made corresponding changes to the documentation
  • [x] My changes generate no new warnings
  • [x] Any dependent changes have been merged and published

Additional Notes

ob1-s, Oct 15 '25 21:10