verifiers
feat(eval): add support for grouped reward summaries and reports
- Add command-line options to group rewards by task or custom keys
- Implement grouped summary computation for evaluation results
- Log grouped reward and metric statistics during evaluation
- Enhance HTML report to show grouped reward summaries by specified keys
- Add grouped metric calculations in GRPOTrainer for detailed analysis
- Ensure backward compatibility with default grouping by task
- Handle exceptions in HTML report generation gracefully
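For reviewers, the grouped summary computation described above could look roughly like the sketch below. This is not the code from this PR: the function name `grouped_reward_summary`, the result-row layout (a dict with a `reward` field and a grouping field), and the `"unknown"` fallback group are all assumptions made for illustration. The default `group_key="task"` mirrors the backward-compatible default mentioned in the list.

```python
from collections import defaultdict
from statistics import mean


def grouped_reward_summary(results, group_key="task"):
    """Group result rows by `group_key` and summarise rewards per group.

    Hypothetical helper: `results` is assumed to be a list of dicts,
    each with a numeric "reward" field and (usually) a `group_key` field.
    """
    buckets = defaultdict(list)
    for row in results:
        # Rows missing the key go to a catch-all group instead of raising.
        buckets[row.get(group_key, "unknown")].append(row["reward"])
    return {
        group: {
            "count": len(rewards),
            "mean": mean(rewards),
            "min": min(rewards),
            "max": max(rewards),
        }
        for group, rewards in buckets.items()
    }


# Example rows (made-up data, for illustration only).
results = [
    {"task": "math", "reward": 1.0},
    {"task": "math", "reward": 0.0},
    {"task": "code", "reward": 0.5},
]
summary = grouped_reward_summary(results)
```

Grouping by a custom key would then be a matter of passing a different `group_key`, provided that field exists on the result rows.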
Solves #298. Screenshots from testing will follow soon; open for review.
Ah, we deprecated the report feature, sorry!