Add support for saving intermediate results during vf-eval

Open bsagevedant opened this issue 6 months ago • 2 comments

Save Intermediate Results During vf-eval

This PR addresses issue #251 by adding support for saving intermediate results during evaluation and enabling interleaved reward computation.

Changes

- Added configuration options to the Environment class:
  - save_intermediate: Enable saving intermediate results during rollout
  - interleave_rewards: Enable computing rewards after each rollout instead of batching
- Modified the run_rollouts method to:
  - Support saving intermediate results after each rollout
  - Support interleaving reward computation
  - Make both features optional and configurable
- Added comprehensive tests in test_intermediate_results.py
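
For readers skimming the thread, here is a minimal sketch of how the two flags could interact inside a rollout loop. It is not the code from this PR: `SketchEnvironment`, its `rollout` placeholder, the `reward_fn` signature, and the record layout are all hypothetical stand-ins chosen for illustration. Only the flag names `save_intermediate` and `interleave_rewards` and the method name `run_rollouts` come from the description above.

```python
import json
import logging
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SketchEnvironment:
    """Hypothetical stand-in for the Environment class; not the real API."""

    reward_fn: Callable[[str, str], float]  # assumed signature: (prompt, completion) -> reward
    save_intermediate: bool = False
    interleave_rewards: bool = False
    logger: logging.Logger = field(
        default_factory=lambda: logging.getLogger("sketch_env")
    )

    def rollout(self, prompt: str) -> str:
        # Placeholder for a model call; upper-casing keeps the sketch deterministic.
        return prompt.upper()

    def run_rollouts(self, prompts: List[str]) -> List[dict]:
        results = []
        for i, prompt in enumerate(prompts):
            completion = self.rollout(prompt)
            record = {"index": i, "prompt": prompt,
                      "completion": completion, "reward": None}
            if self.interleave_rewards:
                # Score immediately so the saved record already carries a reward.
                record["reward"] = self.reward_fn(prompt, completion)
            if self.save_intermediate:
                # Emit one JSON line per finished rollout via the logger.
                self.logger.info(json.dumps(record))
            results.append(record)
        if not self.interleave_rewards:
            # Batched mode: score everything once all rollouts are done.
            for record in results:
                record["reward"] = self.reward_fn(
                    record["prompt"], record["completion"]
                )
        return results
```

With both flags off, the loop reduces to the batched behavior. With `save_intermediate` on and `interleave_rewards` off, the records emitted mid-run have `reward` set to `None`, which is one reason the two options are separate.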

Testing

Added new test cases that verify:

- Intermediate results saving functionality
- Interleaved reward computation
- Configuration options
- Integration with existing evaluation methods

Notes

- Interleaved reward computation is optional because it is not fully compatible with some pairwise reward strategies.
- Intermediate results are logged using the environment's logger, which the user can customize.
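
One plausible reading of the pairwise caveat (an assumption, not something stated in this PR) is that a pairwise strategy scores each rollout relative to the others, so it needs the whole batch before any reward is final. The sketch below uses a hypothetical `pairwise_win_rate` function, not a reward from the library, to show how scoring after each rollout can disagree with scoring the finished batch.

```python
from typing import List


def pairwise_win_rate(scores: List[float]) -> List[float]:
    """Hypothetical pairwise reward: fraction of the other rollouts each one beats."""
    n = len(scores)
    if n < 2:
        return [0.0] * n  # nothing to compare against yet
    return [
        sum(s > other for j, other in enumerate(scores) if j != i) / (n - 1)
        for i, s in enumerate(scores)
    ]


scores = [0.9, 0.2, 0.5]  # made-up raw scores for three rollouts

# Batched: every rollout is compared against all the others.
batched = pairwise_win_rate(scores)

# Interleaved: rollout i is compared only against rollouts 0..i seen so far.
interleaved = [
    pairwise_win_rate(scores[: i + 1])[-1] for i in range(len(scores))
]

print(batched)      # [1.0, 0.0, 0.5]
print(interleaved)  # [0.0, 0.0, 0.5]
```

The first rollout gets 0.0 when scored alone but 1.0 once the full batch is available, so under this assumption an interleaved run would need a final re-scoring pass to match the batched result.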

Sep 24 '25 03:09 bsagevedant

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

:white_check_mark: willccbb
:x: Your GitHub Username

Your GitHub Username does not seem to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

Sep 24 '25 03:09 CLAassistant

nice! looks pretty good, updated to merge with latest main. probably will make some other edits before merging: our logic for saving vf-eval JSON outputs has drifted a bit from make_dataset, and ideally we bring these back in sync so that intermediate saving handles vf-eval -s directly.

Sep 30 '25 07:09 willccbb