reward-bench
RewardBench: the first evaluation tool for reward models.
Need to get legal approval to use the Gemini API, but in the meantime try this: https://openrouter.ai/docs#quick-start
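For reference, a minimal sketch of what that quick-start boils down to: OpenRouter exposes an OpenAI-compatible chat completions endpoint, so a plain HTTP call works. The Gemini model id below is just an example and may differ; check https://openrouter.ai/models for the current list.

```python
# Minimal sketch of the OpenRouter quick-start, assuming an OPENROUTER_API_KEY
# environment variable is set. The model id is an example, not a guarantee.
import os
import requests

response = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": "google/gemini-pro-1.5",  # example id; verify on openrouter.ai/models
        "messages": [{"role": "user", "content": "Which response is better? ..."}],
    },
    timeout=60,
)
print(response.json()["choices"][0]["message"]["content"])
```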
Currently `run_rm.py` only uses one GPU because RMs are generally not well supported for inference. The current implementation is a separate `run_rm_mpgu.py` script. We can delete this and improve the base...
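One possible direction for folding multi-GPU support into the base script is accelerate's `device_map="auto"` sharding. This is a rough sketch under that assumption, not the repo's current implementation, and the model id is a placeholder.

```python
# Sketch: shard a sequence-classification RM across all visible GPUs with
# device_map="auto" (requires accelerate). Not the repo's implementation.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "some-org/some-reward-model"  # placeholder model id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    device_map="auto",           # spread layers over available GPUs
    torch_dtype=torch.bfloat16,  # reduce per-GPU memory
)

inputs = tokenizer("example completion to score", return_tensors="pt").to(model.device)
with torch.no_grad():
    # assumes a single-scalar reward head (num_labels == 1)
    score = model(**inputs).logits.squeeze().item()
print(score)
```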
I evaluated gemma-2-27b-it `{'Chat': 0.8938547486033519, 'Chat Hard': 0.6085526315789473, 'Safety': 0.8867946647946647, 'Reasoning': 0.7705588066786708}` while...
I noticed that, for default (sequence classification) models with a chat template defined in the tokenizer, `scripts/run_rm.py` formats each conversation with `tokenizer.apply_chat_template` (via the function [`prepare_dialogue_from_tokenizer`](https://github.com/allenai/reward-bench/blob/bc72fb2a573fc31c614eef3405d354b398977b02/rewardbench/utils.py#L515)) and then uses the text...
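To make the formatting step above concrete, here is a minimal sketch of what that amounts to, using any tokenizer that ships a chat template (the model id and strings are just examples, not the repo's test data):

```python
# Sketch of the formatting step described above: render the full conversation
# (prompt + completion) with the tokenizer's own chat template, as
# prepare_dialogue_from_tokenizer does. Model id is only an example.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")

prompt = "What is the capital of France?"
chosen, rejected = "Paris.", "London."

text_chosen = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}, {"role": "assistant", "content": chosen}],
    tokenize=False,
)
text_rejected = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}, {"role": "assistant", "content": rejected}],
    tokenize=False,
)
print(text_chosen)
```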
Hi RewardBench Team, We have updated an 8B reward model (Custom Classifier) [general-preference/GPM-Llama-3.1-8B](https://huggingface.co/general-preference/GPM-Llama-3.1-8B) and a 2B reward model (Custom Classifier) [general-preference/GPM-Gemma-2B](https://huggingface.co/general-preference/GPM-Gemma-2B). Local evaluation results for our models are listed as...
Hi Nathan, I’m currently preparing to release a new repository that contains the code used in my paper. As part of our experiments, we made some slight modifications to the...
See @zankner's repo https://github.com/zankner/CLoud, RMs that think out loud!
Unfortunately also fails with:
```
Traceback (most recent call last):
  File "/weka/oe-adapt-default/nathanl/reward-bench/scripts/run_rm.py", line 401, in <module>
    main()
  File "/weka/oe-adapt-default/nathanl/reward-bench/scripts/run_rm.py", line 309, in main
    rewards_chosen = reward_pipe(batch["text_chosen"], **reward_pipeline_kwargs)
TypeError: list indices must...
```
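That `TypeError` usually means `batch` arrived as a list of example dicts rather than a dict of columns, so indexing it with a string key fails. A hedged sketch of the failure mode and one workaround; the variable names mirror the traceback, the data is illustrative:

```python
# Sketch: if the DataLoader yields a list of example dicts (e.g. with a
# pass-through collate_fn), indexing by column name raises
# "TypeError: list indices must be integers or slices, not str".
batch = [
    {"text_chosen": "good answer", "text_rejected": "bad answer"},
    {"text_chosen": "another good answer", "text_rejected": "another bad answer"},
]

# batch["text_chosen"]  # -> TypeError: list indices must be integers or slices, not str

# One workaround: pull the column out explicitly before calling the pipeline.
texts_chosen = [example["text_chosen"] for example in batch]
# rewards_chosen = reward_pipe(texts_chosen, **reward_pipeline_kwargs)
print(texts_chosen)
```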