
Experiment with human vs gpt4 data

Open · natolambert opened this issue 1 year ago · 1 comment

With the human data that AI2 has, or a dataset like no_robots, we could test whether a reward model (RM) prefers the human-written or the model-generated completion for the same prompt.
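A minimal sketch of how that comparison could be scored, assuming each example is a (prompt, human answer, model answer) triple and that the RM is exposed as a function returning a scalar score. The names `human_preference_rate` and `toy_score` are hypothetical and not part of reward-bench; `toy_score` is a stand-in so the snippet runs without a real model.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical example layout: (prompt, human_answer, model_answer)
Example = Tuple[str, str, str]


def human_preference_rate(
    examples: Iterable[Example],
    score: Callable[[str, str], float],
) -> float:
    """Fraction of prompts where the scorer rates the human answer higher.

    Ties count as half a win so the result stays symmetric around 0.5.
    """
    wins = 0.0
    total = 0
    for prompt, human_answer, model_answer in examples:
        human_score = score(prompt, human_answer)
        model_score = score(prompt, model_answer)
        if human_score > model_score:
            wins += 1.0
        elif human_score == model_score:
            wins += 0.5
        total += 1
    if total == 0:
        raise ValueError("no examples provided")
    return wins / total


def toy_score(prompt: str, answer: str) -> float:
    """Placeholder scorer (word count), NOT a real reward model."""
    return float(len(answer.split()))


examples = [
    ("What is 2+2?", "4", "The answer is 4."),
    ("Name a color.", "Blue is a color.", "Blue"),
]
rate = human_preference_rate(examples, toy_score)
print(f"human preferred on {rate:.0%} of prompts")
```

A rate near 0.5 would suggest the RM has no systematic preference; swapping `toy_score` for a real RM's scoring call is the only change needed to run this on actual data.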

natolambert, Feb 14 '24 02:02

Update: we should use an open-weights model to generate completions for private prompts; otherwise the company behind the API gets access to the closed test set prompts.

natolambert, Jul 06 '24 20:07