RLHFlow

Results: 3 repositories owned by RLHFlow

RLHF-Reward-Modeling

738 stars, 62 forks

Recipes for training reward models for RLHF.

Online-RLHF

381 stars, 44 forks

A recipe for online RLHF and online iterative DPO.