reinforcement-learning-from-human-feedback topic
Okapi
Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback
safe-rlhf
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
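Safe RLHF casts alignment as constrained policy optimization: maximize the reward-model score while keeping a separate cost-model score under a threshold, typically via a Lagrangian dual. Below is a minimal sketch of the dual (multiplier) update under that framing; the function name, threshold, and learning rate are illustrative assumptions, not the repo's API.

```python
import torch

def lagrangian_penalty_update(reward, cost, lambda_param, cost_limit=0.0, lambda_lr=0.05):
    """One dual step of a Lagrangian-style constrained objective (illustrative sketch).

    reward, cost: mean reward-model / cost-model scores for the current batch.
    lambda_param: non-negative multiplier trading off reward against safety cost.
    Returns the policy objective to maximize and the updated multiplier.
    """
    # Policy objective: reward minus the weighted constraint violation.
    objective = reward - lambda_param * (cost - cost_limit)
    # Dual ascent on the multiplier: it grows while the cost constraint is violated.
    lambda_param = max(0.0, lambda_param + lambda_lr * (cost.item() - cost_limit))
    return objective, lambda_param

# Toy usage with fabricated batch statistics.
reward = torch.tensor(1.2)   # mean reward-model score
cost = torch.tensor(0.3)     # mean cost-model score (> 0 means the constraint is violated here)
obj, lam = lagrangian_penalty_update(reward, cost, lambda_param=1.0)
```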
alpaca_farm
A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.
minichatgpt
An annotated tutorial of the Hugging Face TRL repo for reinforcement learning from human feedback, connecting the PPO and GAE equations to the corresponding lines of code in the PyTorch implementation.
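Since this entry is about mapping the GAE equations onto code, here is a compact, self-contained PyTorch sketch of generalized advantage estimation for a single trajectory; the variable names are generic and not taken from TRL or minichatgpt.

```python
import torch

def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation for one trajectory (illustrative sketch).

    rewards: tensor of shape [T] with per-step rewards.
    values:  tensor of shape [T + 1] with value estimates V(s_0..s_T).
    Returns A_t = sum_k (gamma * lam)^k * delta_{t+k} and the value targets.
    """
    T = rewards.shape[0]
    advantages = torch.zeros(T)
    last_adv = 0.0
    for t in reversed(range(T)):
        # TD residual: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Recursive form of GAE: A_t = delta_t + gamma * lam * A_{t+1}
        last_adv = delta + gamma * lam * last_adv
        advantages[t] = last_adv
    # Targets for the value head: returns = advantages + V(s_t)
    returns = advantages + values[:-1]
    return advantages, returns

# Toy usage: sparse terminal reward, as is typical when the reward model scores the full response.
rewards = torch.tensor([0.0, 0.0, 1.0])
values = torch.tensor([0.1, 0.2, 0.4, 0.0])  # V(s_0..s_3), terminal value 0
adv, ret = gae_advantages(rewards, values)
```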
OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
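For the iterative DPO part of that description, here is a minimal sketch of the standard DPO loss on chosen/rejected preference pairs, assuming you already have per-sequence summed log-probabilities from the policy and a frozen reference model; this is the generic objective, not OpenRLHF's actual implementation.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for a batch of preference pairs (illustrative sketch).

    Each argument is a tensor of per-sequence summed log-probabilities.
    """
    # Implicit reward margins relative to the frozen reference model.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    # Push the policy to prefer the chosen response over the rejected one.
    logits = beta * (chosen_logratio - rejected_logratio)
    return -F.logsigmoid(logits).mean()

# Toy usage with fabricated log-probabilities for two preference pairs.
loss = dpo_loss(torch.tensor([-12.0, -9.5]), torch.tensor([-14.0, -9.0]),
                torch.tensor([-12.5, -10.0]), torch.tensor([-13.5, -9.2]))
```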
llm_optimization
A repo for RLHF training and best-of-N (BoN) sampling over LLMs, with support for reward model ensembles.
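Best-of-N sampling with a reward-model ensemble scores N candidate responses under each ensemble member and keeps the candidate with the best aggregated score. The sketch below uses mean-minus-std aggregation (a common hedge against reward hacking); the aggregation rule and all names are illustrative assumptions, not the repo's API.

```python
import torch

def best_of_n(candidates, reward_models, std_penalty=1.0):
    """Pick the best of N candidate responses under a reward-model ensemble (illustrative sketch).

    candidates: list of N response strings.
    reward_models: callables mapping a response to a scalar score.
    Aggregates by mean ensemble score minus a penalty on ensemble disagreement.
    """
    # scores[i, j] = score of candidate j under ensemble member i
    scores = torch.tensor([[rm(c) for c in candidates] for rm in reward_models])
    aggregated = scores.mean(dim=0) - std_penalty * scores.std(dim=0)
    best_idx = int(aggregated.argmax())
    return candidates[best_idx], aggregated[best_idx]

# Toy usage with stand-in reward models (simple length-based heuristics).
candidates = ["short answer", "a somewhat longer answer", "mid answer"]
ensemble = [lambda c: float(len(c)), lambda c: float(len(c.split()))]
best, score = best_of_n(candidates, ensemble)
```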
ReaLHF
Super-Efficient RLHF Training of LLMs with Parameter Reallocation