reinforcement-learning-from-human-feedback topic

Repositories tagged with the reinforcement-learning-from-human-feedback topic

Okapi

90 Stars · 2 Forks

Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback

safe-rlhf

1.3k Stars · 119 Forks

Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
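Safe RLHF frames alignment as maximizing a learned reward subject to a constraint on a separately learned cost model, typically solved with a Lagrange multiplier. The sketch below illustrates that Lagrangian objective only; the function and argument names are illustrative assumptions, not the repo's API, and the surrounding PPO machinery is omitted.

```python
import torch

def safe_rlhf_objective(reward, cost, log_lambda, cost_limit=0.0):
    """Lagrangian form of a constrained RLHF objective:
    maximize E[reward] subject to E[cost] <= cost_limit.

    reward, cost: tensors of shape (B,) from separate reward and cost models
    log_lambda:   learnable scalar parameter (log of the Lagrange multiplier)
    Returns a policy loss and a multiplier loss, both to be minimized.
    """
    lam = log_lambda.exp()
    constraint = cost.mean() - cost_limit  # > 0 means the constraint is violated
    # Policy maximizes reward penalized by the (detached) multiplier times the violation
    policy_loss = -(reward.mean() - lam.detach() * constraint)
    # Minimizing this performs gradient ascent on lambda when the constraint is violated
    lambda_loss = -(lam * constraint.detach())
    return policy_loss, lambda_loss
```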

alpaca_farm

766 Stars · 60 Forks

A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.
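The idea of developing RLHF methods without human data rests on simulated preference annotators. A minimal sketch of that pattern is shown below; `judge_llm` is a hypothetical callable standing in for an API-backed judge, and the label-noise rate is an illustrative assumption rather than the framework's setting.

```python
import random

def simulate_pairwise_preference(prompt, response_a, response_b, judge_llm):
    """Label a preference pair with a simulated (LLM) annotator instead of a human.

    judge_llm: hypothetical callable that returns 'A' or 'B' for a comparison prompt.
    Returns 0 if response_a is preferred, 1 if response_b is preferred.
    """
    query = (
        f"Instruction:\n{prompt}\n\n"
        f"Response A:\n{response_a}\n\n"
        f"Response B:\n{response_b}\n\n"
        "Which response is better? Answer with 'A' or 'B'."
    )
    choice = judge_llm(query)
    # Flip a small fraction of labels to mimic human annotator disagreement
    if random.random() < 0.1:
        choice = "B" if choice == "A" else "A"
    return 0 if choice == "A" else 1
```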

minichatgpt

18 Stars · 1 Fork

An annotated tutorial of the Hugging Face TRL repo for reinforcement learning from human feedback, connecting the PPO and GAE equations to the corresponding lines of code in the PyTorch implementation.
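The GAE recursion the tutorial walks through is compact enough to show directly. Below is a minimal, self-contained sketch of Generalized Advantage Estimation; the function and variable names are my own and do not correspond to TRL's internals.

```python
import torch

def compute_gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over a single trajectory.

    rewards: tensor of shape (T,) with per-step rewards
    values:  tensor of shape (T + 1,) with value estimates V(s_0), ..., V(s_T)
    Implements the backward recursion
      delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
      A_t     = delta_t + gamma * lam * A_{t+1}
    """
    T = rewards.shape[0]
    advantages = torch.zeros(T)
    gae = 0.0
    for t in reversed(range(T)):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        gae = delta + gamma * lam * gae
        advantages[t] = gae
    returns = advantages + values[:-1]  # regression targets for the value head
    return advantages, returns
```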

OpenRLHF

2.1k Stars · 206 Forks

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
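Alongside PPO, the framework lists iterative DPO among its training modes. For orientation, here is a minimal sketch of the standard DPO loss on a batch of preference pairs; the signature is illustrative and not OpenRLHF's API.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss.

    Each argument is a tensor of shape (B,) holding the summed log-probability
    of the chosen/rejected response under the trainable policy or the frozen
    reference model.
    """
    # Implicit rewards: scaled log-ratios of policy to reference
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between chosen and rejected implicit rewards
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```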

CodeUltraFeedback

72 Stars · 5 Forks · 72 Watchers

CodeUltraFeedback: aligning large language models to coding preferences (TOSEM 2025)

llm_optimization

28 Stars · 2 Forks

A repo for RLHF training and best-of-n (BoN) sampling over LLMs, with support for reward model ensembles.
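As a quick illustration of combining best-of-n sampling with a reward model ensemble, the sketch below scores each candidate under every ensemble member and picks the winner under a conservative aggregate (the minimum), a common way to reduce reward over-optimization. Names and signatures are assumptions for illustration, not the repo's API.

```python
import torch

def best_of_n(candidates, reward_models, aggregate="min"):
    """Select one response from n sampled candidates using an ensemble of reward models.

    candidates:    list of n strings sampled from the policy for one prompt
    reward_models: list of callables, each mapping a string to a scalar score
    aggregate:     'min' (conservative) or 'mean' over the ensemble scores
    """
    # scores[i, j] = score of candidate j under reward model i
    scores = torch.tensor([[rm(c) for c in candidates] for rm in reward_models])
    if aggregate == "min":
        combined = scores.min(dim=0).values
    else:
        combined = scores.mean(dim=0)
    return candidates[combined.argmax().item()]
```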

ReaLHF

95 Stars · 4 Forks

Super-Efficient RLHF Training of LLMs with Parameter Reallocation