learning-from-human-feedback topic

List learning-from-human-feedback repositories

214

Chain-of-Hindsight, A Scalable RLHF Method

ICML 2024 - Official Repository for EXO: Towards Efficient Exact Optimization of Language Model Alignment