learning-from-human-feedback topic
learning-from-human-feedback repositories
chain-of-hindsight
214 stars, 17 forks
Chain-of-Hindsight, A Scalable RLHF Method
exact-optimization
45 stars, 0 forks
ICML 2024 - Official Repository for EXO: Towards Efficient Exact Optimization of Language Model Alignment