llm-aligment topic

List llm-aligment repositories

RewardModelingBeyondBradleyTerry

Stars: 70
Forks: 4
Watchers: 70

Official implementation of the ICLR 2025 paper "Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and Alternatives".