llm-aligment topic
llm-aligment repositories
RewardModelingBeyondBradleyTerry
Stars: 70
Forks: 4
Watchers: 70
Official implementation of the ICLR 2025 paper "Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and Alternatives".