preference-based-reinforcement-learning topic
preference-based-reinforcement-learning repositories
DPPO
36 stars, 1 fork
Official implementation of "Direct Preference-based Policy Optimization without Reward Modeling" (NeurIPS 2023)