preference-based-reinforcement-learning topic

List preference-based-reinforcement-learning repositories

DPPO

36 stars, 1 fork

Official implementation of "Direct Preference-based Policy Optimization without Reward Modeling" (NeurIPS 2023)