policy-optimization topic

List policy-optimization repositories

POP3D

44 stars, 2 forks

Policy Optimization with Penalized Point Probability Distance: an Alternative to Proximal Policy Optimization

car-racing-ppo

40 stars, 6 forks

Implementation of Proximal Policy Optimization (SOTA), a deep reinforcement learning algorithm, on a continuous-action-space OpenAI Gym environment (Box2D/Car Racing v0)

policy_optimization

23 stars, 3 forks

Code for the paper "Policy Optimization in RLHF: The Impact of Out-of-preference Data"