on-policy topic

List on-policy repositories

recurrent-ppo-truncated-bptt

107 stars, 14 forks

Baseline implementation of recurrent PPO using truncated BPTT

episodic-transformer-memory-ppo

147 stars, 17 forks

Clean baseline implementation of PPO using an episodic TransformerXL memory

reinforcement_learning_v_mpo

16 stars, 1 fork

Deep reinforcement learning using V-MPO, an on-policy adaptation of Maximum a Posteriori Policy Optimization (MPO)

reinforcement_learning_truly_ppo

17 stars, 1 fork

Deep reinforcement learning using Truly Proximal Policy Optimization, implemented in TensorFlow 2 and PyTorch