Reinforcement-Learning-Code
A repository of reinforcement learning algorithm implementations in PyTorch
Reinforcement Learning Code with PyTorch
Papers
- Deep Q-Network (DQN)
- Double DQN (DDQN)
- Advantage Actor-Critic (A2C)
- Asynchronous Advantage Actor-Critic (A3C)
- Deep Deterministic Policy Gradient (DDPG)
- Truncated Natural Policy Gradient (TNPG)
- Trust Region Policy Optimization (TRPO)
- Generalized Advantage Estimator (GAE)
- Proximal Policy Optimization (PPO)
- Soft Actor-Critic (SAC)
- Apprenticeship Learning via Inverse Reinforcement Learning (APP)
- Maximum Entropy Inverse Reinforcement Learning (MaxEnt)
- Generative Adversarial Imitation Learning (GAIL)
- Variational Adversarial Imitation Learning (VAIL)
Algorithms
01. Model-Free Reinforcement Learning
Deep Q-Network (DQN)
Double DQN (DDQN)
Advantage Actor-Critic (A2C)
Asynchronous Advantage Actor-Critic (A3C)
- CartPole (Classic control)
Deep Deterministic Policy Gradient (DDPG)
Truncated Natural Policy Gradient (TNPG)
Trust Region Policy Optimization (TRPO)
TRPO + Generalized Advantage Estimator (GAE)
Proximal Policy Optimization (PPO)
PPO + Generalized Advantage Estimator (GAE)
Soft Actor-Critic (SAC)
- Pendulum (Classic control)
- Hopper (MuJoCo)
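Several entries above (TRPO + GAE, PPO + GAE) pair a policy-gradient method with the Generalized Advantage Estimator. As a rough illustration of what that estimator computes, here is a minimal dependency-free sketch. The function name `compute_gae`, its arguments, and the default values of `gamma` and `lam` are assumptions made for this example and are not taken from this repository's code.

```python
def compute_gae(rewards, values, dones, gamma=0.99, lam=0.95):
    """Sketch of Generalized Advantage Estimation over one rollout.

    rewards: list of per-step rewards, length T
    values:  list of value estimates, length T + 1
             (the last entry is the bootstrap value of the final state)
    dones:   list of booleans, True where the episode ended at that step
    Returns (advantages, returns), both of length T.
    """
    T = len(rewards)
    advantages = [0.0] * T
    gae = 0.0
    # Walk backwards so each step can reuse the accumulated advantage
    # of the step after it.
    for t in reversed(range(T)):
        # mask is 0 at terminal steps, so nothing leaks across episodes
        mask = 0.0 if dones[t] else 1.0
        # one-step TD error: r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * values[t + 1] * mask - values[t]
        # exponentially weighted sum of TD errors with decay gamma * lam
        gae = delta + gamma * lam * mask * gae
        advantages[t] = gae
    # value targets: advantage plus the baseline it was measured against
    returns = [a + v for a, v in zip(advantages, values[:-1])]
    return advantages, returns


if __name__ == "__main__":
    adv, ret = compute_gae(
        rewards=[1.0, 1.0, 1.0],
        values=[0.0, 0.0, 0.0, 0.0],
        dones=[False, False, True],
        gamma=1.0,
        lam=1.0,
    )
    print(adv, ret)
```

With `lam=1.0` the estimate reduces to the discounted return minus the value baseline, and with `lam=0.0` it reduces to the one-step TD error. Values in between trade bias against variance.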
02. Inverse Reinforcement Learning
Apprenticeship Learning via Inverse Reinforcement Learning (APP)
Maximum Entropy Inverse Reinforcement Learning (MaxEnt)
Generative Adversarial Imitation Learning (GAIL)
Variational Adversarial Imitation Learning (VAIL)
Learning curves
CartPole

Pendulum
