gail_gym
Implementation of Generative Adversarial Imitation Learning (GAIL) for classic environments from OpenAI Gym.
Generative Adversarial Imitation Learning for gym environments
gail-ppo-tf-gym
This repository provides a TensorFlow implementation of Generative Adversarial Imitation Learning (GAIL) and Behavioral Cloning (BC) for the classic CartPole-v0 environment from OpenAI Gym. It is based on "Generative Adversarial Imitation Learning" by Jonathan Ho and Stefano Ermon.
Dependencies
- Python: 3.5.2
- TensorFlow: 1.1.0
- gym: 0.9.3
Gym environment
- CartPole-v0
- State: Continuous
- Action: Discrete
Implementation of GAIL:
Step 1: Generate expert trajectory data
Proximal Policy Optimization (PPO), a reinforcement learning algorithm, is used to generate the expert trajectory data for the CartPole-v0 environment.
python3 run_ppo.py
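For orientation, the core of PPO is its clipped surrogate objective. The snippet below is a minimal pure-Python sketch of that objective over a batch of samples. The function name `ppo_clip_objective` and the default `clip_eps=0.2` are illustrative assumptions and are not taken from `run_ppo.py`.

```python
def ppo_clip_objective(ratios, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (a quantity to be maximized).

    ratios:     pi_new(a|s) / pi_old(a|s) for each sample
    advantages: advantage estimate for each sample
    clip_eps:   clipping range (assumed value, not from the repository)
    """
    total = 0.0
    for r, adv in zip(ratios, advantages):
        # Restrict the probability ratio to [1 - eps, 1 + eps].
        clipped_r = max(1.0 - clip_eps, min(r, 1.0 + clip_eps))
        # Take the more pessimistic of the clipped and unclipped terms.
        total += min(r * adv, clipped_r * adv)
    return total / len(ratios)
```

Clipping removes the incentive to move the new policy far from the one that collected the data, which is what lets PPO reuse each batch for several gradient steps.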
Step 2: Sample the expert trajectory data from the PPO-generated trajectories.
python3 sample_trajectory.py
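One plausible way to carry out this step, shown only as a sketch: flatten the recorded episodes into state-action pairs and draw a random subset. The data layout (a list of episodes, each a list of `(state, action)` tuples) and the helper name `sample_expert_pairs` are assumptions for illustration; `sample_trajectory.py` may work differently.

```python
import random

def sample_expert_pairs(episodes, n_samples, seed=0):
    """Draw n_samples (state, action) pairs uniformly, without replacement."""
    # Flatten all episodes into one list of (state, action) pairs.
    pairs = [pair for episode in episodes for pair in episode]
    if n_samples > len(pairs):
        raise ValueError("requested more samples than available pairs")
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    sampled = rng.sample(pairs, n_samples)
    states = [s for s, _ in sampled]
    actions = [a for _, a in sampled]
    return states, actions

# Toy data: two short episodes with 4-dimensional states and actions in {0, 1}.
episodes = [
    [((0.0, 0.1, 0.02, -0.1), 1), ((0.0, 0.2, 0.01, -0.2), 0)],
    [((0.1, 0.0, -0.03, 0.1), 0)],
]
states, actions = sample_expert_pairs(episodes, n_samples=2)
```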
Step 3.1: Run imitation learning with GAIL.
python3 run_gail.py
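As a rough illustration of what GAIL training optimizes, the sketch below computes a discriminator loss and the surrogate reward passed to the policy optimizer. It assumes one common convention: the discriminator output D(s, a) is the probability that a pair came from the expert, and the policy is rewarded with log D(s, a). The Ho and Ermon paper labels the two classes the other way around, and `run_gail.py` may use either form. The function names and `EPS` are hypothetical.

```python
import math

EPS = 1e-10  # keeps log() finite when the discriminator saturates (assumed value)

def discriminator_loss(d_expert, d_agent):
    """Binary cross-entropy with expert pairs labeled 1 and agent pairs labeled 0.

    d_expert: discriminator outputs D(s, a) on expert pairs
    d_agent:  discriminator outputs D(s, a) on pairs from the current policy
    """
    expert_term = sum(math.log(max(d, EPS)) for d in d_expert) / len(d_expert)
    agent_term = sum(math.log(max(1.0 - d, EPS)) for d in d_agent) / len(d_agent)
    return -(expert_term + agent_term)

def imitation_rewards(d_agent):
    """Surrogate reward per agent pair: larger when D mistakes it for expert data."""
    return [math.log(max(d, EPS)) for d in d_agent]
```

The policy never sees the environment reward during this step; it is trained with PPO on `imitation_rewards`, while the discriminator is updated in alternation to minimize `discriminator_loss`.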
Step 3.2: Run behavioral cloning.
python3 run_behavior_clone.py
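Behavioral cloning treats imitation as supervised learning: fit a policy that predicts the expert's action from the state. The sketch below does this with a plain logistic-regression policy on synthetic data; it is not the TensorFlow model used by `run_behavior_clone.py`. The toy "expert" rule, the function names, and the hyperparameters are all assumptions for illustration.

```python
import math
import random

def train_bc(states, actions, epochs=200, lr=0.5):
    """Fit P(action=1 | state) = sigmoid(w . state + b) by batch gradient descent."""
    dim = len(states[0])
    w, b = [0.0] * dim, 0.0
    n = len(states)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for s, a in zip(states, actions):
            logit = sum(wi * si for wi, si in zip(w, s)) + b
            p = 1.0 / (1.0 + math.exp(-logit))
            err = p - a  # gradient of the cross-entropy loss w.r.t. the logit
            for i in range(dim):
                grad_w[i] += err * s[i] / n
            grad_b += err / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def act(w, b, state):
    """Greedy action of the cloned policy."""
    return 1 if sum(wi * si for wi, si in zip(w, state)) + b > 0 else 0

# Synthetic stand-in for expert data (not real CartPole rollouts): the toy
# expert pushes right (1) when pole angle + angular velocity is positive.
rng = random.Random(0)
states = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(200)]
actions = [1 if s[2] + s[3] > 0 else 0 for s in states]

w, b = train_bc(states, actions)
accuracy = sum(act(w, b, s) == a for s, a in zip(states, actions)) / len(states)
```

Unlike GAIL, this needs no environment interaction during training, but the cloned policy only sees states the expert visited.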
Step 4: Test the trained GAIL policy.
python3 test_policy.py
TensorBoard plots:
[Figures: training and testing results for GAIL]
Note: To test the BC policy, pass the checkpoint number, that is, the number in the model.ckpt-number file name in the trained_models/bc directory.
For example, to test behavioral cloning:
python3 test_policy.py --alg=bc --model=1000
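To make the relationship between these flags and the checkpoint file concrete, here is a hedged sketch of how `--alg` and `--model` could be turned into a path such as `trained_models/bc/model.ckpt-1000`. The helper names, the defaults, and the `gail` choice are assumptions; the actual argument handling in `test_policy.py` may differ.

```python
import argparse

def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # Defaults and choices below are hypothetical, not read from the repository.
    parser.add_argument("--alg", default="gail", choices=["gail", "bc"],
                        help="which trained policy to test")
    parser.add_argument("--model", default="",
                        help="checkpoint number, e.g. 1000")
    return parser.parse_args(argv)

def checkpoint_path(args, root="trained_models"):
    """Build e.g. trained_models/bc/model.ckpt-1000 from the parsed flags."""
    name = "model.ckpt" + ("-" + args.model if args.model else "")
    return "/".join([root, args.alg, name])

args = parse_args(["--alg=bc", "--model=1000"])
path = checkpoint_path(args)
```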
gail-ppo-pytorch-gym
This repository provides a PyTorch implementation of Generative Adversarial Imitation Learning (GAIL) for the BipedalWalker-v2 environment from OpenAI Gym.
Gym environment
- BipedalWalker-v2
- State space (Continuous): (1) hull angle, (2) angular velocity, (3) horizontal speed, (4) vertical speed, (5) position of joints, (6) joint angular speeds, (7) leg contact with the ground, and (8) lidar rangefinder measurements
- Action: joint motor torques
PPO-generated expert trajectories:
Imitation learning based on GAIL