DeepRL-pytorch
PyTorch implementations of various deep reinforcement learning algorithms on pybullet environments.
Project Deprecated
Please note that there will not be any updates to this project in the foreseeable future, so please do not open issues on this repo expecting a fix or an explanation. Some of the libraries have had breaking updates (e.g. gym), and since my requirements.txt did not pin version requirements, it is pretty much impossible to reproduce the experiments now. However, there is still value in looking at the implementations of the various RL algorithms.
Please consider forking this project if you want to continue working on it and add support for newer environments and libraries.
Deep RL policies on Pybullet Environments
This repo contains PyTorch implementations of various deep RL algorithms, trained and evaluated on pybullet robotic environments.
Dependencies:
- CUDA >= 10.2
- RLBench
Implemented Algorithms:
Name | Discrete actions | Continuous actions | Stochastic policy | Deterministic policy |
---|---|---|---|---|
DDPG | :x: | :heavy_check_mark: | :x: | :heavy_check_mark: |
TD3 | :x: | :heavy_check_mark: | :x: | :heavy_check_mark: |
TRPO | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :x: |
PPO | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :x: |
Option-Critic | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :x: |
DAC_PPO | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :x: |
Environments Supported
The following gym environments are supported by this repo; a creation sketch follows the list.
- OpenAI gym environments
- Pybullet gym environments
- RLBench gym environments
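As a rough sketch of how these three families are typically instantiated (the environment ids below are illustrative; `pybullet_envs` and `rlbench.gym` are the standard registration imports of those libraries, not code from this repo):

```python
import gym

# Importing these modules registers their environments with gym
# (standard behaviour of the pybullet and RLBench packages).
import pybullet_envs  # registers the *BulletEnv-v0 tasks
import rlbench.gym    # registers RLBench tasks such as reach_target-vision-v0

env = gym.make("CartPole-v1")             # plain OpenAI gym environment
env = gym.make("HopperBulletEnv-v0")      # pybullet gym environment
env = gym.make("reach_target-vision-v0")  # RLBench gym environment
```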
Types of Networks Implemented:
- Multi-Layered Perceptron (MLP)
- Convolutional Neural Network (CNN)
- Variational Autoencoder (VAE)

Network parameters:
- `hidden_sizes` is a list of the number of neurons in each dense layer of the MLP.
- `conv_layer_sizes` is a list containing the parameters of each convolutional layer, i.e. `[output_channel, kernel_size, stride]`.
To use the MLP network, set `ac_kwargs['model_type']` to `'mlp'`:

```json
"ac_kwargs": {
    "model_type": "mlp",
    "hidden_sizes": [256, 256]
}
```
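As a rough illustration of what that config implies (this helper is a hypothetical sketch, not the repo's actual model-building code), `hidden_sizes: [256, 256]` corresponds to two 256-unit dense layers between the observation and action dimensions:

```python
import torch.nn as nn

def build_mlp(obs_dim, act_dim, hidden_sizes, activation=nn.ReLU):
    """Hypothetical helper: stack one dense layer per entry in hidden_sizes."""
    sizes = [obs_dim] + list(hidden_sizes) + [act_dim]
    layers = []
    for i in range(len(sizes) - 1):
        layers.append(nn.Linear(sizes[i], sizes[i + 1]))
        if i < len(sizes) - 2:  # no activation after the output layer
            layers.append(activation())
    return nn.Sequential(*layers)

# "hidden_sizes": [256, 256] gives obs_dim -> 256 -> 256 -> act_dim
net = build_mlp(obs_dim=8, act_dim=2, hidden_sizes=[256, 256])
```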
To use the CNN network, set `ac_kwargs['model_type']` to `'cnn'`:

```json
"ac_kwargs": {
    "model_type": "cnn",
    "hidden_sizes": [512, 256],
    "conv_layer_sizes": [[16, 5, 2],
                         [32, 5, 2],
                         [64, 5, 2],
                         [64, 3, 1]]
}
```
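To make the `conv_layer_sizes` format concrete, here is a hypothetical sketch (not the repo's code) of how each `[output_channel, kernel_size, stride]` entry could become a convolutional layer:

```python
import torch.nn as nn

def build_cnn(in_channels, conv_layer_sizes):
    """Hypothetical helper: one Conv2d + ReLU per [out_ch, kernel, stride] entry."""
    layers = []
    for out_channels, kernel_size, stride in conv_layer_sizes:
        layers.append(nn.Conv2d(in_channels, out_channels, kernel_size, stride))
        layers.append(nn.ReLU())
        in_channels = out_channels  # next layer consumes this layer's channels
    return nn.Sequential(*layers)

# [[16, 5, 2], [32, 5, 2], [64, 5, 2], [64, 3, 1]] stacks four conv layers;
# the flattened conv output would then feed the dense hidden_sizes layers.
cnn = build_cnn(3, [[16, 5, 2], [32, 5, 2], [64, 5, 2], [64, 3, 1]])
```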
To use the VAE network, set `ac_kwargs['model_type']` to `'vae'`:

```json
"ac_kwargs": {
    "model_type": "vae",
    "vae_weights_path": "VAE/output/vae_reach_target-vision-v0_wrist_rgb.pth",
    "hidden_sizes": [512, 256]
}
```
VAE network
The VAE network needs to be pretrained on images from the environment before it is used in the RL algorithm. The data generation and training code are provided in the VAE directory.
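For intuition, here is a hypothetical, self-contained sketch of how a pretrained VAE encoder can serve as a frozen feature extractor for the policy. The class below is a stand-in, not the actual model in the VAE directory, so the real checkpoint would not load into it:

```python
import torch
import torch.nn as nn

class TinyVAEEncoder(nn.Module):
    """Hypothetical stand-in for the encoder half of the repo's VAE."""
    def __init__(self, in_channels=3, latent_dim=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.mu = nn.LazyLinear(latent_dim)      # latent mean head
        self.logvar = nn.LazyLinear(latent_dim)  # latent log-variance head

    def forward(self, x):
        h = self.conv(x)
        return self.mu(h), self.logvar(h)

encoder = TinyVAEEncoder()
# Loading the real checkpoint would need the repo's actual VAE class:
# encoder.load_state_dict(torch.load("VAE/output/vae_reach_target-vision-v0_wrist_rgb.pth"))
encoder.eval()
with torch.no_grad():
    mu, logvar = encoder(torch.zeros(1, 3, 64, 64))  # dummy image batch
features = mu  # the latent mean feeds the dense hidden_sizes layers
```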
Comparison of results in PyBullet Environments
Environment | Learning Curve | Episode Recording |
---|---|---|
CartPoleContinuousBulletEnv-v0 | *(learning curve plot)* | *(episode recording)* |
HopperBulletEnv-v0 | *(learning curve plot)* | *(episode recording)* |
AntBulletEnv-v0 | *(learning curve plot)* | *(episode recording)* |
HalfCheetahBulletEnv-v0 | *(learning curve plot)* | *(episode recording)* |
Results of Option-Critic on RLBench Environments
The agents are trained on the front-rgb camera view to solve the RLBench Manipulation Tasks.
Environment | Learning Curve | Episode Recording |
---|---|---|
open-box | *(learning curve plot)* | *(episode recording)* |
close-box | *(learning curve plot)* | *(episode recording)* |
How to use
- Clone this repo
- `pip install -r requirements.txt`
Training a model for an OpenAI gym environment
- Edit the training parameters in `./Algorithms/<algo>/<algo>_config.json`, where `<algo>` is one of the implemented agents
- Run `python train.py`:
```
usage: train.py [-h] [--env ENV] [--agent {ddpg,trpo,ppo,td3,random}]
                [--arch {mlp,cnn}] --timesteps TIMESTEPS [--seed SEED]
                [--num_trials NUM_TRIALS] [--normalize] [--rlbench] [--image]

optional arguments:
  -h, --help            show this help message and exit
  --env ENV             environment_id
  --agent {ddpg,trpo,ppo,td3,random}
                        specify type of agent
  --arch {mlp,cnn}      specify architecture of neural net
  --timesteps TIMESTEPS
                        specify number of timesteps to train for
  --seed SEED           seed number for reproducibility
  --num_trials NUM_TRIALS
                        number of times to train the algo
  --normalize           if true, normalize environment observations
  --rlbench             if true, use rlbench environment wrappers
  --image               if true, use image observations
```
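For example, to train PPO on one of the pybullet tasks from the results table (the flag values here are illustrative):

```
python train.py --env HopperBulletEnv-v0 --agent ppo --arch mlp --timesteps 1000000 --seed 0
```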
Testing trained model performance
- Run `python test.py`:
```
usage: test.py [-h] [--env ENV] [--agent {ddpg,trpo,ppo,td3,random}]
               [--arch {mlp,cnn}] [--render] [--gif] [--timesteps TIMESTEPS]
               [--seed SEED] [--normalize] [--rlbench] [--image]

optional arguments:
  -h, --help            show this help message and exit
  --env ENV             environment_id
  --agent {ddpg,trpo,ppo,td3,random}
                        specify type of agent
  --arch {mlp,cnn}      specify architecture of neural net
  --render              if true, display human renders of the environment
  --gif                 if true, make gif of the trained agent
  --timesteps TIMESTEPS
                        specify number of timesteps to test for
  --seed SEED           seed number for reproducibility
  --normalize           if true, normalize environment observations
  --rlbench             if true, use rlbench environment wrappers
  --image               if true, use image observations
```
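For example, to watch and record an agent trained with the command above (flag values are illustrative):

```
python test.py --env HopperBulletEnv-v0 --agent ppo --arch mlp --render --gif
```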