FOR.ai Reinforcement Learning Codebase
Generic reinforcement learning codebase in TensorFlow

Modular codebase for reinforcement learning models training, testing and visualization.
Contributors: Bryan M. Li, Alexander Cowen-Rivers, Piotr Kozakowski, David Tao, Siddhartha Rao Kamalakara, Nitarshan Rajkumar, Hariharan Sezhiyan, Sicong Huang, Aidan N. Gomez
Features
- Agents: DQN, Vanilla Policy Gradient, DDPG, PPO
- Environments: OpenAI Gym, Atari, ProcGen
- Model-free asynchronous training (--num_workers)
- Memory replay: Simple, Proportional Prioritized Experience Replay
- Modularized:
  - hyper-parameters setting (--hparams)
  - action functions
  - compute gradient functions
  - advantage estimation
  - learning rate schemes
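As a rough illustration of the proportional variant of Prioritized Experience Replay listed above, the sketch below samples transitions with probability proportional to priority**alpha and refreshes priorities from TD errors. This is a minimal standalone sketch, not the codebase's own replay implementation; the class name and defaults are assumptions.

```python
import random


class ProportionalReplay:
    """Minimal proportional prioritized replay sketch (illustrative only)."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha      # how strongly priorities skew sampling
        self.buffer = []        # stored transitions
        self.priorities = []    # one priority per transition

    def add(self, transition, priority=1.0):
        if len(self.buffer) >= self.capacity:
            # drop the oldest entry when full
            self.buffer.pop(0)
            self.priorities.pop(0)
        self.buffer.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size):
        # sampling probability proportional to priority**alpha
        weights = [p ** self.alpha for p in self.priorities]
        idxs = random.choices(range(len(self.buffer)), weights=weights,
                              k=batch_size)
        return idxs, [self.buffer[i] for i in idxs]

    def update_priorities(self, idxs, td_errors, eps=1e-6):
        # |TD error| + eps keeps every transition sampleable
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + eps
```

A production implementation would typically use a sum-tree for O(log n) sampling instead of the O(n) list scan shown here.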
Examples of recorded environment runs for various RL agents.
| MountainCar-v0 | Pendulum-v0 | VideoPinball-v0 | procgen-coinrun-v0 |
|---|---|---|---|
| ![]() | ![]() | ![]() | ![]() |
Requirements
It is recommended to install the codebase in a virtual environment (virtualenv or conda).
Quick install
Configure use_gpu and (if on OSX) mac_package_manager (either macports or homebrew) params in setup.sh, then run it as
sh setup.sh
Manual setup
You need to install the following for your system:
- TensorFlow
- OpenAI Gym
- OpenAI Atari
- OpenAI ProcGen
- FFmpeg
- Additional python packages
pip install -r ../requirements.txt
Quick Start
# start training
python train.py --sys ... --hparams ... --output_dir ...
# run tensorboard
tensorboard --logdir ...
# test agent
python train.py --sys ... --hparams ... --output_dir ... --test_only --render
Hyper-parameters
Check available flags with --help, see defaults.py for default hyper-parameters, and see hparams/dqn.py for examples of agent-specific hyper-parameters.
- hparams: Which hparams to use, defined under rl/hparams.
- sys: Which system environment to use.
- env: Which RL environment to use.
- output_dir: The directory for model checkpoints and TensorBoard summaries.
- train_steps: Number of steps to train the agent.
- test_episodes: Number of episodes to test the agent.
- eval_episodes: Number of episodes to evaluate the agent.
- test_only: Test the agent without training.
- copies: Number of independent training/testing runs to do.
- render: Render game play.
- record_video: Record game play.
- num_workers: Number of workers.
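Putting the flags together, a full train/monitor/test cycle might look like the following. The flag values (dqn, CartPole-v0, the output path) are illustrative assumptions, not documented defaults; substitute your own sys, hparams, and env choices.

```shell
# train a DQN agent (flag values are examples, not defaults)
python train.py --hparams dqn --env CartPole-v0 --output_dir /tmp/rl-demo

# watch training curves
tensorboard --logdir /tmp/rl-demo

# test the trained agent with rendering
python train.py --hparams dqn --env CartPole-v0 --output_dir /tmp/rl-demo --test_only --render
```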
Documentation
More detailed documentation can be found here.
Contributing
We'd love to accept your contributions to this project. Please feel free to open an issue, or submit a pull request as necessary. Contact us at [email protected] about potential collaborations and joining FOR.ai.



