yrlu/reinforcement_learning: Implementation of selected reinforcement learning...

Implementations of Reinforcement Learning Algorithms in Python

Implementations of selected reinforcement learning algorithms in TensorFlow.

Implemented Algorithms

(Follow the links for more details)

Advanced

Asynchronous Advantage Actor-Critic (A3C)
Deep Deterministic Policy Gradient (DDPG)

Policy Gradient Methods

REINFORCE with policy function approximation
REINFORCE with baseline
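
As a rough illustration of REINFORCE with a baseline, the sketch below applies the update to a two-armed bandit, where each episode is a single step and the return is just the reward. It is not the repository's TensorFlow implementation: the function names, the softmax preference parameterisation, the running-average baseline and all hyperparameters are assumptions chosen to keep the example standard-library only.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw an index according to the given probabilities."""
    threshold, cumulative = rng.random(), 0.0
    for index, p in enumerate(probs):
        cumulative += p
        if threshold < cumulative:
            return index
    return len(probs) - 1


def reinforce_with_baseline(arm_rewards, steps=2000, lr=0.5,
                            baseline_lr=0.1, seed=0):
    """Hypothetical helper: learn a softmax policy over bandit arms."""
    rng = random.Random(seed)
    prefs = [0.0] * len(arm_rewards)
    baseline = 0.0  # running average of observed returns
    for _ in range(steps):
        probs = softmax(prefs)
        action = sample(probs, rng)
        ret = arm_rewards[action]       # one-step episode: return == reward
        advantage = ret - baseline      # the baseline reduces variance
        for a in range(len(prefs)):
            # Gradient of log softmax w.r.t. preference a.
            grad_log_pi = (1.0 if a == action else 0.0) - probs[a]
            prefs[a] += lr * advantage * grad_log_pi
        baseline += baseline_lr * (ret - baseline)
    return softmax(prefs)


final_probs = reinforce_with_baseline([0.0, 1.0])
```

With deterministic rewards of 0 and 1, the learned policy should put most of its probability on the second arm.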

Temporal Difference Learning

Standard epsilon-greedy Q-learning
Deep Q-learning
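
For orientation, tabular epsilon-greedy Q-learning can be sketched on a tiny four-state chain as follows. The environment, the function names and the hyperparameters are assumptions made for illustration; they are not taken from the repository, and the deep variant replaces the table with a neural network.

```python
import random

N_STATES = 4        # states 0..3; state 3 is terminal (assumed toy chain)
ACTIONS = (-1, +1)  # move left or right


def step(state, action):
    """Deterministic chain: reward 1.0 on entering the terminal state."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def epsilon_greedy(q_row, epsilon, rng):
    """Explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q_row.values())
    # Break ties at random so the untrained agent does not get stuck.
    return rng.choice([a for a in ACTIONS if q_row[a] == best])


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = epsilon_greedy(q[state], epsilon, rng)
            next_state, reward, done = step(state, action)
            # Off-policy target: bootstrap from the best next action.
            target = reward
            if not done:
                target += gamma * max(q[next_state].values())
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
```

After training, `q_table[s][+1]` should exceed `q_table[s][-1]` for every non-terminal state `s`, meaning the greedy policy walks right towards the goal.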

Monte Carlo Methods

Monte Carlo (MC) estimation of action values
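
A minimal sketch of first-visit Monte Carlo estimation of action values is shown below. It assumes episodes are already recorded as lists of (state, action, reward) tuples; that data layout and the function name are illustrative assumptions, not the repository's interface.

```python
from collections import defaultdict


def mc_action_values(episodes, gamma=1.0):
    """Average the first-visit discounted return of each (state, action)."""
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for episode in episodes:
        # Walk backwards to accumulate the discounted return at each step.
        g = 0.0
        step_returns = []
        for state, action, reward in reversed(episode):
            g = reward + gamma * g
            step_returns.append(((state, action), g))
        step_returns.reverse()
        # Only the first occurrence of a pair in an episode is counted.
        seen = set()
        for key, ret in step_returns:
            if key in seen:
                continue
            seen.add(key)
            returns_sum[key] += ret
            returns_count[key] += 1
    return {key: returns_sum[key] / returns_count[key] for key in returns_sum}


episodes = [
    [("s0", "right", 0.0), ("s1", "right", 1.0)],
    [("s0", "right", 0.0), ("s1", "left", 0.0),
     ("s0", "right", 0.0), ("s1", "right", 1.0)],
]
q_estimates = mc_action_values(episodes, gamma=0.5)
```

In the second episode the pair ("s0", "right") occurs twice, and only the return following its first occurrence enters the average.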

Dynamic Programming MDP Solver

Value iteration
Policy iteration - policy evaluation & policy improvement
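
To make the dynamic programming entry concrete, here is a hedged sketch of value iteration on a small deterministic chain MDP. The `value_iteration` signature and the `chain_transition` / `chain_reward` helpers are hypothetical names introduced for this example and do not mirror the repository's solver.

```python
def value_iteration(n_states, actions, transition, reward,
                    gamma=0.9, tol=1e-8):
    """Solve a deterministic MDP given transition(s, a) and reward(s, a)."""
    values = [0.0] * n_states

    def q_value(s, a):
        # One-step lookahead using the current value estimates.
        return reward(s, a) + gamma * values[transition(s, a)]

    while True:
        delta = 0.0
        for s in range(n_states):
            best = max(q_value(s, a) for a in actions)
            delta = max(delta, abs(best - values[s]))
            values[s] = best  # in-place (Gauss-Seidel style) update
        if delta < tol:
            break
    # Extract a greedy policy from the converged values.
    policy = [max(actions, key=lambda a: q_value(s, a))
              for s in range(n_states)]
    return values, policy


def chain_transition(s, a):
    """States 0..3; state 3 is absorbing, walls clamp the position."""
    return 3 if s == 3 else min(max(s + a, 0), 3)


def chain_reward(s, a):
    """Reward 1.0 for stepping from a non-terminal state into state 3."""
    return 1.0 if s != 3 and chain_transition(s, a) == 3 else 0.0


values, policy = value_iteration(4, (-1, +1), chain_transition, chain_reward)
```

With `gamma=0.9` the values should come out close to 0.81, 0.9, 1.0 and 0.0 for states 0 to 3, and the greedy policy moves right in every non-terminal state.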

Environments

envs/gridworld.py: minimal gridworld implementation for testing

Dependencies

Python 2.7
NumPy
TensorFlow 0.12.1
OpenAI Gym (with Atari) 0.8.0
matplotlib (optional)

Tests

Files: test_*.py
Run the unit tests for [class]:

python test_[class].py

MIT License