Tabular-RL-with-Python
Tabular Reinforcement Learning Algorithms with NumPy & Visualizations with Seaborn
Tabular Reinforcement Learning Algorithms with Python
Python implementations of the tabular RL algorithms in Sutton & Barto 2017 (Reinforcement Learning: An Introduction), using only NumPy and basic Python data structures (list, tuple, set, and dictionary) to build both the environment and the algorithms
Algorithms learn from the 4x4 Grid World environment (from Sutton & Barto 2017, p. 61)
Tabular Reinforcement Learning Algorithms with NumPy
Visualizations with Seaborn (Policy & Value function)
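To make the environment concrete, here is a minimal sketch of the 4x4 gridworld built from basic Python data structures only. It assumes the usual setup of the book's example: two terminal corners, four deterministic moves, a reward of -1 on every transition, and moves off the grid leaving the state unchanged. The names `ACTIONS`, `TERMINALS`, and `step` are illustrative assumptions, not this repository's API.

```python
# Illustrative sketch only: names and layout are assumptions, not the repo's code.
N = 4  # grid is N x N

# Each action maps to a (row, column) offset.
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

# Assumed terminal states: top-left and bottom-right corners.
TERMINALS = {(0, 0), (N - 1, N - 1)}


def step(state, action):
    """Return (next_state, reward) for one deterministic move."""
    if state in TERMINALS:
        # Terminal states absorb: no movement and no further reward.
        return state, 0
    d_row, d_col = ACTIONS[action]
    row, col = state[0] + d_row, state[1] + d_col
    if not (0 <= row < N and 0 <= col < N):
        # A move off the grid leaves the agent where it was.
        row, col = state
    return (row, col), -1


next_state, reward = step((0, 1), "left")
print(next_state, reward)
```

States are plain `(row, column)` tuples, so they can serve directly as dictionary keys in a tabular value function or policy.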
Contents
0. MDP Environment (Chapter 3, Sutton & Barto 2017)
- Introduction to gridworld environment
1. Dynamic Programming (Chapter 4, Sutton & Barto 2017)
- Policy Evaluation and Improvement
- Policy Iteration
- Value Iteration
2. Monte Carlo Methods (Chapter 5, Sutton & Barto 2017)
- Monte Carlo Prediction
- Monte Carlo Exploring Starts
- On-policy Monte Carlo
- Off-policy Monte Carlo
3. Temporal Difference Learning (Chapter 6, Sutton & Barto 2017)
- TD Prediction
- SARSA - On-policy Control
- Q-learning - Off-policy Control
- Double Q-learning - Off-policy Control
4. n-step Bootstrapping (Chapter 7, Sutton & Barto 2017)
- n-step TD Prediction
- n-step SARSA - On-policy Control
- n-step Off-policy learning by Importance Sampling
- n-step Off-policy learning without Importance Sampling
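As one illustration of the algorithms listed above, the following is a minimal sketch of iterative policy evaluation (the Dynamic Programming section) with NumPy. It assumes the 4x4 gridworld with two terminal corners, a reward of -1 per step, no discounting, and an equiprobable random policy. The function name `policy_evaluation` and its parameters `gamma` and `theta` are assumptions made for this example, not the repository's interface.

```python
import numpy as np

# Illustrative sketch only: names and defaults are assumptions, not the repo's code.
N = 4
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
TERMINALS = {(0, 0), (N - 1, N - 1)}


def policy_evaluation(gamma=1.0, theta=1e-6):
    """Evaluate the equiprobable random policy with synchronous sweeps."""
    V = np.zeros((N, N))
    while True:
        V_new = np.zeros_like(V)
        for row in range(N):
            for col in range(N):
                if (row, col) in TERMINALS:
                    continue  # terminal values stay at 0
                for d_row, d_col in MOVES:
                    n_row, n_col = row + d_row, col + d_col
                    if not (0 <= n_row < N and 0 <= n_col < N):
                        n_row, n_col = row, col  # bump into the wall
                    # Each action has probability 0.25; every step costs -1.
                    V_new[row, col] += 0.25 * (-1 + gamma * V[n_row, n_col])
        # Stop once the largest change in a sweep falls below theta.
        if np.max(np.abs(V_new - V)) < theta:
            return V_new
        V = V_new


V = policy_evaluation()
print(np.round(V, 1))
```

The resulting array can be passed to a heatmap for the kind of value-function visualization mentioned above. A greedy policy with respect to `V` would be the next step toward policy iteration.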