Reinforcement-Learning-on-FrozenLake
Reinforcement Learning Algorithms in a simple Gridworld
Reinforcement Learning on FrozenLake is a collection of Jupyter notebooks for learning and trying out basic reinforcement learning algorithms.
This repo is written for people who want to quickly learn the basic concepts of reinforcement learning through code.
💡 Features
- Easy explanation of RL concepts
- Interactive RL algorithms
- You can also run RL algorithms in FrozenLake-v1, a Gymnasium (formerly OpenAI Gym) environment, with customizable hyperparameters.
- The environment wrapper can render not only the environment but also the algorithm's state-action values or learned model.
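To give a feel for what "running an RL algorithm with customizable hyperparameters" means, here is a minimal sketch of tabular Q-learning. It is not the code from the notebooks: it assumes a hand-coded, deterministic (non-slippery) copy of the default 4x4 FrozenLake map instead of the Gymnasium environment, and the names `step`, `q_learning`, and `greedy_rollout` as well as the default hyperparameter values are illustrative choices only.

```python
import random

# Assumed stand-in for FrozenLake-v1: the default 4x4 layout, without slipping.
# S = start, F = frozen, H = hole, G = goal.
MAP = ["SFFF", "FHFH", "FFFH", "HFFG"]
N = 4
# Action ids follow FrozenLake-v1: 0 = left, 1 = down, 2 = right, 3 = up.
MOVES = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}


def step(state, action):
    """Deterministic transition: returns (next_state, reward, done)."""
    row, col = divmod(state, N)
    d_row, d_col = MOVES[action]
    row = min(max(row + d_row, 0), N - 1)  # moves into a wall leave the agent in place
    col = min(max(col + d_col, 0), N - 1)
    tile = MAP[row][col]
    return row * N + col, float(tile == "G"), tile in "HG"


def q_learning(episodes=2000, alpha=0.1, gamma=0.99, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * 4 for _ in range(N * N)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.randrange(4)  # explore
            else:
                best = max(q[state])  # exploit, breaking ties at random
                action = rng.choice([a for a in range(4) if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Off-policy target: bootstrap from the best action in the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_rollout(q, max_steps=100):
    """Follow the greedy policy from the start; returns (visited states, final reward)."""
    state, path = 0, [0]
    for _ in range(max_steps):
        action = max(range(4), key=lambda a: q[state][a])
        state, reward, done = step(state, action)
        path.append(state)
        if done:
            return path, reward
    return path, 0.0


if __name__ == "__main__":
    q_table = q_learning(alpha=0.1, gamma=0.99, epsilon=0.1)
    print(greedy_rollout(q_table))
```

Changing `alpha`, `gamma`, or `epsilon` in the call to `q_learning` is the kind of experiment the notebooks are meant to make easy; the slippery Gymnasium version of the environment is harder and typically needs more episodes.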
☝️ Requirements
- python >= 3.6
- gymnasium >= 0.26.1
- pygame >= 2.3.1
- tensorflow >= 2.8 (for Chapter 7)
- tensorflow-probability >= 0.22.1 (for Chapter 7)
- ipykernel
- ipython
- numpy
- matplotlib
You can install the requirements by using Poetry.
git clone https://github.com/moripiri/Reinforcement-Learning-on-FrozenLake.git
cd Reinforcement-Learning-on-FrozenLake
poetry install  # add --with ch7 to install the optional TensorFlow dependencies required by chapter7.ipynb
poetry run python -m ipykernel install --user --name [virtualEnv] --display-name "[displayKernelName]"
Here, [virtualEnv] is the name of the Python environment (e.g. rl-introduction-py3.9) and [displayKernelName] is the Jupyter kernel name you want (e.g. frozenlake).
Then run
poetry run jupyter notebook
to open the notebooks.
☝️ Running in Google Colab
To run any of the notebooks in Google Colab, run the following commands first.
!git clone https://github.com/moripiri/Reinforcement-Learning-on-FrozenLake.git
%cd Reinforcement-Learning-on-FrozenLake/
!pip install gymnasium[classic_control]==0.26.3
📖 Contents
Chapter1: Introduction to Reinforcement Learning
Chapter2: Markov Decision Processes
Chapter3: Dynamic Programming
- Policy iteration
- Value iteration
Chapter4: Model-Free Prediction
- Monte-Carlo Prediction
- TD(0)
Chapter5: Model-Free Control
- On-policy Monte-Carlo Control
- SARSA
- Q-learning
Chapter6: Eligibility Traces
- SARSA(λ)
Chapter7: Policy Gradient Methods
- REINFORCE
- Actor-Critic
Chapter8: Integrating Learning and Planning
- Dyna-Q
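As a taste of the Chapter 3 material listed above, the following is a minimal sketch of value iteration. It is not taken from the notebooks: it assumes a hand-coded, deterministic (non-slippery) copy of the default 4x4 FrozenLake map, and the names `step` and `value_iteration` and the defaults `gamma=0.9` and `theta=1e-8` are illustrative assumptions.

```python
# Assumed stand-in for FrozenLake-v1: the default 4x4 layout, without slipping.
MAP = ["SFFF", "FHFH", "FFFH", "HFFG"]
N = 4
# Action ids follow FrozenLake-v1: 0 = left, 1 = down, 2 = right, 3 = up.
MOVES = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}


def step(state, action):
    """Known deterministic model: returns (next_state, reward, done)."""
    row, col = divmod(state, N)
    d_row, d_col = MOVES[action]
    row = min(max(row + d_row, 0), N - 1)
    col = min(max(col + d_col, 0), N - 1)
    tile = MAP[row][col]
    return row * N + col, float(tile == "G"), tile in "HG"


def value_iteration(gamma=0.9, theta=1e-8):
    """Sweep the Bellman optimality backup until the largest change is below theta."""
    values = [0.0] * (N * N)
    while True:
        delta = 0.0
        for state in range(N * N):
            if MAP[state // N][state % N] in "HG":
                continue  # terminal states keep value 0
            outcomes = (step(state, action) for action in range(4))
            best = max(
                reward + (0.0 if done else gamma * values[next_state])
                for next_state, reward, done in outcomes
            )
            delta = max(delta, abs(best - values[state]))
            values[state] = best  # in-place update
        if delta < theta:
            return values


if __name__ == "__main__":
    state_values = value_iteration()
    for row in range(N):
        print(" ".join(f"{state_values[row * N + col]:.3f}" for col in range(N)))
```

Unlike the model-free methods in Chapters 4 to 6, this relies on knowing the transition model (`step`) in advance, which is what makes it a dynamic programming method.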