rl_learn

My reinforcement learning notes and study materials :book: (still being updated)

[WIP] A repository for studying reinforcement learning

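This is a collection of classic study materials, notes, and code that I gathered while learning reinforcement learning on my own, shared here for everyone.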

To view formulas directly in Markdown files on GitHub, I recommend installing the Chrome extension MathJax Plugin for Github.

Getting Started

  • Getting started guide

Course Notes

  • Study notes for David Silver's Reinforcement Learning course.

  • All slides (PPT) for the course

  • Study notes for Sutton's book Reinforcement Learning: An Introduction

    • 1. Introduction
    • 2. Multi-armed Bandits
    • 3. Finite Markov Decision Processes
    • 4. Dynamic Programming
    • 5. Monte Carlo Methods
    • 6. Temporal-Difference Learning
    • 7. n-step Bootstrapping
    • 8. Planning and Learning with Tabular Methods
    • 9. On-policy Prediction with Approximation
    • 10. On-policy Control with Approximation
    • 11. Off-policy Methods with Approximation
    • 12. Eligibility Traces
    • 13. Policy Gradient Methods
    • 14. Psychology
    • 15. Neuroscience
    • 16. Applications and Case Studies
    • 17. Frontiers
  • PDF versions of the book

    • 2017-6 draft
    • 2018 second edition

Experiments

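All experiment source code is in the lib directory and comes from dennybritz. On top of the original code, I have added a detailed introduction to the background of each experiment and a side-by-side mapping between the code and the formulas.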

  • Gridworld: covers Dynamic Programming on an MDP
  • Blackjack: covers Model-Free Monte Carlo Prediction and Control
  • Windy Gridworld: covers Model-Free Temporal-Difference On-Policy Control (SARSA)
  • Cliff Walking: covers Model-Free Temporal-Difference Off-Policy Control (Q-learning)
  • Mountain Car: covers Q-Learning with Linear Function Approximation, for cases where the Q-table is too large to handle (continuous state space)
  • Atari: covers Deep Q-Learning
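
To make the on-policy versus off-policy distinction in the Windy Gridworld and Cliff Walking entries concrete, here is a minimal tabular sketch of the two update rules. It is not taken from the lib directory. The function names (`make_q`, `sarsa_update`, `q_learning_update`), the state labels, and the default `alpha` and `gamma` values are hypothetical, chosen only for illustration, and terminal-state handling is omitted.

```python
from collections import defaultdict


def make_q(n_actions):
    """Hypothetical helper: Q-table mapping state -> list of action values."""
    return defaultdict(lambda: [0.0] * n_actions)


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.5, gamma=1.0):
    """On-policy TD update: bootstrap from the action actually chosen next."""
    td_target = r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (td_target - Q[s][a])


def q_learning_update(Q, s, a, r, s_next, alpha=0.5, gamma=1.0):
    """Off-policy TD update: bootstrap from the greedy action in s_next."""
    td_target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (td_target - Q[s][a])


# Same transition, same starting table, two different updates.
q_sarsa = make_q(2)
q_sarsa["s1"] = [0.0, 10.0]
# The behaviour policy happened to pick action 0 in s1 (value 0.0).
sarsa_update(q_sarsa, "s0", 0, r=-1.0, s_next="s1", a_next=0)

q_qlearn = make_q(2)
q_qlearn["s1"] = [0.0, 10.0]
# Q-learning ignores the action actually taken and uses max(Q["s1"]) = 10.0.
q_learning_update(q_qlearn, "s0", 0, r=-1.0, s_next="s1")

print(q_sarsa["s0"][0], q_qlearn["s0"][0])
```

The only difference between the two functions is the bootstrap term: SARSA uses the value of the next action the agent really takes, while Q-learning uses the best available value in the next state.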

Other important learning resources: