Practical_RL

A course in reinforcement learning in the wild

44 Practical_RL issues, sorted by recently updated

* https://www.coursera.org/learn/practical-rl/discussions/all/threads/c9NrSHqOEeiv7hJZeB6Vjg
* https://www.coursera.org/learn/practical-rl/discussions/all/threads/gYZfgB3kS-OGX4Ad5PvjfA

priority:high

We all know that while RL is great at video games ~~when you throw a lot of duct tape and GPUs at it~~, it isn't quite there for the real...

help wanted
new content

Currently, `week5_policy_based/practice_a3c.ipynb` has numerous problems:

* It does not implement A3C; it is a plain actor-critic.
* We only have it in Tensorflow, since it does not have a corresponding...

coursera
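
For the record, the piece that turns plain advantage actor-critic into A3C is the asynchronous part: several workers, each with its own environment and local network copy, pushing gradient updates into one shared model without locks. Below is a rough structural sketch of that, not the course solution: it uses PyTorch rather than Tensorflow, threads instead of processes for brevity, and assumes the classic gym `reset`/`step` API.

```
import threading

import gym
import torch
import torch.nn as nn
import torch.nn.functional as F


class ActorCritic(nn.Module):
    """Shared torso with separate policy and value heads."""
    def __init__(self, n_obs, n_actions):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(n_obs, 128), nn.ReLU())
        self.policy_head = nn.Linear(128, n_actions)
        self.value_head = nn.Linear(128, 1)

    def forward(self, obs):
        hidden = self.body(obs)
        return self.policy_head(hidden), self.value_head(hidden)


shared_model = ActorCritic(n_obs=4, n_actions=2)      # CartPole sizes, for illustration
optimizer = torch.optim.Adam(shared_model.parameters(), lr=1e-3)
gamma = 0.99


def worker(n_episodes=200):
    env = gym.make("CartPole-v1")                     # each worker owns an environment
    local_model = ActorCritic(n_obs=4, n_actions=2)   # worker-local copy, as in A3C
    for _ in range(n_episodes):
        local_model.load_state_dict(shared_model.state_dict())  # sync from shared weights
        obs, done = env.reset(), False
        log_probs, values, rewards = [], [], []
        while not done:
            logits, value = local_model(torch.as_tensor(obs, dtype=torch.float32))
            dist = torch.distributions.Categorical(logits=logits)
            action = dist.sample()
            obs, reward, done, _ = env.step(action.item())
            log_probs.append(dist.log_prob(action))
            values.append(value.squeeze())
            rewards.append(reward)

        # discounted returns for the finished rollout
        returns, running = [], 0.0
        for r in reversed(rewards):
            running = r + gamma * running
            returns.append(running)
        returns = torch.tensor(returns[::-1])
        values = torch.stack(values)
        advantages = returns - values.detach()

        # joint actor + critic loss, computed on the local copy
        loss = -(torch.stack(log_probs) * advantages).mean() + F.mse_loss(values, returns)
        local_model.zero_grad()
        loss.backward()

        # hand the local gradients to the shared model and update it without locks
        for shared_p, local_p in zip(shared_model.parameters(), local_model.parameters()):
            shared_p.grad = local_p.grad
        optimizer.step()


workers = [threading.Thread(target=worker) for _ in range(4)]
for t in workers:
    t.start()
for t in workers:
    t.join()
```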

Not really sure how to do that at our scale.

* There is https://jupyter-contrib-nbextensions.readthedocs.io/en/latest/nbextensions/spellchecker/README.html, but it only works when you edit the cell, so it's a lot of manual work....

help wanted
maintenance
text in notebooks
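
One way to avoid checking cells by hand could be to extract the markdown text from the notebooks in bulk and pipe it through a command-line checker. A rough sketch, where the glob pattern and the choice of `aspell` are assumptions rather than an agreed approach:

```
import glob
import json
import subprocess


def markdown_text(notebook_path):
    """Concatenate the source of all markdown cells in a notebook."""
    with open(notebook_path, encoding="utf-8") as f:
        nb = json.load(f)
    return "\n\n".join("".join(cell.get("source", []))
                       for cell in nb.get("cells", [])
                       if cell.get("cell_type") == "markdown")


for path in sorted(glob.glob("week*/**/*.ipynb", recursive=True)):
    # `aspell list` reads text on stdin and prints the words it considers misspelled
    result = subprocess.run(["aspell", "list"], input=markdown_text(path),
                            capture_output=True, text=True)
    misspelled = sorted(set(result.stdout.split()))
    if misspelled:
        print(path, "->", ", ".join(misspelled))
```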

Currently we evaluate submissions in the notebooks themselves, and only submit the score. The grader should do the evaluation; preferably, this should be outside of the user-modifiable code entirely.

coursera
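
For illustration, this direction would mean the notebook submits only an artifact (say, a pickled policy or action function) and the grader replays it itself, roughly along these lines. `grade_submission`, the CartPole environment and the episode count are hypothetical, not the actual Coursera grader interface, and the classic gym API is assumed:

```
import gym
import numpy as np


def grade_submission(policy, env_name="CartPole-v1", n_episodes=100, seed=0):
    """Replay the submitted policy in a grader-controlled environment."""
    env = gym.make(env_name)
    env.seed(seed)                      # the grader, not the student, fixes the seed
    episode_rewards = []
    for _ in range(n_episodes):
        obs, done, total = env.reset(), False, 0.0
        while not done:
            action = policy(obs)        # the only piece the student provides
            obs, reward, done, _ = env.step(action)
            total += reward
        episode_rewards.append(total)
    return float(np.mean(episode_rewards))
```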

We have a bunch of `atari_util.py` variations:

```
master:week04_approx_rl/atari_wrappers.py
master:week06_policy_based/atari_wrappers.py
master:week08_pomdp/atari_util.py
coursera:week4_approx/atari_util.py
coursera:week5_policy_based/atari_util.py
```

Things can be simplified if we instead reuse https://github.com/hill-a/stable-baselines/blob/master/stable_baselines/common/atari_wrappers.py.

maintenance
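
For comparison, reusing the stable-baselines stack would look roughly like this in the notebooks; the exact wrapper flags would still need to be matched against what each week currently expects from its `atari_util.py`:

```
import numpy as np
from stable_baselines.common.atari_wrappers import make_atari, wrap_deepmind

# One preprocessing stack shared by every week instead of per-week copies.
env = make_atari("BreakoutNoFrameskip-v4")               # NoFrameskip env + max-and-skip
env = wrap_deepmind(env, frame_stack=True, scale=True)   # grayscale, resize, stack, scale

obs = env.reset()
print(np.array(obs).shape)   # preprocessed, stacked frames ready for the agent
```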

Currently, our directory structure is basically a bunch of `week*` folders at the top level. However, some code is consistently reused across weeks, the most prominent example being launching `xvfb`....

maintenance
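
As an example of the kind of helper that could live in one shared top-level package instead of being copy-pasted per week, here is a sketch for the `xvfb` case (module location and defaults are hypothetical):

```
import os
import subprocess


def ensure_virtual_display(display=":1", resolution="1024x768x24"):
    """Start Xvfb if no display is available and point DISPLAY at it."""
    if os.environ.get("DISPLAY"):
        return                                            # a display is already set up
    # equivalent to: Xvfb :1 -screen 0 1024x768x24, a virtual framebuffer for headless rendering
    subprocess.Popen(["Xvfb", display, "-screen", "0", resolution])
    os.environ["DISPLAY"] = display
```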