Practical_RL

A course in reinforcement learning in the wild

44 Practical_RL issues, sorted by recently updated

* https://www.coursera.org/learn/practical-rl/discussions/all/threads/c9NrSHqOEeiv7hJZeB6Vjg
* https://www.coursera.org/learn/practical-rl/discussions/all/threads/gYZfgB3kS-OGX4Ad5PvjfA

priority:high

We all know that while RL is great at video games ~~when you throw a lot of duct tape and GPUs at it~~, it isn't quite there for the real...

help wanted
new content

Currently, `week5_policy_based/practice_a3c.ipynb` has numerous problems:

* It does not implement A3C; it is a plain actor-critic.
* We only have it in Tensorflow, since it does not have a corresponding...

coursera
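
For the record, the piece that turns plain advantage actor-critic into A3C is the asynchronous part: several workers, each with its own environment and local network copy, pushing gradient updates into one shared model without locks. Below is a rough structural sketch of that, not the course solution: it uses PyTorch rather than Tensorflow, threads instead of processes for brevity, and assumes the classic gym `reset`/`step` API.

```
import threading

import gym
import torch
import torch.nn as nn
import torch.nn.functional as F


class ActorCritic(nn.Module):
    """Shared torso with separate policy and value heads."""
    def __init__(self, n_obs, n_actions):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(n_obs, 128), nn.ReLU())
        self.policy_head = nn.Linear(128, n_actions)
        self.value_head = nn.Linear(128, 1)

    def forward(self, obs):
        hidden = self.body(obs)
        return self.policy_head(hidden), self.value_head(hidden)


shared_model = ActorCritic(n_obs=4, n_actions=2)      # CartPole sizes, for illustration
optimizer = torch.optim.Adam(shared_model.parameters(), lr=1e-3)
gamma = 0.99


def worker(n_episodes=200):
    env = gym.make("CartPole-v1")                     # each worker owns an environment
    local_model = ActorCritic(n_obs=4, n_actions=2)   # worker-local copy, as in A3C
    for _ in range(n_episodes):
        local_model.load_state_dict(shared_model.state_dict())  # sync from shared weights
        obs, done = env.reset(), False
        log_probs, values, rewards = [], [], []
        while not done:
            logits, value = local_model(torch.as_tensor(obs, dtype=torch.float32))
            dist = torch.distributions.Categorical(logits=logits)
            action = dist.sample()
            obs, reward, done, _ = env.step(action.item())
            log_probs.append(dist.log_prob(action))
            values.append(value.squeeze())
            rewards.append(reward)

        # discounted returns for the finished rollout
        returns, running = [], 0.0
        for r in reversed(rewards):
            running = r + gamma * running
            returns.append(running)
        returns = torch.tensor(returns[::-1])
        values = torch.stack(values)
        advantages = returns - values.detach()

        # joint actor + critic loss, computed on the local copy
        loss = -(torch.stack(log_probs) * advantages).mean() + F.mse_loss(values, returns)
        local_model.zero_grad()
        loss.backward()

        # hand the local gradients to the shared model and update it without locks
        for shared_p, local_p in zip(shared_model.parameters(), local_model.parameters()):
            shared_p.grad = local_p.grad
        optimizer.step()


workers = [threading.Thread(target=worker) for _ in range(4)]
for t in workers:
    t.start()
for t in workers:
    t.join()
```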

Not really sure how to do that at our scale.

* There is https://jupyter-contrib-nbextensions.readthedocs.io/en/latest/nbextensions/spellchecker/README.html, but it only works when you edit the cell, so it's a lot of manual work....

help wanted
maintenance
text in notebooks
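
One way to avoid checking cells by hand could be to extract the markdown text from the notebooks in bulk and pipe it through a command-line checker. A rough sketch, where the glob pattern and the choice of `aspell` are assumptions rather than an agreed approach:

```
import glob
import json
import subprocess


def markdown_text(notebook_path):
    """Concatenate the source of all markdown cells in a notebook."""
    with open(notebook_path, encoding="utf-8") as f:
        nb = json.load(f)
    return "\n\n".join("".join(cell.get("source", []))
                       for cell in nb.get("cells", [])
                       if cell.get("cell_type") == "markdown")


for path in sorted(glob.glob("week*/**/*.ipynb", recursive=True)):
    # `aspell list` reads text on stdin and prints the words it considers misspelled
    result = subprocess.run(["aspell", "list"], input=markdown_text(path),
                            capture_output=True, text=True)
    misspelled = sorted(set(result.stdout.split()))
    if misspelled:
        print(path, "->", ", ".join(misspelled))
```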

Currently we evaluate submissions in the notebooks themselves, and only submit the score. The grader should do the evaluation; preferably, this should be outside of the user-modifiable code entirely.

coursera
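
For illustration, this direction would mean the notebook submits only an artifact (say, a pickled policy or action function) and the grader replays it itself, roughly along these lines. `grade_submission`, the CartPole environment and the episode count are hypothetical, not the actual Coursera grader interface, and the classic gym API is assumed:

```
import gym
import numpy as np


def grade_submission(policy, env_name="CartPole-v1", n_episodes=100, seed=0):
    """Replay the submitted policy in a grader-controlled environment."""
    env = gym.make(env_name)
    env.seed(seed)                      # the grader, not the student, fixes the seed
    episode_rewards = []
    for _ in range(n_episodes):
        obs, done, total = env.reset(), False, 0.0
        while not done:
            action = policy(obs)        # the only piece the student provides
            obs, reward, done, _ = env.step(action)
            total += reward
        episode_rewards.append(total)
    return float(np.mean(episode_rewards))
```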

We have a bunch of `atari_util.py` variations:

```
master:week04_approx_rl/atari_wrappers.py
master:week06_policy_based/atari_wrappers.py
master:week08_pomdp/atari_util.py
coursera:week4_approx/atari_util.py
coursera:week5_policy_based/atari_util.py
```

Things can be simplified if we instead reuse https://github.com/hill-a/stable-baselines/blob/master/stable_baselines/common/atari_wrappers.py.

maintenance
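
For comparison, reusing the stable-baselines stack would look roughly like this in the notebooks; the exact wrapper flags would still need to be matched against what each week currently expects from its `atari_util.py`:

```
import numpy as np
from stable_baselines.common.atari_wrappers import make_atari, wrap_deepmind

# One preprocessing stack shared by every week instead of per-week copies.
env = make_atari("BreakoutNoFrameskip-v4")               # NoFrameskip env + max-and-skip
env = wrap_deepmind(env, frame_stack=True, scale=True)   # grayscale, resize, stack, scale

obs = env.reset()
print(np.array(obs).shape)   # preprocessed, stacked frames ready for the agent
```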

Currently, our directory structure is basically a bunch of `week*` folders at the top level. However, some code is consistently reused across weeks, the most prominent example being launching `xvfb`....

maintenance
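
As an example of the kind of helper that could live in one shared top-level package instead of being copy-pasted per week, here is a sketch for the `xvfb` case (module location and defaults are hypothetical):

```
import os
import subprocess


def ensure_virtual_display(display=":1", resolution="1024x768x24"):
    """Start Xvfb if no display is available and point DISPLAY at it."""
    if os.environ.get("DISPLAY"):
        return                                            # a display is already set up
    # equivalent to: Xvfb :1 -screen 0 1024x768x24, a virtual framebuffer for headless rendering
    subprocess.Popen(["Xvfb", display, "-screen", "0", resolution])
    os.environ["DISPLAY"] = display
```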