Practical_RL
A course in reinforcement learning in the wild
I used the following two commands to identify broken links (`markdown-link-check` is https://github.com/tcort/markdown-link-check):

```bash
find ./Practical_RL/ -type f -name '*.ipynb' -exec jupyter nbconvert --to markdown {} \;
find ./Practical_RL/...
```
The end of the notebook suggests evaluating the policy in a "theoretically better" way by sampling an initial action for each initial state uniformly and then playing with the current...
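A minimal sketch of what that evaluation could look like, assuming a tabular setting with a greedy policy. The names `env`, `get_action`, and `n_actions` are illustrative assumptions, not identifiers from the course notebooks:

```python
import numpy as np

def evaluate_uniform_first_action(env, get_action, n_actions,
                                  n_episodes=100, t_max=1000):
    """Sample the first action uniformly, then follow the current policy."""
    rewards = []
    for _ in range(n_episodes):
        s = env.reset()
        total = 0.0
        for t in range(t_max):
            if t == 0:
                a = np.random.randint(n_actions)  # uniform initial action
            else:
                a = get_action(s)                 # current (greedy) policy
            s, r, done, _ = env.step(a)
            total += r
            if done:
                break
        rewards.append(total)
    return np.mean(rewards)
```

Averaging over a uniformly sampled first action gives an estimate that does not depend on which action the greedy policy happens to prefer in the initial state.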
Example for `week6/a2c-optional`:

```python
actions = torch.tensor(trajectory['actions'], device=device)
action_log_probs = torch.gather(
    trajectory['log_probs'], dim=-1,
    index=actions.unsqueeze(-1)).squeeze(-1)
```

This _should_ work more efficiently than `to_one_hot`, although to be certain we could also benchmark it.
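A hedged micro-benchmark sketch for that comparison. The shapes and the one-hot variant are made up for illustration; they are not the notebook's actual `to_one_hot` code:

```python
import time
import torch

n, k = 100_000, 6  # assumed batch size and action count
log_probs = torch.randn(n, k)
actions = torch.randint(k, (n,))

def via_gather():
    # Select the log-prob of each taken action by index.
    return torch.gather(log_probs, dim=-1,
                        index=actions.unsqueeze(-1)).squeeze(-1)

def via_one_hot():
    # Same selection via an explicit one-hot mask and a sum.
    one_hot = torch.nn.functional.one_hot(actions, k).to(log_probs.dtype)
    return (log_probs * one_hot).sum(-1)

# Both variants must agree before timing them.
assert torch.allclose(via_gather(), via_one_hot())

for fn in (via_gather, via_one_hot):
    start = time.perf_counter()
    for _ in range(10):
        fn()
    print(fn.__name__, time.perf_counter() - start)
```

The gather variant avoids materializing an `n × k` mask, which is where the expected saving comes from.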
via @yhn112: Let's choose one of: 1) embed images to notebooks 2) host images in our repo 3) host images on some image hosting 4) find all the images on...
In [this file](https://github.com/yandexdataschool/Practical_RL/blob/master/week3_model_free/seminar_qlearning.ipynb) we have the `get_qvalue()` and `set_qvalue()` functions in the 2nd code block. This is not really a recommended way to handle getters and setters in Python...
There are two options for how this could be done:

* Clean up `spring19-pacman-visualization` and merge it into `spring19`
* Use one of the existing Gym implementations of Pacman:
  * https://github.com/sohamghosh121/PacmanGym
  * https://github.com/andreykurenkov/pacman-env
Currently there is a lot of garbage in metadata. Some notebooks refer to nonexistent kernels (like `rl`), others were created with Python 2 and throw an error message if you...
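A hedged sketch of how the offending metadata could be stripped with `nbformat`, so a notebook no longer points at a nonexistent kernel such as `rl`. The function name and usage are assumptions for illustration:

```python
import nbformat

def clean_kernel_metadata(path):
    """Remove the kernelspec entry from a notebook's metadata in place."""
    nb = nbformat.read(path, as_version=4)
    nb.metadata.pop('kernelspec', None)  # drop references to stale kernels
    nbformat.write(nb, path)
```

Run over the repo with something like `find . -name '*.ipynb'` piped into this function; Jupyter then falls back to the default kernel instead of erroring out.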
https://github.com/kefranabg/readme-md-generator produces fancy READMEs; maybe we could borrow some ideas from it.
It mentions Theano, which is probably not what we care about most in 2019.