
Minimal and Clean Reinforcement Learning Examples

39 reinforcement-learning issues, sorted by recently updated

Bumps [numpy](https://github.com/numpy/numpy) from 1.12.1 to 1.22.0. Release notes (sourced from numpy's releases, v1.22.0): NumPy 1.22.0 is a big release featuring the work of 153 contributors spread...

dependencies

Bumps [tensorflow](https://github.com/tensorflow/tensorflow) from 1.0.0 to 2.7.2. Release notes (sourced from tensorflow's releases, TensorFlow 2.7.2): this release introduces several vulnerability fixes, including a fix for a code injection in saved_model_cli (CVE-2022-29216). Fixes...

dependencies

Bumps [pillow](https://github.com/python-pillow/Pillow) from 4.1.0 to 9.0.1. Release notes (sourced from pillow's releases, 9.0.1): https://pillow.readthedocs.io/en/stable/releasenotes/9.0.1.html. Changes: in show_file, use os.remove to remove temporary images (CVE-2022-24303, #6010) [@radarhere, @hugovk]. Restrict builtins within...

dependencies

I am running the script [here](https://github.com/rlcode/reinforcement-learning/blob/master/2-cartpole/3-reinforce/cartpole_reinforce.py) but even after 500 episodes it does not converge. You can see the graph I get below: ![score](https://user-images.githubusercontent.com/14084682/129976668-ec03b85d-e44f-429c-a59a-7a7354be4b72.png) In contrast this is the supposedly...
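
A common first thing to check when this REINFORCE example fails to converge is how the discounted returns are computed and normalized; the sketch below shows that step in isolation (the function and variable names are illustrative and not taken from cartpole_reinforce.py):

```python
import numpy as np

def discounted_returns(rewards, gamma=0.99):
    """Compute normalized discounted returns G_t for one episode.

    Normalizing the returns (zero mean, unit variance) is a common
    stabilizer for REINFORCE; without it the policy gradient can be
    noisy enough that CartPole never converges.
    """
    returns = np.zeros_like(rewards, dtype=np.float64)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    returns -= returns.mean()
    returns /= (returns.std() + 1e-8)
    return returns

# Example: rewards from a short episode
print(discounted_returns([1.0, 1.0, 1.0, 1.0]))
```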

Hello, I saw there was a run.py script for running the example in the README, but I can't find that script. How can I get the full code example? Thanks, regards.

Any idea how to go about implementing diagonal movement in the grid world example?
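
One way to support diagonal movement is to extend the action set from four (dx, dy) offsets to eight; a minimal sketch of that idea, assuming a simple coordinate-based environment rather than the repo's actual grid-world code:

```python
# Eight-directional action set: 4 cardinal + 4 diagonal moves.
# Indices 0-3 follow a typical up/down/left/right layout; 4-7 are diagonals.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1),
           (-1, -1), (-1, 1), (1, -1), (1, 1)]

GRID_SIZE = 5

def step(state, action_idx):
    """Apply an action to a (row, col) state and clip to the grid boundaries."""
    dr, dc = ACTIONS[action_idx]
    row = min(max(state[0] + dr, 0), GRID_SIZE - 1)
    col = min(max(state[1] + dc, 0), GRID_SIZE - 1)
    return (row, col)

# Example: move diagonally down-right from the corner
print(step((0, 0), 7))  # -> (1, 1)
```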

I want to run another Atari game, but its performance doesn't look good. Could anyone help me? Can I achieve this by changing "gym.make('ENV_NAME')" and its real_action? Help me...
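
Switching to another Atari game usually does start with changing the id passed to gym.make, but the real_action mapping is game-specific and has to be rebuilt from the new game's action meanings. A hedged sketch (the environment id and mapping below are assumptions for illustration, not the repo's actual values):

```python
import gym

# Swap in another Atari id here (assumption: 'PongDeterministic-v4';
# the Breakout example in this repo uses a different id).
env = gym.make('PongDeterministic-v4')

# Each game assigns different meanings to its discrete actions, so a
# hard-coded real_action mapping from Breakout will not carry over.
print(env.unwrapped.get_action_meanings())
# e.g. ['NOOP', 'FIRE', 'RIGHT', 'LEFT', 'RIGHTFIRE', 'LEFTFIRE']

# Rebuild the agent-action -> env-action mapping for the new game
# (illustrative choice: keep NOOP plus the two movement actions).
real_action = [0, 2, 3]
```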

Create plots and further extensions

Hi, I would like to test some hyperparameters using threading, which would be much faster. But when I run the DQN and DDQN algorithms with threading, the error says: Seems...
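
One likely cause is that a Keras/TensorFlow 1.x model is not safe to share across Python threads (each thread needs its own graph and session), so for a hyperparameter sweep it is often simpler to run each configuration in its own process. A minimal sketch, assuming a hypothetical train_dqn entry point rather than the repo's actual training code:

```python
from multiprocessing import Pool

def train_dqn(learning_rate):
    """Placeholder for one training run; in practice this would build the
    DQN/DDQN agent and environment inside the process, so that each run
    gets its own TensorFlow graph and session."""
    # ... build agent, train, and return the final score ...
    return {'lr': learning_rate, 'score': 0.0}

if __name__ == '__main__':
    learning_rates = [1e-2, 1e-3, 1e-4]
    # One process per configuration avoids sharing a Keras model
    # (and its TF graph) across threads.
    with Pool(processes=len(learning_rates)) as pool:
        results = pool.map(train_dqn, learning_rates)
    print(results)
```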