minimalRL
Implementations of basic RL algorithms with minimal lines of code! (PyTorch based)
The training loop in dqn.py has both `while not done` and `if done: break`. This is harmless but redundant. Given this repo's focus on minimalism, though, I thought the break...
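To illustrate the point, here is a hedged sketch (a generic episode loop with a stub environment, not the repo's actual code) showing why an `if done: break` inside `while not done` can be dropped: the `while` test at the top of the next iteration already stops the loop.

```python
# Illustrative sketch only: a generic episode loop demonstrating the redundancy.

def run_episode(env):
    s = env.reset()
    done = False
    score = 0.0
    while not done:                  # loop condition already checks `done`
        s, r, done = env.step(0)     # env sets `done` when the episode ends
        score += r
        # an `if done: break` here would be redundant: the `while` test
        # at the top of the next iteration stops the loop anyway
    return score

# Minimal stub environment, for demonstration purposes only.
class StubEnv:
    def __init__(self, length=3):
        self.length = length
        self.t = 0
    def reset(self):
        self.t = 0
        return 0
    def step(self, a):
        self.t += 1
        return self.t, 1.0, self.t >= self.length

print(run_episode(StubEnv()))  # -> 3.0
```

The only time the explicit `break` matters is when more statements follow it inside the loop body that must be skipped on the terminal step; otherwise the two exits are equivalent.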
https://github.com/seungeunrho/minimalRL/blob/master/dqn.py
https://github.com/seungeunrho/minimalRL/blob/7597b9af94ee64536dfd261446d795854f34171b/dqn.py#L63
I am wondering why the `train` method loops internally 10 times. Shouldn't the policy network train once per action?
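For context, a hedged pure-Python sketch of the pattern being asked about (the buffer, the batch size, and the toy "update" below are all illustrative stand-ins, not the repo's PyTorch code): with experience replay, each update samples random past transitions, so the number of gradient steps is decoupled from the number of actions taken, and `train` is free to run several mini-batch updates, here 10, after each episode.

```python
import random

# Illustrative sketch: why a DQN-style train() can loop several times per call.
# Updates draw random transitions from a replay buffer, so gradient steps are
# independent of individual environment actions.

class ReplayBuffer:
    def __init__(self):
        self.storage = []
    def put(self, transition):
        self.storage.append(transition)
    def sample(self, n):
        return random.sample(self.storage, n)

def train(buffer, n_updates=10, batch_size=4):
    """Run several mini-batch updates from replayed experience."""
    losses = []
    for _ in range(n_updates):           # the "looping 10 times" in question
        batch = buffer.sample(batch_size)
        # stand-in for one gradient step: here we just average the rewards
        losses.append(sum(r for (_, _, r) in batch) / batch_size)
    return losses

buffer = ReplayBuffer()
for t in range(50):                      # fill the buffer with toy transitions
    buffer.put((t, 0, 1.0))
print(len(train(buffer)))                # -> 10
```

Updating once per action would also be valid DQN; doing a burst of replayed updates per episode is simply a cheaper schedule that reuses stored experience.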
Hi, first of all, congratulations on this project. A minimal implementation of the MuZero algorithm would be great. The paper is here: https://arxiv.org/pdf/1911.08265 The pseudocode is here: https://arxiv.org/src/1911.08265v2/anc/pseudocode.py Thanks.
A bit long... 298 lines
I'm somewhat new to the field of reinforcement learning, and I find these simple examples extremely helpful -- thank you! Would you be able to help me with...
It would be nice to add the following algorithms: - [ ] RAINBOW - [x] A2C (multiprocessing) I will submit a PR if I finish any of them.
Hello, nice and clear implementation! I want to ask something about the LSTM usage. While gathering experience, the input to the LSTM is of dimension [1, 1, 64], which represents...
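For anyone else puzzling over those dimensions, a hedged shape-bookkeeping sketch (pure Python, no torch needed; the hidden size 32 and sequence length 20 below are made-up examples): under PyTorch's default `batch_first=False` convention, `nn.LSTM` takes input of shape `(seq_len, batch, input_size)`, so `[1, 1, 64]` is one time step, batch of one, 64 features.

```python
# Illustrative shape bookkeeping for an nn.LSTM-style layer with
# batch_first=False: input (seq_len, batch, input_size), output
# (seq_len, batch, hidden_size), hidden/cell states (num_layers, batch, hidden_size).

def lstm_shapes(seq_len, batch, input_size, hidden_size, num_layers=1):
    inp = (seq_len, batch, input_size)
    out = (seq_len, batch, hidden_size)
    h_c = (num_layers, batch, hidden_size)
    return inp, out, h_c

# While gathering experience, one step at a time: seq_len=1, batch=1.
print(lstm_shapes(1, 1, 64, 32))   # -> ((1, 1, 64), (1, 1, 32), (1, 1, 32))
# At training time, a whole trajectory can be fed as one longer sequence.
print(lstm_shapes(20, 1, 64, 32))  # -> ((20, 1, 64), (20, 1, 32), (20, 1, 32))
```

The hidden and cell states carried between rollout steps are what let the single-step `[1, 1, 64]` inputs behave like one continuous sequence.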
Hello, I have enjoyed reading your good examples! Is it possible for you to add a few meta RL algorithms? Thanks!
Hi, I am trying to create an environment that is a variation of Cartpole. From the Cartpole definition: > The studied system is a cart of which a rigid pole...
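A hedged sketch of the usual starting point for such a variant (illustrative only: real code would subclass `gym.Env` and reuse CartPole's actual dynamics, and every constant below is a placeholder): any object exposing `reset()` and `step(action)` with the `(observation, reward, done, info)` contract can be dropped into training loops like the ones in this repo.

```python
import random

# Illustrative gym-style environment skeleton (NOT a faithful Cartpole variant;
# the dynamics below are placeholders). Anything with this reset()/step()
# interface plugs into a standard training loop.

class MyCartpoleVariant:
    def __init__(self, max_steps=200):
        self.max_steps = max_steps
        self.t = 0
        self.state = [0.0, 0.0, 0.0, 0.0]   # x, x_dot, theta, theta_dot

    def reset(self):
        self.t = 0
        self.state = [random.uniform(-0.05, 0.05) for _ in range(4)]
        return list(self.state)

    def step(self, action):
        assert action in (0, 1), "discrete push-left / push-right"
        self.t += 1
        # placeholder dynamics: a real variant would integrate the
        # cart-pole equations of motion here
        self.state[0] += 0.01 if action == 1 else -0.01
        done = self.t >= self.max_steps or abs(self.state[0]) > 2.4
        return list(self.state), 1.0, done, {}

env = MyCartpoleVariant(max_steps=5)
s = env.reset()
total, done = 0.0, False
while not done:
    s, r, done, _ = env.step(1)
    total += r
print(total)   # -> 5.0
```

Keeping the 4-tuple return shape identical to CartPole's means the dqn.py loop needs no changes beyond the environment constructor.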