minimalRL
Implementations of basic RL algorithms with minimal lines of code! (PyTorch based)
The training loop in dqn.py has both `while not done` and `if done: break`. This is harmless but redundant. Given this repo's focus on minimalism, though, I thought the break...
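To illustrate the point, here is a hedged sketch (a generic episode loop with a stub environment, not the repo's actual code) showing why an `if done: break` inside `while not done` can be dropped: the `while` test at the top of the next iteration already stops the loop.

```python
# Illustrative sketch only: a generic episode loop demonstrating the redundancy.

def run_episode(env):
    s = env.reset()
    done = False
    score = 0.0
    while not done:                  # loop condition already checks `done`
        s, r, done = env.step(0)     # env sets `done` when the episode ends
        score += r
        # an `if done: break` here would be redundant: the `while` test
        # at the top of the next iteration stops the loop anyway
    return score

# Minimal stub environment, for demonstration purposes only.
class StubEnv:
    def __init__(self, length=3):
        self.length = length
        self.t = 0
    def reset(self):
        self.t = 0
        return 0
    def step(self, a):
        self.t += 1
        return self.t, 1.0, self.t >= self.length

print(run_episode(StubEnv()))  # -> 3.0
```

The only time the explicit `break` matters is when more statements follow it inside the loop body that must be skipped on the terminal step; otherwise the two exits are equivalent.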
https://github.com/seungeunrho/minimalRL/blob/master/dqn.py
https://github.com/seungeunrho/minimalRL/blob/7597b9af94ee64536dfd261446d795854f34171b/dqn.py#L63
I am wondering why the `train` method loops internally 10 times. Shouldn't the policy network train once per action?
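For context, a hedged pure-Python sketch of the pattern being asked about (the buffer, the batch size, and the toy "update" below are all illustrative stand-ins, not the repo's PyTorch code): with experience replay, each update samples random past transitions, so the number of gradient steps is decoupled from the number of actions taken, and `train` is free to run several mini-batch updates, here 10, after each episode.

```python
import random

# Illustrative sketch: why a DQN-style train() can loop several times per call.
# Updates draw random transitions from a replay buffer, so gradient steps are
# independent of individual environment actions.

class ReplayBuffer:
    def __init__(self):
        self.storage = []
    def put(self, transition):
        self.storage.append(transition)
    def sample(self, n):
        return random.sample(self.storage, n)

def train(buffer, n_updates=10, batch_size=4):
    """Run several mini-batch updates from replayed experience."""
    losses = []
    for _ in range(n_updates):           # the "looping 10 times" in question
        batch = buffer.sample(batch_size)
        # stand-in for one gradient step: here we just average the rewards
        losses.append(sum(r for (_, _, r) in batch) / batch_size)
    return losses

buffer = ReplayBuffer()
for t in range(50):                      # fill the buffer with toy transitions
    buffer.put((t, 0, 1.0))
print(len(train(buffer)))                # -> 10
```

Updating once per action would also be valid DQN; doing a burst of replayed updates per episode is simply a cheaper schedule that reuses stored experience.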
Hi, first of all, congratulations on this project. A minimal implementation of the MuZero algorithm would be great. The paper is here: https://arxiv.org/pdf/1911.08265 The pseudocode is here: https://arxiv.org/src/1911.08265v2/anc/pseudocode.py Thanks.
A bit long... 298 lines
I'm somewhat new to the field of reinforcement learning, and I find these simple examples extremely helpful -- thank you! Would you be able to help me with...
It would be nice to add the following algorithms: - [ ] RAINBOW - [x] A2C (multiprocessing) I will submit a PR if I finish any of them.
Hello, nice and clear implementation! I want to ask something about the LSTM usage. While gathering experience, the input to the LSTM is of dimension [1, 1, 64], which represents...
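For anyone else puzzling over those dimensions, a hedged shape-bookkeeping sketch (pure Python, no torch needed; the hidden size 32 and sequence length 20 below are made-up examples): under PyTorch's default `batch_first=False` convention, `nn.LSTM` takes input of shape `(seq_len, batch, input_size)`, so `[1, 1, 64]` is one time step, batch of one, 64 features.

```python
# Illustrative shape bookkeeping for an nn.LSTM-style layer with
# batch_first=False: input (seq_len, batch, input_size), output
# (seq_len, batch, hidden_size), hidden/cell states (num_layers, batch, hidden_size).

def lstm_shapes(seq_len, batch, input_size, hidden_size, num_layers=1):
    inp = (seq_len, batch, input_size)
    out = (seq_len, batch, hidden_size)
    h_c = (num_layers, batch, hidden_size)
    return inp, out, h_c

# While gathering experience, one step at a time: seq_len=1, batch=1.
print(lstm_shapes(1, 1, 64, 32))   # -> ((1, 1, 64), (1, 1, 32), (1, 1, 32))
# At training time, a whole trajectory can be fed as one longer sequence.
print(lstm_shapes(20, 1, 64, 32))  # -> ((20, 1, 64), (20, 1, 32), (20, 1, 32))
```

The hidden and cell states carried between rollout steps are what let the single-step `[1, 1, 64]` inputs behave like one continuous sequence.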
Hello, I have enjoyed reading your good examples! Is it possible for you to add a few meta RL algorithms? Thanks!
Hi, I am trying to create an environment that is a variation of Cartpole. From the Cartpole definition: > The studied system is a cart of which a rigid pole...
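A hedged sketch of the usual starting point for such a variant (illustrative only: real code would subclass `gym.Env` and reuse CartPole's actual dynamics, and every constant below is a placeholder): any object exposing `reset()` and `step(action)` with the `(observation, reward, done, info)` contract can be dropped into training loops like the ones in this repo.

```python
import random

# Illustrative gym-style environment skeleton (NOT a faithful Cartpole variant;
# the dynamics below are placeholders). Anything with this reset()/step()
# interface plugs into a standard training loop.

class MyCartpoleVariant:
    def __init__(self, max_steps=200):
        self.max_steps = max_steps
        self.t = 0
        self.state = [0.0, 0.0, 0.0, 0.0]   # x, x_dot, theta, theta_dot

    def reset(self):
        self.t = 0
        self.state = [random.uniform(-0.05, 0.05) for _ in range(4)]
        return list(self.state)

    def step(self, action):
        assert action in (0, 1), "discrete push-left / push-right"
        self.t += 1
        # placeholder dynamics: a real variant would integrate the
        # cart-pole equations of motion here
        self.state[0] += 0.01 if action == 1 else -0.01
        done = self.t >= self.max_steps or abs(self.state[0]) > 2.4
        return list(self.state), 1.0, done, {}

env = MyCartpoleVariant(max_steps=5)
s = env.reset()
total, done = 0.0, False
while not done:
    s, r, done, _ = env.step(1)
    total += r
print(total)   # -> 5.0
```

Keeping the 4-tuple return shape identical to CartPole's means the dqn.py loop needs no changes beyond the environment constructor.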