Antonin RAFFIN comments

Results: 880 comments of Antonin RAFFIN

Implementation design issues in SubprocVectorEnv

> It should be possible to be generic enough to get rid of the "reset", "render", "seed", and "getattr" elif clauses. You are probably looking for `env_method` as done in...
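As an illustration of the generic dispatch suggested above, here is a minimal sketch of a worker-side command handler. `ToyEnv` and `handle_command` are hypothetical names, and the `(cmd, data)` message format is an assumption made for this example, not the actual code of any library.

```python
from typing import Any


class ToyEnv:
    """Hypothetical stand-in for a Gym-style environment."""

    def __init__(self) -> None:
        self.state = 0
        self.rng_seed = None

    def reset(self) -> int:
        self.state = 0
        return self.state

    def seed(self, seed: int) -> int:
        self.rng_seed = seed
        return seed

    def render(self, mode: str = "human") -> str:
        return f"state={self.state} mode={mode}"


def handle_command(env: Any, cmd: str, data: Any) -> Any:
    """Handle one (cmd, data) message inside a worker process.

    Three generic commands replace one `elif` clause per environment method.
    """
    if cmd == "env_method":
        # data is assumed to be (method_name, args, kwargs)
        method_name, args, kwargs = data
        return getattr(env, method_name)(*args, **kwargs)
    if cmd == "get_attr":
        return getattr(env, data)
    if cmd == "set_attr":
        name, value = data
        setattr(env, name, value)
        return None
    raise NotImplementedError(f"Unknown command: {cmd}")
```

With this shape, `reset`, `render` and `seed` all go through the single `"env_method"` branch, for example `handle_command(env, "env_method", ("seed", (42,), {}))`.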

Episode start signal not used in RNN for on-policy algorithms

> I think the current implementation has already done this: this is for data collection only, the reset should be done when updating the networks too.
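To make the point about resetting during updates concrete, here is a toy sketch in which the same episode-start flags reset the recurrent state whenever a sequence is unrolled, whether for data collection or for a network update. The scalar "RNN" and the names `rnn_step` and `unroll` are illustrative assumptions, not code from any library.

```python
from typing import List, Sequence


def rnn_step(hidden: float, obs: float, episode_start: bool, weight: float = 0.5) -> float:
    """Toy recurrent update: the hidden state is zeroed when an episode starts."""
    if episode_start:
        hidden = 0.0
    return weight * hidden + obs


def unroll(observations: Sequence[float], episode_starts: Sequence[bool]) -> List[float]:
    """Unroll the toy RNN over a sequence, honoring episode boundaries.

    Using this one function for both rollout and update keeps the hidden
    states seen during training consistent with those seen during collection.
    """
    hidden = 0.0
    outputs = []
    for obs, start in zip(observations, episode_starts):
        hidden = rnn_step(hidden, obs, start)
        outputs.append(hidden)
    return outputs
```

If the flags are ignored during the update (all `False` after the first step), the hidden state leaks across the episode boundary and the outputs differ from what was computed during collection.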

Allow different data types in her replay buffer

Hello, I agree regarding observation and goal, and we can probably address it in https://github.com/DLR-RM/stable-baselines3/pull/704

N-step updates for off-policy methods

Funny, I also recently gave it a try here: https://github.com/DLR-RM/stable-baselines3/tree/feat/n-steps

N-step updates for off-policy methods

>The bug is basically the same as with memory optimization. :see_no_evil: Yep, I only did a quick test with it and could not see any improvement yet. >Completed it. Removed loops,...

N-step updates for off-policy methods

>Ps. sorry for autoformatting >.< ... I will do a PR soon for that ;) Apparently it will be with black.

N-step updates for off-policy methods

I'm currently giving that one a quick try (`feat/n-steps` in the zoo), and it already yields some interesting results with DQN on CartPole ;) And I couldn't notice any...
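For readers unfamiliar with n-step updates, here is a minimal sketch of how an n-step return could be accumulated from stored rewards, stopping early at episode ends. The function name `n_step_return` and the flat-list buffer layout are assumptions for illustration and are not taken from the `feat/n-steps` branch.

```python
from typing import Sequence, Tuple


def n_step_return(
    rewards: Sequence[float],
    dones: Sequence[bool],
    start: int,
    n_steps: int,
    gamma: float,
) -> Tuple[float, float, int]:
    """Accumulate up to `n_steps` discounted rewards starting at `start`.

    Returns (discounted_return, bootstrap_discount, last_index).
    `bootstrap_discount` is 0.0 if the episode ended within the window,
    so the caller can compute: target = return + bootstrap_discount * next_value.
    """
    ret = 0.0
    discount = 1.0
    last_index = start
    end = min(start + n_steps, len(rewards))
    for idx in range(start, end):
        ret += discount * rewards[idx]
        discount *= gamma
        last_index = idx
        if dones[idx]:
            # Terminal transition: no bootstrapping beyond this point
            return ret, 0.0, last_index
    return ret, discount, last_index
```

For DQN, `next_value` would be the maximum target Q-value at the observation following `last_index`; with `n_steps=1` this reduces to the usual one-step target.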

N-step updates for off-policy methods

>Looks good! What about the FPS? It should have a very small impact, but there are still some optimizations that can be made. On DQN with CartPole, as mentioned, I couldn't...

N-step updates for off-policy methods

>For SAC, I am not sure if n-steps can be applied directly as I am under the impression that the backup requires the entropy for the intermediate states as well,...

N-step updates for off-policy methods

I added a sketch of what it would look like for SAC; it fits in ~10 lines. We would need to allocate one more array `log_prob` of size `buffer_size` and...
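The following is one possible reading of that idea, not the sketch referred to in the comment: a toy buffer that stores a `log_prob` array of size `buffer_size` next to the rewards, and an entropy-augmented n-step return that subtracts `ent_coef * log_prob` for the intermediate steps. The class `ToyBuffer`, the function `soft_n_step_return`, and the exact form of the return are assumptions; off-policy corrections and buffer wrap-around are deliberately ignored.

```python
from typing import Tuple


class ToyBuffer:
    """Hypothetical flat buffer with one extra array for action log-probabilities."""

    def __init__(self, buffer_size: int) -> None:
        self.rewards = [0.0] * buffer_size
        self.dones = [False] * buffer_size
        self.log_prob = [0.0] * buffer_size  # the additional array
        self.pos = 0

    def add(self, reward: float, done: bool, log_prob: float) -> None:
        self.rewards[self.pos] = reward
        self.dones[self.pos] = done
        self.log_prob[self.pos] = log_prob
        self.pos += 1  # no wrap-around in this sketch


def soft_n_step_return(
    buffer: ToyBuffer, start: int, n_steps: int, gamma: float, ent_coef: float
) -> Tuple[float, float]:
    """Return (soft_return, bootstrap_discount) for the transition at `start`.

    Intermediate steps (k > 0) receive an entropy bonus -ent_coef * log_prob.
    The entropy term of the final bootstrap state is left to the caller,
    as in the usual one-step SAC target.
    """
    ret = 0.0
    discount = 1.0
    for idx in range(start, min(start + n_steps, buffer.pos)):
        if idx > start:
            ret -= discount * ent_coef * buffer.log_prob[idx]
        ret += discount * buffer.rewards[idx]
        discount *= gamma
        if buffer.dones[idx]:
            return ret, 0.0
    return ret, discount
```

With `n_steps=1` the intermediate entropy terms vanish and the result matches the standard one-step reward plus discounted bootstrap.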