Antonin RAFFIN

Results 769 comments of Antonin RAFFIN

> How would you go about it in relation to the vectorised replay buffer that SB3 uses: have one segment tree hold priorities across all envs or have a segment...

Hello, > As the comments indicate, the continuous actions obtained in line 395 should have already been scaled by tanh, which puts them in the range (-1, 1). I think...

> I wonder why we need to store the unscaled action in the replay buffer instead of the final action actually taken in the environment. we need to store the...

> In particular, why multiply by 2 and subtract 1? how would you do it otherwise?
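For readers wondering about the "multiply by 2 and subtract 1" step, here is a minimal sketch of one common way to rescale a bounded action. It assumes an action space with bounds `low` and `high`: dividing by `(high - low)` maps the action to [0, 1], and multiplying by 2 then subtracting 1 maps that to [-1, 1]. The function names below are illustrative and are not quoted from the thread.

```python
import numpy as np


def scale_action(action, low, high):
    """Rescale an action from [low, high] to [-1, 1] (illustrative helper)."""
    # (action - low) / (high - low) lies in [0, 1];
    # multiplying by 2 and subtracting 1 stretches it to [-1, 1].
    return 2.0 * (action - low) / (high - low) - 1.0


def unscale_action(scaled_action, low, high):
    """Inverse mapping: from [-1, 1] back to [low, high] (illustrative helper)."""
    return low + 0.5 * (scaled_action + 1.0) * (high - low)


low = np.array([0.0, -2.0])
high = np.array([10.0, 2.0])
action = np.array([2.5, 1.0])

scaled = scale_action(action, low, high)
restored = unscale_action(scaled, low, high)
```

The two helpers are inverses of each other, so an action can be round-tripped between the environment's bounds and the normalized range.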

Hello, is the simulation asynchronous? Please fill the custom gym env template completely. If you want a full video series of SB3 and car racing (with open source code), you...

@timothe-chaumont could you review/test this one?

@timothe-chaumont thanks for reviewing. > Modify the HParam class to ask for metrics names only (without values): yes, looks like a better fix, but `hparams` is asking for a non-empty...

> [ ] I have raised an issue to propose this change ([required](https://github.com/DLR-RM/stable-baselines3/blob/master/CONTRIBUTING.md) for new features and bug fixes)

Hello, > but I think it would be a good idea to allow to check the correctness of vectorized envs too. yes, would be a good idea =) > but...

Thanks for opening the issue =) After thinking more about it, I think the current implementation is correct: we reset the episodic return when a done signal is received. An...
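As a rough illustration of "reset the episodic return when a done signal is received", here is a minimal sketch over a flat sequence of rewards and done flags. The function name and the list-based inputs are assumptions made for the example; it is not the implementation discussed in the issue.

```python
def episodic_returns(rewards, dones):
    """Accumulate rewards and reset the running return at each done signal."""
    returns = []
    running_return = 0.0
    for reward, done in zip(rewards, dones):
        running_return += reward
        if done:
            # Episode finished: record its return and start from zero again.
            returns.append(running_return)
            running_return = 0.0
    return returns


# Two episodes: three steps, then two steps.
completed = episodic_returns(
    rewards=[1.0, 1.0, 1.0, 2.0, 2.0],
    dones=[False, False, True, False, True],
)
```

Rewards collected after the last done signal stay in the running total and are not reported until the next episode ends.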