Anssi

Results 443 comments of Anssi

Ideally yes, SB3 should support that device too (not a big change), but it seems that, at the moment, it would require some operation-call changes to fully support. Those need to...

I concur with @araffin. I'd rather keep the current wrappers as is for consistency (despite the unclear naming "ClipReward"), but add support for `noop_max=0`.

Definitely sounds like a useful feature. The largest part here would be to figure out where these hyperparameters should be logged and what should be fed in. This sounds like...

> It is also in the documentation: https://stable-baselines3.readthedocs.io/en/master/guide/tensorboard.html#directly-accessing-the-summary-writer.

Ah this sounds like a good approach to this, so users can do their own logging.

> Well, this is the part...

Hmm you raise a good point! I have not used the newest version of zoo, but at least in the past it did not log all the parameters anywhere (you...

Thanks for reporting this! As per Discord chats, we should indeed update the documentation to reflect this behaviour (potentially the same for other off-policy algos as well). I can try to...

Sounds reasonable, but I think we should still discuss the registry stuff a bit more (it was not designed to be used outside the policies that come as-is). I see...

Oh sorry, my bad. I completely forgot `policy_kwargs` exists. Never mind that part of my comment ^^'

Ah alright, hmm... It sounds nifty, but at the same time it is starting to make things more complicated with these advanced and not-so-much-used features. One of the features of stable-baselines...

Hey. No, you cannot copy the SB2 code over, but yes, you can use the implementation in [stable-baselines3-contrib](https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/).