Antonin RAFFIN

880 comments by Antonin RAFFIN

related: https://github.com/DLR-RM/stable-baselines3/issues/786 and https://github.com/DLR-RM/stable-baselines3/issues/1384

Hello, thanks for the proposal. Achieving full feature parity is not my priority for now (as SBX is meant to be experimental), but I would welcome such a PR =)

https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/blob/25b43266e08ebe258061ac69688d94144799de75/sb3_contrib/trpo/trpo.py#L282. Not sure about the second question, but probably yes.

Hello, related to https://github.com/DLR-RM/stable-baselines3/issues/1059 and probably a duplicate of https://github.com/DLR-RM/stable-baselines3/issues/457. It is because of how the algorithms work. In short:
- PPO/A2C and derivatives collect `n_steps * n_envs` transitions of experience before performing...
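To make the update cadence concrete, here is a minimal sketch of the arithmetic behind that statement (the `n_steps`/`n_envs`/`total_timesteps` values below are illustrative, not taken from the issue):

```python
# PPO/A2C gather a full rollout of n_steps transitions *per environment*
# before each gradient update, so one rollout buffer holds
# n_steps * n_envs transitions.
n_steps = 2048   # SB3's default n_steps for PPO
n_envs = 4       # number of parallel environments (illustrative)

rollout_size = n_steps * n_envs
print(rollout_size)  # 8192 transitions collected before each update

# With a fixed training budget, the number of policy updates is therefore:
total_timesteps = 100_000
n_updates = total_timesteps // rollout_size
print(n_updates)  # 12 full rollouts fit in the budget
```

This is why changing `n_envs` without adjusting `n_steps` changes how often (and on how much data) the policy is updated.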

Hello, it is indeed missing the fix we did a while ago for all other algos: https://github.com/DLR-RM/stable-baselines3/blob/f0382a25bdbed62dcc31875d3e540ad95c1575a5/stable_baselines3/common/base_class.py#L404. I would welcome a PR that fixes this issue =)

Hello, what policy are you using? Please fill in the issue template completely.

I meant to say "policy architecture". It seems that you are not using a CNN if you are using the default hyperparameters, which explains your results.

I would recommend reading the Stable-Baselines documentation and looking at the [rl zoo](https://github.com/araffin/rl-baselines-zoo): there you will find plenty of examples of RL with images.

Hello,

> Did you run into any OOM errors from graph growth when you were running these scripts or do you have any insights?

I've never experienced any OOM (I was...

> The results are very unstable; sometimes it seems to work, but after a few more episodes it forgets the learning. Do you have any suggestions for this task?...