Antonin RAFFIN
Hello, what you are proposing only works for the tensorboard logger, right? What behavior do you propose for the other loggers? > Getting through the SummaryWriter to the writer Sounds...
Hello, there are two differences that I know of:
- PPO in SB3 handles timeouts properly: https://github.com/DLR-RM/stable-baselines3/issues/1355
- deprecated value clipping (not used by default, not recommended) in SB3 PPO is...
Hello, why would you do that instead of using a callback, for instance? I'm also wondering why you would recreate the environment every time instead of just calling `learn(..., reset_num_timesteps=False)` (see...
> depends on some initial seed/state, which I can use to simulate ~"unseen" data and test generalization `.reset(seed=...)` is made for that normally (`.seed()` for `VecEnv` and then do a...
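To illustrate the seeding idea, here is a minimal sketch using only the standard library. `ToyEnv` is a hypothetical stand-in that merely follows the `reset(seed=...)` convention; it is not a Gymnasium or SB3 class, and the seed ranges are arbitrary assumptions. It shows one way to keep a held-out set of seeds for evaluation:

```python
import random


class ToyEnv:
    """Hypothetical toy env mimicking the reset(seed=...) convention."""

    def __init__(self):
        self._rng = random.Random()
        self.state = 0.0

    def reset(self, *, seed=None):
        # Re-create the RNG only when a seed is given, so that later
        # unseeded resets keep drawing from the same random stream.
        if seed is not None:
            self._rng = random.Random(seed)
        # The initial state depends only on the RNG, hence on the seed.
        self.state = self._rng.uniform(-1.0, 1.0)
        return self.state, {}


# Arbitrary, non-overlapping seed ranges (an assumption for this sketch).
TRAIN_SEEDS = range(0, 100)
EVAL_SEEDS = range(1000, 1010)

env = ToyEnv()
# Resetting with the same seed reproduces the same initial state,
# so the held-out seeds define a fixed set of "unseen" start states.
eval_starts = [env.reset(seed=s)[0] for s in EVAL_SEEDS]
eval_starts_again = [env.reset(seed=s)[0] for s in EVAL_SEEDS]
```

With a real environment the same pattern would apply: train on one set of seeds and evaluate on a separate set that the agent never saw.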
> But maybe it's a good addition and sometimes replacement for LSTM/RNN ? That would be more for SB3 contrib, I guess. And without any benchmark, it's hard to say...
Closing as no benchmark was provided to support the feature request. Feel free to re-open if you manage to have some quantitative results.
Hello, thanks for the PR, but please only keep the typo fixes for now.
related: https://github.com/DLR-RM/stable-baselines3/issues/545
Hello, for integrating SRL with SB3, you can have a look at https://github.com/araffin/aae-train-donkeycar/blob/master/ae/wrapper.py (a VecEnv wrapper would be better when using multiple envs). For the noise, please have a look...
@vmoens is that a regression? (compared to #1438)