Antonin RAFFIN
Hello,

> I'll summarize it before I start working on the documentation itself.

Thanks =) Your description is right:
- the `result_plotter` is mostly interesting for its `ts2xy` and `window_func`, ...
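For reference, a minimal sketch of how `ts2xy` and `window_func` from `stable_baselines3.common.results_plotter` are typically used (the `./logs/` folder here is a hypothetical directory containing `monitor.csv` files written by the `Monitor` wrapper):

```python
import numpy as np

from stable_baselines3.common.results_plotter import load_results, ts2xy, window_func

# Load the monitor.csv files and convert them to (timesteps, episode reward) arrays
x, y = ts2xy(load_results("./logs/"), "timesteps")

# Smooth the reward curve with a moving average over the last 100 episodes
# (requires at least 100 recorded episodes)
x_smooth, y_smooth = window_func(x, y, 100, np.mean)
```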
> What about sticky actions?

Do you mean their influence on performance? I don't know, I think the main issue is that the changes were made without a benchmark (in the paper, ...
> TL;DR, sticky actions are the recommended way to prevent agents from abusing determinism, not a way to improve rewards.

Thanks for your answer =) My question was not about ...
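For context, a minimal sketch of enabling sticky actions explicitly on an ALE environment (`Breakout` and the 0.25 repeat probability are just the usual illustrative choices; requires `ale-py`):

```python
import ale_py
import gymnasium as gym

# Register the ALE environments (needed with recent ale-py/Gymnasium versions)
gym.register_envs(ale_py)

# v5 ALE environments already enable sticky actions by default:
# with probability 0.25 the previous action is repeated instead of the new one
env = gym.make("ALE/Breakout-v5", repeat_action_probability=0.25)
```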
> Is the issue that people are using existing SB3 results in their papers, and might mistakenly attribute the charts that you have now

Yes, that's the issue. And not ...
> Would it be possible to set up a system for people to contribute individual runs?

Yes =D, that's the whole point of the openrl benchmark initiative by @vwxyzjn (best ...
Linking the discussion with @JesseFarebro, as it's relevant to this issue: https://github.com/DLR-RM/stable-baselines3/pull/572#issuecomment-993701078 (also relevant: https://github.com/DLR-RM/stable-baselines3/pull/734)

EDIT: it seems that the Gymnasium documentation is outdated.
Closing in favor of https://github.com/DLR-RM/stable-baselines3/pull/704
Alternative solution from https://github.com/martius-lab/pink-noise-rl:

```python
import gymnasium as gym
from pink import PinkNoiseDist
from stable_baselines3 import SAC

env = gym.make("MountainCarContinuous-v0")  # any continuous-action env
action_dim = env.action_space.shape[-1]
seq_len = env.spec.max_episode_steps

# Initialize agent
model = SAC("MlpPolicy", env)

# Set action noise (check the pink-noise-rl README for the exact argument order)
model.actor.action_dist = PinkNoiseDist(action_dim, seq_len)

# Train agent
model.learn(total_timesteps=10_000)
```
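If I remember correctly, the package is published on PyPI as `pink-noise-rl` but imported as `pink`; see the repository README for the exact installation and usage instructions.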
It seems that @adysonmaia implemented PPO with dict action space support here: https://github.com/adysonmaia/sb3-plus/blob/main/sb3_plus/mimo_ppo/ppo.py#L24
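For context, a minimal sketch of an environment with a `Dict` action space in Gymnasium (the environment itself and the space names are purely illustrative); standard SB3 PPO does not support this kind of action space, which is what the linked implementation adds:

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces


class DictActionEnv(gym.Env):
    """Toy environment whose action is a dictionary (illustrative only)."""

    observation_space = spaces.Box(low=-1.0, high=1.0, shape=(4,), dtype=np.float32)
    action_space = spaces.Dict(
        {
            "discrete": spaces.Discrete(3),
            "continuous": spaces.Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32),
        }
    )

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        return self.observation_space.sample(), {}

    def step(self, action):
        # `action` is a dict with keys "discrete" and "continuous"
        reward = float(np.sum(action["continuous"]))
        return self.observation_space.sample(), reward, False, False, {}
```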
> Any updates about the rainbow implementation?

Contributions are welcome ;) (if you do so, please read the contributing guide from SB3-Contrib, it explains how to test new algorithms). It ...