Antonin RAFFIN comments

Results: 880 comments by Antonin RAFFIN

[Feature Request] Multiple environments per process

>With 4 processes and 3 environments in each, I saw an 18% increase in FPS over 4 processes with 1 environment each.

Well, you should compare at least with the...

[Feature Request] Multiple environments per process

You can take a look at https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/blob/master/sb3_contrib/common/vec_env/async_eval.py (there it is asynchronous evaluation for the ARS algorithm, but the main idea is the same; in the end, it allows to run...
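As a rough illustration of the "several environments per process" layout discussed above, here is a minimal sketch that only computes which environment indices each worker process would own. The function name `assign_envs` is hypothetical and is not part of Stable-Baselines or SB3-Contrib; each worker would then step its own group of environments one after another.

```python
from typing import List


def assign_envs(n_envs: int, n_procs: int) -> List[List[int]]:
    """Spread environment indices as evenly as possible over worker processes.

    Hypothetical helper for illustration only (not a library function).
    """
    if n_procs <= 0:
        raise ValueError("n_procs must be positive")
    base, extra = divmod(n_envs, n_procs)
    groups: List[List[int]] = []
    start = 0
    for rank in range(n_procs):
        # The first `extra` workers take one additional environment.
        size = base + (1 if rank < extra else 0)
        groups.append(list(range(start, start + size)))
        start += size
    return groups


# 4 processes with 3 environments each, as in the quoted benchmark setup.
groups = assign_envs(n_envs=12, n_procs=4)
uneven = assign_envs(n_envs=5, n_procs=2)
```

With 12 environments and 4 processes, every worker owns 3 consecutive indices; when the split is uneven, the lower-ranked workers take the extra environments.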

Episode rewards not updated before being used by callback.on_step()

Hello, if you want a robust way to retrieve the episode reward, you should use a `Monitor` wrapper together with a callback. This is what we do in [Stable-Baselines3](https://github.com/DLR-RM/stable-baselines3). In...
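The wrapper-plus-callback pattern recommended above can be sketched without any library, assuming a toy environment. All class names below (`ToyEnv`, `EpisodeMonitor`, `EpisodeLogger`) are hypothetical stand-ins, not the Stable-Baselines3 classes; the `info["episode"]` dictionary with keys `"r"` (return) and `"l"` (length) is modeled on the convention used by the `Monitor` wrapper.

```python
class ToyEnv:
    """Hypothetical environment: fixed horizon, reward 1.0 per step."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return 0

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        return self.t, 1.0, done, {}


class EpisodeMonitor:
    """Toy wrapper: accumulates rewards and reports them once the episode ends."""

    def __init__(self, env):
        self.env = env
        self.ep_return = 0.0
        self.ep_length = 0

    def reset(self):
        self.ep_return, self.ep_length = 0.0, 0
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.ep_return += reward
        self.ep_length += 1
        if done:
            # Episode statistics are attached only on the final step.
            info = dict(info, episode={"r": self.ep_return, "l": self.ep_length})
        return obs, reward, done, info


class EpisodeLogger:
    """Toy callback: reads finished-episode stats from `info` instead of
    tracking rewards itself, so it never sees a stale value."""

    def __init__(self):
        self.returns = []

    def on_step(self, info):
        if "episode" in info:
            self.returns.append(info["episode"]["r"])


env = EpisodeMonitor(ToyEnv(horizon=5))
logger = EpisodeLogger()
for _ in range(2):
    env.reset()
    done = False
    while not done:
        _, _, done, info = env.step(action=0)
        logger.on_step(info)
```

Because the wrapper owns the bookkeeping, the callback only consumes a value that is already final when it appears, which avoids ordering problems between the reward update and `on_step`.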

Problem with running GAIL on HalfCheetah-v2

Hello, Please fill in the issue template completely (and format the code block / error stack using markdown, there is an example in the template). EDIT: maybe related to https://github.com/hill-a/stable-baselines/issues/603

Problem with running GAIL on HalfCheetah-v2

Hello, next time please use markdown and not a zip file. So minimal code to reproduce the error (I got a different one): ```python import pybullet_envs from stable_baselines.gail import ExpertDataset,...

PPO2 implementation details?

I think Costa's blog is currently the best place to find all the implementation details that are in PPO: https://costa.sh/blog-the-32-implementation-details-of-ppo.html But the best is also to look at the SB3 code now ;)

[question] Suggested Hyperparams for A2C with highway-env

Hello, why would you use A2C on such an environment instead of HER+SAC, which is much more appropriate? (It can reach 90% success in 20000 timesteps, see the zoo: https://github.com/DLR-RM/rl-baselines3-zoo/blob/master/hyperparams/her.yml)

[question] unstable actions in PPO

Regarding the squashing, the best is to read the SAC paper. Otherwise, you can take a look at https://github.com/DLR-RM/stable-baselines3/blob/a1e055695c3638f9f15de0cb805b8fcbb5c02764/stable_baselines3/common/distributions.py#L195 or https://github.com/hill-a/stable-baselines/blob/master/stable_baselines/sac/policies.py#L44 to see how to properly replace the Gaussian distribution by squashed...
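For readers who want the gist of the squashing before opening those files, here is a minimal one-dimensional sketch of a tanh-squashed Gaussian as described in the SAC paper: sample from a Gaussian, squash with `tanh`, and correct the log-probability with the change-of-variables term. The function name and the `eps` value are assumptions made for illustration; this is not the code from either linked file.

```python
import math
import random


def squashed_gaussian_sample(mean, log_std, rng, eps=1e-6):
    """Sample a tanh-squashed Gaussian action and its log-probability (1-D sketch)."""
    std = math.exp(log_std)
    u = rng.gauss(mean, std)   # unbounded Gaussian sample
    action = math.tanh(u)      # squashed into [-1, 1]
    # Log-density of u under N(mean, std^2).
    log_prob_u = (
        -0.5 * ((u - mean) / std) ** 2
        - log_std
        - 0.5 * math.log(2.0 * math.pi)
    )
    # Change of variables: subtract log|d tanh(u)/du| = log(1 - tanh(u)^2).
    # eps guards against log(0) when the action saturates.
    log_prob = log_prob_u - math.log(1.0 - action ** 2 + eps)
    return action, log_prob


rng = random.Random(0)
action, log_prob = squashed_gaussian_sample(mean=0.0, log_std=0.0, rng=rng)
```

The key point is that the action stays bounded by construction, so no clipping is needed, while the correction term keeps the log-probability consistent with the squashed distribution.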

[question] EvalCallback using MPI

Hello,

>When running a training loop using MPI, the EvalCallback doesn't seem to make use of the parallelisation:

Yes, the `EvalCallback` does not support MPI parallelization. I would recommend you...

[question] EvalCallback using MPI

>Does VecEnv parallelise the gradient computation, or just the env part?

Just the env part.

>I've got PPO + MPI working really well on a multicore machine with a custom...