Antonin RAFFIN

Results 880 comments of Antonin RAFFIN

>With 4 processes and 3 environments in each, I saw an 18% increase in FPS over 4 processes with 1 environment each. Well, you should compare at least with the...

You can take a look at https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/blob/master/sb3_contrib/common/vec_env/async_eval.py (here it is async evaluation for the ARS algorithm, but the main idea is the same: in the end, it allows to run...

Hello, If you want a robust way to retrieve the episode reward, you should use a `Monitor` wrapper together with a callback. This is what we do in [Stable-Baselines3](https://github.com/DLR-RM/stable-baselines3). In...
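
The wrapper-plus-callback pattern mentioned above can be sketched without any dependency. This is only an illustration of the idea, not the Stable-Baselines3 implementation: `ToyEnv`, `ToyMonitor`, `EpisodeRewardLogger` and `run` are hypothetical names, and the `info["episode"]` layout with `"r"` (return) and `"l"` (length) keys is assumed here to mirror what the real `Monitor` wrapper reports.

```python
class ToyEnv:
    """Hypothetical fixed-length environment: reward 1.0 at every step."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0  # dummy observation

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_length
        return 0.0, 1.0, done, {}


class ToyMonitor:
    """Wrapper that accumulates rewards and reports them when an episode ends."""

    def __init__(self, env):
        self.env = env
        self.rewards = []

    def reset(self):
        self.rewards = []
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.rewards.append(reward)
        if done:
            # Assumed layout: episode return under "r", episode length under "l"
            info = dict(info, episode={"r": sum(self.rewards), "l": len(self.rewards)})
        return obs, reward, done, info


class EpisodeRewardLogger:
    """Callback-style object: called after every step, keeps finished episode returns."""

    def __init__(self):
        self.episode_returns = []

    def on_step(self, info):
        if "episode" in info:
            self.episode_returns.append(info["episode"]["r"])


def run(n_steps=12):
    """Step the monitored env for n_steps and return the logged episode returns."""
    env = ToyMonitor(ToyEnv(episode_length=5))
    callback = EpisodeRewardLogger()
    env.reset()
    for _ in range(n_steps):
        _, _, done, info = env.step(action=0)
        callback.on_step(info)
        if done:
            env.reset()
    return callback.episode_returns
```

The callback never touches the reward signal directly. It only reads what the wrapper reports, so the logged value stays the true episode return even if other wrappers rescale or clip rewards further up the stack.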

Hello, Please fill in the issue template completely (and format the code block / error stack using markdown, there is an example in the template). EDIT: maybe related to https://github.com/hill-a/stable-baselines/issues/603

Hello, next time please use markdown and not a zip file. So minimal code to reproduce the error (I got a different one): ```python import pybullet_envs from stable_baselines.gail import ExpertDataset,...

I think Costa's blog is currently the best resource for all the implementation details that are in PPO: https://costa.sh/blog-the-32-implementation-details-of-ppo.html But the best is also to look at the SB3 code now ;)

Hello, Why would you use A2C on such an environment instead of HER+SAC, which is much more appropriate? (and can reach 90% success in 20000 timesteps, see zoo https://github.com/DLR-RM/rl-baselines3-zoo/blob/master/hyperparams/her.yml)

Regarding the squashing, it is best to read the SAC paper. Otherwise, you can take a look at https://github.com/DLR-RM/stable-baselines3/blob/a1e055695c3638f9f15de0cb805b8fcbb5c02764/stable_baselines3/common/distributions.py#L195 or https://github.com/hill-a/stable-baselines/blob/master/stable_baselines/sac/policies.py#L44 to see how to properly replace the Gaussian distribution by squashed...
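
As a rough illustration of the squashing mentioned above, here is a minimal one-dimensional sketch of a tanh-squashed Gaussian, using the change-of-variables correction described in the SAC paper: log p(a) = log N(u; mean, std) - log(1 - tanh(u)^2), with a = tanh(u). It is not the code from either linked file; the function names and the `eps` value are assumptions made for this example.

```python
import math
import random


def gaussian_log_prob(u, mean, std):
    """Log-density of a 1D Gaussian N(mean, std^2) evaluated at u."""
    return -0.5 * ((u - mean) / std) ** 2 - math.log(std) - 0.5 * math.log(2 * math.pi)


def squashed_log_prob(u, mean, std, eps=1e-6):
    """Log-density of the action a = tanh(u), where u ~ N(mean, std^2).

    The Gaussian log-density is corrected by the log-derivative of tanh;
    eps (an assumed small constant) keeps the log finite when |a| is close to 1.
    """
    a = math.tanh(u)
    return gaussian_log_prob(u, mean, std) - math.log(1.0 - a * a + eps)


def sample_squashed(mean, std, rng):
    """Draw u from the Gaussian, squash it, and return (action, log_prob)."""
    u = rng.gauss(mean, std)
    return math.tanh(u), squashed_log_prob(u, mean, std)
```

Note that the log-probability is computed from the pre-squash sample `u` rather than by inverting `tanh` on the action, which avoids numerical trouble when the action is very close to the bounds.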

Hello, >When running a training loop using MPI, the EvalCallback doesn't seem to make use of the parallelisation: yes, the `EvalCallback` does not support MPI parallelization. I would recommend you...

>Does VecEnv parallelise the gradient computation, or just the env part? Just the env part. >I've got PPO + MPI working really well on a multicore machine with a custom...