Antonin RAFFIN comments

Results: 880 comments of Antonin RAFFIN

PPORecurrent mini batch size inconsistent

> Yes, I tried on gym BipedalWalker v3, which went from 752 fps to 1992 fps, a 2.6 times speedup: > If you're interested I can run more tests with...

PPORecurrent mini batch size inconsistent

Thanks for the additional results =). Could you open a draft PR? (That would make it easier for me to review the code.)

PPORecurrent mini batch size inconsistent

Thanks for the PR, I think I finally understand why it works. This is indeed a nice way to accelerate PPO LSTM. I have some remarks though: - you are...

PPORecurrent mini batch size inconsistent

> However, it will drop the final batch in an epoch if n_batches mod batch_size != 0 to prevent very small batches from causing updates. I'm not sure to understand...

PPORecurrent mini batch size inconsistent

> For example if it tries to sample batches of 8 sequences out of a total of 81 sequences, it will result in 10 full batches of 8 sequences, and...
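The arithmetic in the quoted example (81 sequences, batches of 8) can be checked with a small sketch. This is only an illustration of drop-last batching in general, not the code from the PR under discussion; the function name `sample_sequence_batches` and its parameters are hypothetical.

```python
import random


def sample_sequence_batches(n_sequences, batch_size, drop_last=True, seed=0):
    """Shuffle sequence indices and split them into mini-batches.

    Hypothetical helper for illustration only (not the PR's implementation).
    With drop_last=True, a trailing batch smaller than batch_size is discarded.
    """
    indices = list(range(n_sequences))
    random.Random(seed).shuffle(indices)
    # Slice the shuffled indices into consecutive chunks of batch_size.
    batches = [
        indices[start:start + batch_size]
        for start in range(0, n_sequences, batch_size)
    ]
    if drop_last and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return batches


batches = sample_sequence_batches(n_sequences=81, batch_size=8)
leftover = 81 % 8  # sequences that do not fit into a full batch
print(len(batches), leftover)
```

With these numbers, 81 = 10 × 8 + 1, so there are 10 full batches and a single leftover sequence that would form the short final batch if it were kept.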

PPORecurrent mini batch size inconsistent

> No further work was done here? I'm curious to see here progress on the main sb3 contrib integration. Please have a look at https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/pull/118 Help and further testing is...

[question] How do I load a tensorflow ckpt?

We also highly recommend switching to Stable-Baselines3 (PyTorch).

[Question] Callback collected model does not have same reward as training verbose [custom gym environment]

Hello, this is the SB2 repo, not SB3. Anyway, please give a minimal code example to reproduce the issue; you can also search for similar issues on the repo. During...

[Question] Things to do to reproduce the evaluation results in hyperparameters optimization

> is quite different from the performance (mean reward: -1.7e6) reported in the phase of hyperparameter optimization

Probably a duplicate of https://github.com/DLR-RM/rl-baselines3-zoo/issues/314#issuecomment-1316907265 and https://github.com/DLR-RM/rl-baselines3-zoo/issues/204

[Bug]: PPO doesn't work correctly with MultiDiscrete action spaces with "start" parameter

Probably similar to https://github.com/DLR-RM/stable-baselines3/issues/1295; we need to update the env checker. Edit: the correct issue is https://github.com/DLR-RM/stable-baselines3/issues/913#issuecomment-1129537155