Antonin RAFFIN comments

Results: 880 comments of Antonin RAFFIN

SAC jax

> Interestingly, the paper seems to say our implementation should have been the following (with the summation) I'm not sure I follow the difference... You can take a look at how...

@vwxyzjn Run the code with `JAX_ENABLE_X64=True` and it will solve your issue ;) (The results are still slightly different, but that's probably expected; try with different random seeds.) `JAX_DISABLE_JIT=1` already partially...
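As a rough illustration of the suggestion above, here is a minimal sketch of enabling 64-bit precision from inside a script rather than on the command line. It assumes the flag is set before JAX is first imported; the array `x` is purely illustrative and not taken from the code under discussion. The import is guarded so the snippet still runs where JAX is not installed.

```python
import os

# Assumption: the flag must be set before JAX is imported,
# because JAX reads it from the environment at import time.
os.environ["JAX_ENABLE_X64"] = "True"

try:
    import jax.numpy as jnp
except ImportError:
    # JAX is not installed; the rest of the sketch is skipped.
    jnp = None

if jnp is not None:
    # Illustrative array: with x64 enabled, the default float dtype
    # should be 64-bit instead of 32-bit.
    x = jnp.ones(3)
    print(x.dtype)
```

Calling `jax.config.update("jax_enable_x64", True)` at startup should be an equivalent way to set this from Python.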

SAC jax

> Run the code with `JAX_ENABLE_X64=True` and it will solve your issue ;) (results are still slightly different, but that's probably expected, try with different random seeds) @vwxyzjn as a...

SAC jax

@vwxyzjn I would need your help again to update the lockfile: I tried to do it locally and Poetry destroyed my conda env...

Cannot import pybullet_envs

You can patch gym as done in the RL Zoo (PR: https://github.com/DLR-RM/rl-baselines3-zoo/pull/256, file: https://github.com/DLR-RM/rl-baselines3-zoo/blob/feat/gym-0.24/rl_zoo3/gym_patches.py) and then use `apply_api_compatibility=True` when creating the env. (Soon you will be able to apply the...

Cannot import pybullet_envs

> This requires a fix in Bullet? Is there a run-time version check of gym, so we can implement both old and new method? This PR should already fix it:...

Cannot import pybullet_envs

FYI, I pushed and released on PyPI a subset of pybullet envs compatible with gymnasium: https://github.com/araffin/pybullet_envs_gymnasium `pip install pybullet_envs_gymnasium` (I will crosspost this message as it may interest several people)

Convert code to support Gymnasium instead of Gym

Create optimizer in `OnPolicyAlgorithm` only after the device is set

https://github.com/DLR-RM/stable-baselines3/issues/1770#issuecomment-1849632108

[Feature Request] independently configurable learning rates for actor and critic

Hello, > independently configurable learning rates for actor and critic in AC-style algorithms So you are proposing that for DDPG, TD3 and SAC? (It does not apply to PPO/A2C as the...
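To make the request concrete, here is a minimal, library-free sketch of what "independent learning rates for actor and critic" could mean: each network gets its own optimizer with its own step size. The `SGD` class, the parameter lists, and the learning-rate values are all hypothetical and chosen for illustration; this is not the Stable-Baselines3 API.

```python
from dataclasses import dataclass


@dataclass
class SGD:
    """Toy gradient-descent optimizer over a list of floats (illustrative only)."""

    params: list
    lr: float

    def step(self, grads):
        # Update each parameter in place using this optimizer's own learning rate.
        for i, grad in enumerate(grads):
            self.params[i] -= self.lr * grad


# Hypothetical parameters standing in for the actor and critic networks.
actor_params = [1.0, 1.0]
critic_params = [1.0, 1.0]

# One optimizer per network, each with its own (hypothetical) learning rate.
actor_opt = SGD(actor_params, lr=1e-4)
critic_opt = SGD(critic_params, lr=1e-3)

# Apply the same gradients to both to show the different step sizes.
grads = [1.0, -1.0]
actor_opt.step(grads)
critic_opt.step(grads)

print(actor_params, critic_params)
```

With identical gradients, the critic parameters move ten times further than the actor parameters, which is the behaviour the feature request asks to make configurable.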