Antonin RAFFIN
tagging myself @araffin for review (later this week probably)
> This will also require using a joint loss `total_loss = actor_loss + qf_loss` so that `.backward()` is called only once

You could detach `action_probs` too and no joint...
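A minimal PyTorch sketch of the detach idea, assuming a discrete-action setup where the actor outputs `action_probs` and the critic outputs one Q-value per action. The network sizes, the zero target, and the loss formulas are hypothetical placeholders, not the code under discussion. Each loss detaches the other network's output, so the two computation graphs stay disjoint and `.backward()` can be called once per loss without a joint loss.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy networks: 3-dim observations, 2 discrete actions.
obs = torch.randn(4, 3)
actor = nn.Linear(3, 2)
critic = nn.Linear(3, 2)

action_probs = torch.softmax(actor(obs), dim=1)
q_values = critic(obs)

# Placeholder regression target, for illustration only.
target = torch.zeros(4)

# Critic loss: action_probs is used as a constant weight, so detach it.
expected_q = (action_probs.detach() * q_values).sum(dim=1)
qf_loss = ((expected_q - target) ** 2).mean()

# Actor loss: q_values is used as a constant, so detach it.
actor_loss = -(action_probs * q_values.detach()).sum(dim=1).mean()

# Two separate backward passes work because the graphs do not overlap.
qf_loss.backward()
critic_only = actor.weight.grad is None  # the actor received no gradient from qf_loss
actor_loss.backward()
```

Without the two `detach()` calls, both losses would share parts of the same graph and the second `.backward()` would need `retain_graph=True` or a joint loss.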
FYI, I pushed and released on PyPI a subset of pybullet envs compatible with gymnasium: https://github.com/araffin/pybullet_envs_gymnasium (`pip install pybullet_envs_gymnasium`). I will crosspost this message as it may interest several people.
Done in https://github.com/DLR-RM/stable-baselines3/pull/1143 (for mypy)
I'm closing this one as we are now only relying on mypy. I decided to not check the test for now because we are doing too many funky things (for...
Closing in favor of https://github.com/DLR-RM/stable-baselines3/pull/1292. For reference, if one wants to have different activation functions for actor vs critic: ```python from typing import Callable, Tuple from gym import spaces from stable_baselines3...
> What about re-proposing the idea of https://github.com/DLR-RM/stable-baselines3/pull/1116, i.e. pass the desired activation functions to the constructor when creating the model?

I think defining a custom policy is now...
> Should I do a PR for this or would you prefer a separate repository for this?

I would prefer a separate repo but we would put a link to...
> Update newest recurrentmaskable PPO on https://github.com/wdlctc/recurrent_maskable based on sb3-contrib version 1.8.0

Thanks for the update. Do you mean it now works with the current SB3 master version?
> @araffin have you had situations where correctly handling the truncation limit has made a difference in performance?

Just run SAC/PPO on pybullet with/without time feature/handling of timeout, you will...
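To illustrate what "handling of timeout" can mean, here is a small pure-Python sketch of one-step TD targets. It is a generic illustration, not SB3's implementation, and the function names and numbers are made up. The naive version treats a time-limit truncation like a true termination and drops the bootstrap term. The timeout-aware version only stops bootstrapping on true termination.

```python
from typing import List, Sequence


def naive_td_targets(
    rewards: Sequence[float],
    next_values: Sequence[float],
    terminated: Sequence[bool],
    truncated: Sequence[bool],
    gamma: float,
) -> List[float]:
    """Treat any episode end (termination or timeout) as terminal."""
    return [
        r + gamma * v * (0.0 if (term or trunc) else 1.0)
        for r, v, term, trunc in zip(rewards, next_values, terminated, truncated)
    ]


def timeout_aware_td_targets(
    rewards: Sequence[float],
    next_values: Sequence[float],
    terminated: Sequence[bool],
    gamma: float,
) -> List[float]:
    """Keep bootstrapping on timeouts: only true termination zeroes the next value."""
    return [
        r + gamma * v * (0.0 if term else 1.0)
        for r, v, term in zip(rewards, next_values, terminated)
    ]


# Made-up transitions: the second one hits the time limit (truncated, not terminated).
rewards = [1.0, 1.0]
next_values = [10.0, 10.0]
terminated = [False, False]
truncated = [False, True]

naive = naive_td_targets(rewards, next_values, terminated, truncated, gamma=0.5)
aware = timeout_aware_td_targets(rewards, next_values, terminated, gamma=0.5)
```

In this toy case the naive target for the truncated step ignores the value of the next state, while the timeout-aware target keeps it.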