Antonin RAFFIN comments

Results: 880 comments of Antonin RAFFIN

Recalculate Returns and Advantages After Callback to Ensure Reward Consistency (common/on_policy_algorithm.py)

> I have raised an issue to propose this change ([required](https://github.com/DLR-RM/stable-baselines3/blob/master/CONTRIBUTING.md) for new features and bug fixes) this step is important to discuss the issue and see if a fix/feature...

[Bug]: Episode start flag is never set for off policy algorithms

Hello, that's correct because currently only `RecurrentPPO` makes use of `states` (LSTM states) and episode starts (to reset the states).
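
To illustrate what "episode starts (to reset the states)" means, here is a minimal sketch in plain Python. It is not the sb3-contrib implementation: the function name `reset_states_on_episode_start` and the list-based state layout are assumptions made for illustration only. The idea is that the recurrent state of every environment that just began a new episode is zeroed before the next forward pass.

```python
def reset_states_on_episode_start(hidden_states, episode_starts):
    """Zero the recurrent state of each environment that starts a new episode.

    hidden_states: one list of floats per environment (hypothetical layout).
    episode_starts: one boolean per environment, True if a new episode begins.
    """
    return [
        [0.0] * len(state) if started else list(state)
        for state, started in zip(hidden_states, episode_starts)
    ]


# Two environments: the first one just reset, the second is mid-episode.
states = [[0.5, -0.2], [0.1, 0.9]]
episode_starts = [True, False]
new_states = reset_states_on_episode_start(states, episode_starts)
```

An algorithm that keeps no recurrent state has nothing to reset, which is why the flag only matters for recurrent policies.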

[Feature Request] Group Relative Policy Optimization (GRPO)

To continue the discussion from the SB3 issue, I had a closer look at GRPO. The algorithm was actually described in https://arxiv.org/abs/2402.03300 and seems to be specific to LLM training (which...
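
For context, the core of GRPO in the linked paper is a group-relative advantage: several outputs are sampled for the same prompt, and each reward is normalized by the mean and standard deviation of its group. The following is a minimal sketch of that normalization only, not a full algorithm. The function name, the `eps` term, and the use of the population standard deviation are assumptions made for this illustration.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize the rewards of one group of samples drawn for the same prompt.

    eps is an assumed safeguard against a zero standard deviation
    (all rewards in the group being equal).
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std: an assumption
    return [(r - mean) / (std + eps) for r in rewards]


# Four sampled answers to one prompt, scored 1.0 (correct) or 0.0 (wrong).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Samples that score above the group average get a positive advantage and the others a negative one, so no learned value function is needed as a baseline.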

[Feature Request] Support for multi input policies in CrossQ

Hello, thanks for the proposal. What is currently missing? Where would you need some help?

[Feature Request] Dict Obs Spaces Support

Hello, SBX has limited support for it (when all the items are boxes).

> do you have this feature in your roadmap? If yes when is it expected to...

[Bug]: Video upload to wandb broken since 2.4.0

Hello, could you provide the error message too?

[Bug]: Video upload to wandb broken since 2.4.0

Do you see any difference in the filenames/logs compared to SB3?

[Bug]: Video upload to wandb broken since 2.4.0

Might be related to https://github.com/DLR-RM/stable-baselines3/issues/2061. Help is welcome to solve the issue =)

[Bug]: Video upload to wandb broken since 2.4.0

@OliverUrbann could you try with https://github.com/DLR-RM/stable-baselines3/pull/2063? It might solve your issue.

[Bug]: Video upload to wandb broken since 2.4.0

Thanks for trying =) I've dug more into the issue and I think I found the root cause. The problem comes from the W&B client: https://github.com/wandb/wandb/blob/8dd25cab52da3603022e75322c847de4def21b1c/wandb/integration/gym/__init__.py#L68 With Gymnasium v1.0, the previous...