masterdezign comments

Results: 19 comments of masterdezign

[Feature Request] Implement Recurrent SAC

![Comparison](https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/assets/756621/6f03ebd9-c9ea-477f-8910-4005cee15d30) I've got these results on `LunarLanderContinuousNoVel-v2` (rl_zoo3==2.1.0) using RSAC with shared LSTM state ([rsac_s](https://github.com/zhihanyang2022/off-policy-continuous-control/blob/pub/offpcc/algorithms_recurrent/recurrent_sac_sharing.py)) and [RSAC](https://github.com/zhihanyang2022/off-policy-continuous-control/blob/pub/offpcc/algorithms_recurrent/recurrent_sac.py). In both cases, the configuration was the same: ``` # ==================================================================================== # gin...

[Feature Request] Implement Recurrent SAC

> Did you also manage to solve the mountain car problem?

I believe so. Let me render the env to verify, since the rewards are not the same for MountainCarContinuousNoVel-v0 (continuous...

[Feature Request] Implement Recurrent SAC

Loosely speaking, here they are: ``` RSAC RSAC_S ┌─────┐ ┌─────┐ ┌─────┐ │ RNN │ │ RNN │ ┌─┤ RNN │.. └──┬──┘ └──┬──┘ │ └─────┘ . │ │ │ . │...

[Feature Request] Implement Recurrent SAC

Update: I just rendered `MountainCarContinuousNoVel-v0`, and it is **not** solved yet. I don't quite understand why the total reward differs between the original `MountainCar-v0` env and this one. Therefore,...

[Feature Request] Implement Recurrent SAC

Thanks, I'll check those hyperparameters.

[Feature Request] Implement Recurrent SAC

Indeed, setting `use_sde=True` seems to help solve the `MountainCarContinuous-v0` environment. I am curious which gSDE ingredient exactly helps. Edit: I also tried nearby hyperparameters, and indeed the gSDE contribution seems to...

[Feature Request] Implement Recurrent SAC

I am currently checking the two strategies for RNN state initialization proposed in the R2D2 paper (stored state and burn-in).
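To make the two strategies concrete, here is a minimal sketch, not the code under test. It assumes a toy scalar recurrent cell (`rnn_step`) and a hypothetical helper `unroll_with_burn_in`; both names are made up for illustration. "Stored state" means starting the unroll from the recurrent state that was saved in the replay buffer at the beginning of the chunk. "Burn-in" means running the network over the first few steps of the chunk only to refresh that state, and keeping outputs for the loss only from the remaining steps. In a PyTorch implementation the burn-in part would typically run without gradients.

```python
import math
from typing import List


def rnn_step(h: float, x: float) -> float:
    """Toy scalar recurrent cell (a stand-in for an LSTM/GRU step)."""
    return math.tanh(0.5 * h + x)


def unroll_with_burn_in(stored_h: float, obs: List[float], burn_in: int) -> List[float]:
    """Unroll the cell over one replayed chunk.

    stored_h: recurrent state saved alongside the chunk ("stored state");
              pass 0.0 to emulate zero-state initialization instead.
    burn_in:  number of leading steps used only to warm up the state.
    Returns the hidden states for the steps after the burn-in prefix,
    i.e. the only steps that would contribute to the training loss.
    """
    if not 0 <= burn_in <= len(obs):
        raise ValueError("burn_in must be between 0 and len(obs)")
    h = stored_h
    # Burn-in phase: update the state, discard the outputs.
    for x in obs[:burn_in]:
        h = rnn_step(h, x)
    # Training phase: keep the outputs.
    outputs = []
    for x in obs[burn_in:]:
        h = rnn_step(h, x)
        outputs.append(h)
    return outputs


if __name__ == "__main__":
    chunk = [0.1, -0.2, 0.3, 0.0, 0.5, -0.1]
    print(unroll_with_burn_in(stored_h=0.0, obs=chunk, burn_in=2))
```

The two strategies are not exclusive: a stale stored state can be used as the starting point and then refreshed by the burn-in prefix.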

[Feature Request] Implement Recurrent SAC

So far I've got this: a recurrent replay buffer with overlapping chunks that supports the SB3 interface. I also wrote a specification (test) to reduce future surprises. https://gist.github.com/masterdezign/47b3c6172dd1624bb9a7ef23cbc79c8c The limitation is `n_envs =...
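As an illustration of what "overlapping chunks" can mean, here is a minimal sketch of the index arithmetic. It is not taken from the gist above: the function name `overlapping_chunks` and its parameters are assumptions made for this example. It splits a single episode into fixed-length windows whose starts are `chunk_len - overlap` steps apart, never crossing the episode boundary, so the last window may be shorter.

```python
from typing import List, Tuple


def overlapping_chunks(episode_len: int, chunk_len: int, overlap: int) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs (end exclusive) covering one episode.

    Consecutive chunks share `overlap` time steps. The final chunk is
    truncated at the episode end instead of spilling into the next episode.
    """
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    if not 0 <= overlap < chunk_len:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_len")
    stride = chunk_len - overlap
    chunks = []
    start = 0
    while start < episode_len:
        end = min(start + chunk_len, episode_len)
        chunks.append((start, end))
        if end == episode_len:
            break  # the episode is fully covered
        start += stride
    return chunks


if __name__ == "__main__":
    # A 10-step episode, chunks of 4 steps, adjacent chunks sharing 2 steps.
    print(overlapping_chunks(episode_len=10, chunk_len=4, overlap=2))
```

A buffer built on top of this would then sample whole chunks rather than single transitions, so the recurrent network always sees a contiguous slice of one episode.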

[Feature Request] Implement Recurrent SAC

Hi! I didn't obtain good results, and then I had to put the project on hold. I plan to resume work on it tomorrow.