masterdezign
[Figure: Comparison]

I've got these results on `LunarLanderContinuousNoVel-v2` (rl_zoo3==2.1.0) using RSAC with shared LSTM state ([rsac_s](https://github.com/zhihanyang2022/off-policy-continuous-control/blob/pub/offpcc/algorithms_recurrent/recurrent_sac_sharing.py)) and [RSAC](https://github.com/zhihanyang2022/off-policy-continuous-control/blob/pub/offpcc/algorithms_recurrent/recurrent_sac.py). In both cases, the configuration was the same: ``` # ==================================================================================== # gin...
> Did you also manage to solve the mountain car problem?

I believe so. Let me render the env to verify, since rewards are not the same for MountainCarContinuousNoVel-v0 (continuous...
Loosely speaking, here they are: ``` RSAC RSAC_S ┌─────┐ ┌─────┐ ┌─────┐ │ RNN │ │ RNN │ ┌─┤ RNN │.. └──┬──┘ └──┬──┘ │ └─────┘ . │ │ │ . │...
Update: I just rendered `MountainCarContinuousNoVel-v0` and it is **not** solved yet. I don't quite understand why the total reward differs between the original `MountainCar-v0` env and this one. Therefore,...
Thanks, I'll check those hyperparameters.
Indeed, setting `use_sde=True` seems to help solve the `MountainCarContinuous-v0` environment. I am curious which gSDE ingredient exactly helps. Edit: I also tried nearby hyperparameters and indeed gSDE contribution seems to...
I am currently checking the two strategies for RNN state initialization, proposed in R2D2 paper (store state and burn-in).
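
For reference, here is a minimal sketch of how I understand the two strategies. It is not the code I am testing: `toy_rnn_step` is a hypothetical stand-in for the LSTM, and the state is a plain float instead of an `(h, c)` tensor pair. "Stored state" starts each replayed sequence from the state saved at collection time; "burn-in" then unrolls the network over the first few steps only to refresh that state, and trains on the rest.

```python
from typing import Callable, List, Sequence, Tuple

State = float  # toy scalar state; a real LSTM state would be an (h, c) pair


def toy_rnn_step(state: State, obs: float) -> State:
    """Hypothetical recurrent cell: a moving average of the observations."""
    return 0.5 * state + 0.5 * obs


def init_state_for_training(
    step: Callable[[State, float], State],
    observations: Sequence[float],
    stored_state: State,
    burn_in: int,
) -> Tuple[State, List[float]]:
    """Return (initial state, training slice) for one replayed sequence.

    stored_state: state recorded when the sequence was collected (may be stale).
    burn_in: number of leading steps used only to refresh the state.
             With burn_in=0 this reduces to the pure "stored state" strategy.
    """
    prefix = observations[:burn_in]
    train_part = list(observations[burn_in:])
    state = stored_state
    for obs in prefix:
        # In a real implementation this unroll would run without gradients.
        state = step(state, obs)
    return state, train_part


obs = [1.0, 1.0, 1.0, 1.0]
state, train_part = init_state_for_training(toy_rnn_step, obs, stored_state=0.0, burn_in=2)
print(state, train_part)
```

With `stored_state=0.0` instead of the saved state, the same function gives burn-in from a zero state, which is the other combination I want to compare.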
So far I've got this: a recurrent replay buffer with overlapping chunks that supports the SB3 interface. I also wrote a specification (test) to reduce future surprises. https://gist.github.com/masterdezign/47b3c6172dd1624bb9a7ef23cbc79c8c The limitation is `n_envs =...
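
To illustrate what "overlapping chunks" means here, a rough sketch of the index arithmetic follows. It is not taken from the gist: the function name `overlapping_chunks` and its parameters are assumptions made for this example, and the real buffer also has to handle episode boundaries and the SB3 sampling interface.

```python
from typing import List, Tuple


def overlapping_chunks(episode_len: int, chunk_len: int, overlap: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index pairs that cover one episode.

    Consecutive chunks share `overlap` steps; the last chunk may be shorter
    than `chunk_len` when the episode length is not a multiple of the stride.
    """
    if chunk_len <= 0 or not 0 <= overlap < chunk_len:
        raise ValueError("need chunk_len > 0 and 0 <= overlap < chunk_len")
    stride = chunk_len - overlap
    bounds: List[Tuple[int, int]] = []
    start = 0
    while start < episode_len:
        end = min(start + chunk_len, episode_len)
        bounds.append((start, end))
        if end == episode_len:
            break  # the episode is fully covered
        start += stride
    return bounds


print(overlapping_chunks(episode_len=10, chunk_len=4, overlap=2))
# [(0, 4), (2, 6), (4, 8), (6, 10)]
```

With an overlap of half the chunk length, as in R2D2, every step except those near the episode edges appears in two chunks, so it can be seen both early and late in an unroll.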
Hi! I didn't obtain good results and then had to put the project on hold. I plan to resume work on it tomorrow.