Antonin RAFFIN
Thanks for the swift answer =) > we ablated over different utds and found that utd=20 works best. See this [figure](https://github.com/ikostrikov/walk_in_the_park/files/9440172/utd.pdf). Given how fast the implementation is, it would make...
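For context, the `utd` in the quote is the update-to-data ratio: the number of gradient updates performed per environment step. A minimal sketch of what that means in a training loop, assuming a hypothetical `run_training` helper with placeholder callables (this is not code from walk_in_the_park):

```python
from typing import Callable


def run_training(
    n_env_steps: int,
    utd_ratio: int,
    collect_step: Callable[[], None],
    gradient_update: Callable[[], None],
) -> int:
    """Alternate one environment step with `utd_ratio` gradient updates.

    Returns the total number of gradient updates performed.
    """
    n_updates = 0
    for _ in range(n_env_steps):
        collect_step()  # hypothetical: store one transition in the replay buffer
        for _ in range(utd_ratio):
            gradient_update()  # hypothetical: one critic/actor update
            n_updates += 1
    return n_updates


# With utd=20, 5 environment steps trigger 5 * 20 gradient updates.
total = run_training(5, 20, lambda: None, lambda: None)
print(total)  # 100
```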
> In the wild, we were constrained by the battery capacity :) With more training it gets better and better. Alright... still curious to see what it could do in...
As a follow-up, I've got a working version of TQC + DroQ in JAX here (borrowed some code from your implementation ;)): https://github.com/vwxyzjn/cleanrl/pull/272 (also a version of TQC +...
No no, they trained on the real robot only. I managed to reproduce the experiment: https://araffin.github.io/slides/design-real-rl-experiments/#/13/2
Yes, Firefox relies on web extensions too ;) (even though part of the API is specific to each browser) "The technology for extensions in Firefox is, to a large...
> Simplify hyperparameter tuning. We always take the number of gradient steps as a fraction of the training frequency. (See [rl-zoo code](https://github.com/DLR-RM/rl-baselines3-zoo/blob/27e081eb24419ee843ae1c329b0482db823c9fc1/rl_zoo3/hyperparams_opt.py#L421)) The code in the RL Zoo is only...
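The rule described in the quote, taking the number of gradient steps as a fraction of the training frequency, can be sketched as below. The function name `gradient_steps_from_train_freq` and the `subsample_steps` divisor are illustrative assumptions and not necessarily the exact code behind the linked RL Zoo line:

```python
def gradient_steps_from_train_freq(train_freq: int, subsample_steps: int) -> int:
    """Derive gradient steps as a fraction of the training frequency.

    train_freq: number of environment steps collected between training phases.
    subsample_steps: hypothetical divisor (1 means one gradient step per
        collected step, 4 means one gradient step per four collected steps).
    """
    if train_freq < 1 or subsample_steps < 1:
        raise ValueError("train_freq and subsample_steps must be >= 1")
    # Always do at least one gradient step per training phase.
    return max(train_freq // subsample_steps, 1)


for train_freq in (1, 4, 16):
    print(train_freq, gradient_steps_from_train_freq(train_freq, subsample_steps=4))
```

Tying the two values together this way leaves a single free hyperparameter to tune instead of two independent ones.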
> tensors of the shape (batch_size, seq_len, feature_dim) as input to the LSTM ( That's correct. I think you missed: https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/blob/25b43266e08ebe258061ac69688d94144799de75/sb3_contrib/common/recurrent/policies.py#L189-L194 Here we pass a full sequence as input. And...
Thanks a lot for the implementation =) I'll try it later in the week, but how is it in terms of runtime? (SAC vs CrossQ in PyTorch)
I suspect something is wrong with the current implementation (I'm currently investigating whether it is caused by my changes or not). My setting: ```yaml BipedalWalker-v3: n_timesteps: !!float 2e5 policy: 'MlpPolicy' buffer_size: 300000...
> Did you figure out what the issue is? I was at ICRA until last week, so I didn't have time, but if you haven't found it yet I can...