Antonin RAFFIN
> For some seeds, it seems that we are reaching a kind of divergence situation. I've experienced that with SAC (and derivatives) in the past; there are several solutions: -...
Hello, good to hear =) Could you share the hyperparameters you used? (My guess is that L2 regularization should be enough; gSDE might not be needed, especially as we are not...
@qgallouedec after some quick trials, I think we should change the default optimizer to `AdamW`. For the figure below, I used:

```
python train.py --algo tqc --env PandaPush-v1 -P --seed...
```
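For reference, a minimal sketch of how the optimizer can be swapped in Stable-Baselines3 via `policy_kwargs` (the `weight_decay` value below is just an illustrative placeholder, not a tuned setting):

```python
import torch as th

# Sketch: configure an SB3 policy to use AdamW instead of the default Adam.
# `optimizer_class` / `optimizer_kwargs` are forwarded to the policy's
# optimizer constructor; AdamW adds decoupled weight decay.
policy_kwargs = dict(
    optimizer_class=th.optim.AdamW,
    optimizer_kwargs=dict(weight_decay=1e-5),  # placeholder value, not tuned
)

# These kwargs would then be passed to the model constructor, e.g.:
# model = TQC("MultiInputPolicy", "PandaPush-v1", policy_kwargs=policy_kwargs)
```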
> How can I draw a graph with shaded parts? That's what `all_plots.py` and `plot_from_file` (I recommend using `rliable`; please at least read the README and blog post) should be able...
> Below the mean reward graph during training: Probably related to https://github.com/DLR-RM/stable-baselines3/issues/1063; you should try a stochastic controller at eval time, i.e. `--stochastic` with the enjoy script. > The environment is...
> i want the model trained on previous volumes to be saved and i load it using setparameters as given in the documentation Why are you not using `DDPG.load()` (as...
In your case, the env observation/action space sizes stay the same, no? So you should probably be using `set_env`, which requires no reloading of the agent (unless you want...
> Something like Probably cast to a torch tensor automatically (using `from_numpy()`) and output a warning too?
Probably a duplicate of https://github.com/DLR-RM/stable-baselines3/issues/1280
Hello, I think you need to specify `spaces.MultiDiscrete([1, 2])` instead of `spaces.MultiDiscrete([[1,2]])`