Antonin RAFFIN
> For some seeds, it seems that we are reaching a kind of divergence situation. I've experienced that with SAC (and derivatives) in the past; there are several solutions: -...
Hello, good to hear =) Could you share the hyperparameters you used? (My guess is that L2 regularization should be enough; gSDE might not be needed, especially as we are not...
@qgallouedec after some quick trials, I think we should change the default optimizer to `AdamW`. For the figure below, I used:

```
python train.py --algo tqc --env PandaPush-v1 -P --seed...
```
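For reference, a minimal sketch of how the optimizer can be swapped in Stable-Baselines3 via `policy_kwargs` (the `weight_decay` value below is just an illustrative placeholder, not a tuned setting):

```python
import torch as th

# Sketch: configure an SB3 policy to use AdamW instead of the default Adam.
# `optimizer_class` / `optimizer_kwargs` are forwarded to the policy's
# optimizer constructor; AdamW adds decoupled weight decay.
policy_kwargs = dict(
    optimizer_class=th.optim.AdamW,
    optimizer_kwargs=dict(weight_decay=1e-5),  # placeholder value, not tuned
)

# These kwargs would then be passed to the model constructor, e.g.:
# model = TQC("MultiInputPolicy", "PandaPush-v1", policy_kwargs=policy_kwargs)
```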
> How can I draw a graph with shaded parts? That's what `all_plots.py` and `plot_from_file` (I recommend using `rliable`; please at least read the README and blog post) should be able...
> Below the mean reward graph during training: Probably related to https://github.com/DLR-RM/stable-baselines3/issues/1063; you should try a stochastic controller at eval time, i.e. `--stochastic` with the enjoy script. > The environment is...
> i want the model trained on previous volumes to be saved and i load it using setparameters as given in the documentation Why are you not using `DDPG.load()` (as...
In your case, the env observation/action space sizes stay the same, no? So you should probably be using `set_env`, which requires no reloading of the agent (unless you want...
> Something like Probably cast to a torch tensor automatically (using `from_numpy()`) and output a warning too?
Probably a duplicate of https://github.com/DLR-RM/stable-baselines3/issues/1280
Hello, I think you need to specify `spaces.MultiDiscrete([1, 2])` instead of `spaces.MultiDiscrete([[1,2]])`