Antonin RAFFIN
Hello, this is due to a newer version of Stable-Baselines: the seed argument was moved to the model class constructor. As I'm no longer maintaining this repo, I would appreciate...
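A minimal sketch of the API change described above, using a hypothetical stand-in class (`ToyModel` is not the real library class; in actual Stable-Baselines code the `seed` keyword goes to the model constructor, e.g. `PPO(..., seed=...)`):

```python
import random

# Hedged sketch: newer Stable-Baselines versions take `seed` in the
# model constructor instead of in other methods. `ToyModel` is a
# hypothetical stand-in for a model class, not the real library API.
class ToyModel:
    def __init__(self, policy, env, seed=None):
        # seeding happens once, at construction time
        self.rng = random.Random(seed)

    def learn(self, total_timesteps):
        # note: no seed argument here anymore
        return [self.rng.random() for _ in range(total_timesteps)]

# Same seed at construction -> reproducible runs
a = ToyModel("MlpPolicy", "CartPole-v1", seed=42).learn(3)
b = ToyModel("MlpPolicy", "CartPole-v1", seed=42).learn(3)
assert a == b
```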
Hello, recurrent policies are not supported by those algorithms. That said, it also looks like a legitimate bug, thanks.
>I'm trying to run the environment CarRacingGymEnv-v0: If you just want to use RL on this environment, please take a look at the [RL Zoo](https://github.com/DLR-RM/rl-baselines3-zoo). Otherwise, if you want to...
Hello, the gradients are computed but no gradient step is taken: `optimizer.zero_grad()` is called before it happens. However, I agree that having two dataloaders would be cleaner.
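A minimal PyTorch sketch of the behavior described above (not the repo's actual code): `backward()` computes gradients, but `zero_grad()` clears them before `optimizer.step()`, so no parameter update takes place.

```python
import torch

# Hedged sketch of the pattern: gradients are computed by backward(),
# but zero_grad() is called before step(), so the step is a no-op.
param = torch.nn.Parameter(torch.ones(1))
optimizer = torch.optim.SGD([param], lr=0.1)

loss = (param ** 2).sum()
loss.backward()            # gradients are computed here (grad = 2.0)
optimizer.zero_grad()      # ...but cleared before any step is taken
optimizer.step()           # step with empty gradients: no update

print(param.item())  # parameter unchanged: 1.0
```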
Hello, We would appreciate a PR that solves this issue ;) For a possible fix, see: https://github.com/hill-a/stable-baselines/blob/master/stable_baselines/common/misc_util.py#L93
> also took a look the memory usage, at the beginning it uses around 3GB of GPU memory, but it grows to 5GB at the end of first epoch. Is...
> I am currently testing using the C# library TorchSharp for loading and running the trained model inside Godot, which will remove the python dependency (once the model is trained)....
>since most papers used MuJoCo environments instead of pybullet. yes, that's a shame. >Then, do you kindly want to share the expert datasets? or do you want me to run...
Looks good for Hopper, but the performance is a bit low for HalfCheetah and Ant... Probably due to the time limit... What hyperparameters did you use?
Well, the time limit will help (see "Influence of the time feature" in the appendix [here](https://arxiv.org/abs/2005.05719)), but I would also recommend updating the hyperparameters (those work but require 2M...