Antonin RAFFIN
Hello, this is due to a newer version of Stable-Baselines: the seed argument was moved to the model class constructor. As I'm no longer maintaining this repo, I would appreciate...
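A minimal sketch of the API change described above, using a hypothetical stand-in class (`ToyModel` is not the real library class; in actual Stable-Baselines code the `seed` keyword goes to the model constructor, e.g. `PPO(..., seed=...)`):

```python
import random

# Hedged sketch: newer Stable-Baselines versions take `seed` in the
# model constructor instead of in other methods. `ToyModel` is a
# hypothetical stand-in for a model class, not the real library API.
class ToyModel:
    def __init__(self, policy, env, seed=None):
        # seeding happens once, at construction time
        self.rng = random.Random(seed)

    def learn(self, total_timesteps):
        # note: no seed argument here anymore
        return [self.rng.random() for _ in range(total_timesteps)]

# Same seed at construction -> reproducible runs
a = ToyModel("MlpPolicy", "CartPole-v1", seed=42).learn(3)
b = ToyModel("MlpPolicy", "CartPole-v1", seed=42).learn(3)
assert a == b
```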
Hello, recurrent policies are not supported by those algorithms. That said, it also looks like a legitimate bug, thanks.
>I'm trying to run the environment CarRacingGymEnv-v0: If you just want to use RL on this environment, please take a look at the [RL Zoo](https://github.com/DLR-RM/rl-baselines3-zoo). Otherwise, if you want to...
Hello, the gradients are computed but no gradient step is taken: `optimizer.zero_grad()` is called before it happens. However, I agree that having two dataloaders would be cleaner.
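A minimal PyTorch sketch of the behavior described above (not the repo's actual code): `backward()` computes gradients, but `zero_grad()` clears them before `optimizer.step()`, so no parameter update takes place.

```python
import torch

# Hedged sketch of the pattern: gradients are computed by backward(),
# but zero_grad() is called before step(), so the step is a no-op.
param = torch.nn.Parameter(torch.ones(1))
optimizer = torch.optim.SGD([param], lr=0.1)

loss = (param ** 2).sum()
loss.backward()            # gradients are computed here (grad = 2.0)
optimizer.zero_grad()      # ...but cleared before any step is taken
optimizer.step()           # step with empty gradients: no update

print(param.item())  # parameter unchanged: 1.0
```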
Hello, We would appreciate a PR that solves this issue ;) For a possible fix, see: https://github.com/hill-a/stable-baselines/blob/master/stable_baselines/common/misc_util.py#L93
> also took a look the memory usage, at the beginning it uses around 3GB of GPU memory, but it grows to 5GB at the end of first epoch. Is...
> I am currently testing using the C# library TorchSharp for loading and running the trained model inside Godot, which will remove the python dependency (once the model is trained)....
>since most papers used MuJoCo environments instead of pybullet. yes, that's a shame. >Then, do you kindly want to share the expert datasets? or do you want me to run...
Looks good for Hopper, but the performance is a bit low for HalfCheetah and Ant... Probably due to the time limit... What hyperparameters did you use?
Well, the time limit will help (see "Influence of the time feature" in the appendix [here](https://arxiv.org/abs/2005.05719)), but I would also recommend updating the hyperparameters (those work but require 2M...