Antonin RAFFIN
> Yet it results in the same code run twice

What do you mean? It sounds like this is expected: the synchronization is done when computing the gradients.
> the state of recurrent units such as LSTM are part of the environment (aka universe) state.

I would rather say that the state of the LSTM, which is in fact...
> but would you still be interested in a PR once im done?

This feature should be a `gym.Wrapper`, so it is independent of the backend.

> This could be added in the...
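To illustrate the wrapper pattern mentioned above: a wrapper intercepts `reset()`/`step()` and can change observations, rewards, or termination without touching the wrapped environment, which is what makes it backend-independent. This is a minimal sketch of the idea; the stand-in classes and the `ScaleRewardWrapper` name are hypothetical (plain Python is used instead of `gym.Env`/`gym.Wrapper` so the snippet is self-contained).

```python
class DummyEnv:
    """Stand-in environment: counts steps, episode ends after 3 steps."""

    def reset(self):
        self.t = 0
        return 0.0  # initial observation

    def step(self, action):
        self.t += 1
        # obs, reward, done, info -- the classic gym step signature
        return float(self.t), 1.0, self.t >= 3, {}


class ScaleRewardWrapper:
    """Hypothetical wrapper: rescales the reward, leaves everything else untouched."""

    def __init__(self, env, scale=0.5):
        self.env = env
        self.scale = scale

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return obs, reward * self.scale, done, info


env = ScaleRewardWrapper(DummyEnv(), scale=0.5)
obs = env.reset()
total, done = 0.0, False
while not done:
    obs, reward, done, info = env.step(0)
    total += reward
print(total)  # 3 steps * reward 1.0 * scale 0.5 = 1.5
```

With a real `gym.Wrapper` subclass, the same logic would apply to any environment regardless of which RL library trains on it.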
> could you please provide an example of how to compute the sample efficiency of an RL algorithm?

It looks like both the `Monitor` wrapper and `EvalCallback` should do the trick...
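As a concrete sketch of what "sample efficiency" means here: the `Monitor` wrapper logs one CSV row per episode (reward `r`, length `l` in timesteps, wall-clock `t`), so you can count how many environment steps were needed before the agent first reached a target reward. The CSV sample and the helper below are illustrative assumptions, not actual stable-baselines code.

```python
import csv
import io

# Hypothetical Monitor-style episode log:
# r = episode reward, l = episode length (timesteps), t = wall-clock time
monitor_csv = """\
r,l,t
10.0,100,1.2
50.0,120,2.5
120.0,110,3.8
200.0,105,5.0
"""


def timesteps_to_threshold(text, reward_threshold):
    """Cumulative timesteps until an episode reward first reaches the threshold."""
    total_steps = 0
    for row in csv.DictReader(io.StringIO(text)):
        total_steps += int(row["l"])
        if float(row["r"]) >= reward_threshold:
            return total_steps
    return None  # threshold never reached


print(timesteps_to_threshold(monitor_csv, 100.0))  # 100 + 120 + 110 = 330
```

A lower number of timesteps to reach the threshold means higher sample efficiency; `EvalCallback` gives the same kind of signal using periodic evaluations on a separate environment.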
Hello,

Thanks for reporting the issue. I tried the following code (note the `learning_starts=0` to avoid a wrong estimation of the FPS):

```python
import time
from stable_baselines import HER, SAC
from...
```
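The snippet above is truncated, but the underlying idea (timing a run to estimate frames per second) can be sketched generically. The `measure_fps` helper below is a hypothetical illustration, not the code from the original comment:

```python
import time


def measure_fps(step_fn, n_steps=1000):
    """Run step_fn n_steps times and return the measured steps per second."""
    start = time.time()
    for _ in range(n_steps):
        step_fn()
    elapsed = max(time.time() - start, 1e-9)  # guard against zero division
    return n_steps / elapsed


# Example: time a trivial step function
fps = measure_fps(lambda: None, n_steps=1000)
print(fps > 0)
```

Setting `learning_starts=0` matters for such a measurement because otherwise the first steps skip gradient updates entirely, inflating the apparent FPS.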
I don't have the time to deal with this issue now, but you could use [line profiler](https://github.com/rkern/line_profiler) to check what is taking so much time.
Thanks @tirafesi, I assume that replacing the list-based replay buffer with a numpy-based one would solve the issue... You have an example of it in the tf2 draft: https://github.com/Stable-Baselines-Team/stable-baselines-tf2/blob/master/stable_baselines/common/buffers.py...
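The gist of a numpy-based replay buffer is to preallocate fixed-size arrays and overwrite them circularly, instead of growing a Python list. This is a sketch in the spirit of the linked tf2 draft; the class and field names are illustrative, not the actual stable-baselines implementation.

```python
import numpy as np


class ReplayBuffer:
    """Sketch of a preallocated, circular numpy replay buffer."""

    def __init__(self, size, obs_dim):
        self.size = size
        self.pos = 0
        self.full = False
        self.obs = np.zeros((size, obs_dim), dtype=np.float32)
        self.actions = np.zeros((size, 1), dtype=np.float32)
        self.rewards = np.zeros((size,), dtype=np.float32)
        self.next_obs = np.zeros((size, obs_dim), dtype=np.float32)
        self.dones = np.zeros((size,), dtype=np.float32)

    def add(self, obs, action, reward, next_obs, done):
        self.obs[self.pos] = obs
        self.actions[self.pos] = action
        self.rewards[self.pos] = reward
        self.next_obs[self.pos] = next_obs
        self.dones[self.pos] = float(done)
        self.pos = (self.pos + 1) % self.size  # circular overwrite
        if self.pos == 0:
            self.full = True

    def sample(self, batch_size):
        upper = self.size if self.full else self.pos
        idx = np.random.randint(0, upper, size=batch_size)
        return (self.obs[idx], self.actions[idx], self.rewards[idx],
                self.next_obs[idx], self.dones[idx])


buffer = ReplayBuffer(size=100, obs_dim=4)
for i in range(10):
    buffer.add(np.ones(4) * i, 0.0, float(i), np.ones(4) * (i + 1), i == 9)
obs_b, act_b, rew_b, next_b, done_b = buffer.sample(5)
print(obs_b.shape)  # (5, 4)
```

Since sampling is just fancy indexing into contiguous arrays, both insertion and batch sampling stay O(1) per element, which is why this fixes the slowdown seen with the list-based buffer.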
@toksis I will delete the comments, as they are not related to stable-baselines or this issue, but to line-profiler.
Apparently, the problem is solved in v3: https://github.com/hill-a/stable-baselines/issues/845#issuecomment-639754121 because of the new replay buffer implementation.
Hello, please fill in the issue template completely.