Antonin RAFFIN
> But if I run your code with terminal_on_life_loss=True, one can see that env.reset() actually performs env.step(). This behaviour looks odd. I agree, and at the same time, the...
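For context, the wrapper enabled by `terminal_on_life_loss` behaves roughly like the simplified sketch below (modeled on an EpisodicLifeEnv-style Atari wrapper, using the old gym 4-tuple API): only a true game over triggers a real reset; otherwise `reset()` advances the env with a no-op step.

```python
import gym


class EpisodicLifeWrapper(gym.Wrapper):
    """End the episode when a life is lost, but only truly reset on game over.

    Requires an ALE-based Atari env (uses `env.unwrapped.ale.lives()`).
    """

    def __init__(self, env):
        super().__init__(env)
        self.lives = 0
        self.was_real_done = True

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.was_real_done = done
        lives = self.env.unwrapped.ale.lives()
        if 0 < lives < self.lives:
            # A life was lost: report the episode as done to the agent,
            # even though the underlying game continues.
            done = True
        self.lives = lives
        return obs, reward, done, info

    def reset(self, **kwargs):
        if self.was_real_done:
            obs = self.env.reset(**kwargs)
        else:
            # Not a real game over: reset() advances the env with a no-op
            # step instead of truly resetting it, which is the behaviour
            # discussed above.
            obs, _, _, _ = self.env.step(0)
        self.lives = self.env.unwrapped.ale.lives()
        return obs
```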
Hello, to reproduce your issue, you need to add `import ipdb; ipdb.set_trace()` before the call to `super()` and then enter `self.this_attr_does_not_exist`, for instance, at the command line. If you put...
I just tested with PyCharm with the provided code and could not reproduce the issue... (setting a breakpoint at `self.key = key`)

```python
import numpy as np
from stable_baselines3.common.vec_env.base_vec_env import ...
```
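For reference, a self-contained reconstruction of the snippet referenced in the two comments above (the original is cut off; it is assumed here to subclass SB3's `VecEnvWrapper`, and the wrapper name and `key` default are illustrative):

```python
import gym
import numpy as np
from stable_baselines3.common.vec_env import DummyVecEnv
from stable_baselines3.common.vec_env.base_vec_env import VecEnvWrapper


class CustomWrapper(VecEnvWrapper):
    def __init__(self, venv, key="obs"):
        # To reproduce the reported recursion, uncomment the next line and
        # type `self.this_attr_does_not_exist` at the ipdb prompt:
        # import ipdb; ipdb.set_trace()
        super().__init__(venv)
        self.key = key  # breakpoint set here for the PyCharm test

    def reset(self):
        return self.venv.reset()

    def step_wait(self):
        return self.venv.step_wait()


env = CustomWrapper(DummyVecEnv([lambda: gym.make("CartPole-v1")]))
obs = env.reset()
print(np.asarray(obs).shape)  # (1, 4) for CartPole with one env
```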
> Hi, any updates regarding this PR? I am working on a project that would like to use VecHerReplayBuffer. I would welcome help on that PR. You can try it...
> Hi, I would be interested too! What kind of polishing would the PR need? First of all, more tests, especially for the saving/loading part of the buffer. Then print...
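As an illustration of the kind of test meant here, a minimal save/load round-trip sketch, using SB3's generic `ReplayBuffer` and pickle helpers as stand-ins (the HER buffer API in the PR may differ; this assumes a recent SB3 where `add()` takes an infos list, and a gym version that registers `Pendulum-v1`):

```python
import gym
import numpy as np
from stable_baselines3.common.buffers import ReplayBuffer
from stable_baselines3.common.save_util import load_from_pkl, save_to_pkl

env = gym.make("Pendulum-v1")  # use "Pendulum-v0" on older gym versions
buffer = ReplayBuffer(100, env.observation_space, env.action_space)

# Fill the buffer with a few random transitions.
obs = env.reset()
for _ in range(10):
    action = env.action_space.sample()
    next_obs, reward, done, info = env.step(action)
    buffer.add(obs, next_obs, action, reward, done, [info])
    obs = env.reset() if done else next_obs

save_to_pkl("/tmp/replay_buffer.pkl", buffer)
loaded_buffer = load_from_pkl("/tmp/replay_buffer.pkl")

# The restored buffer should contain exactly the same transitions.
assert loaded_buffer.pos == buffer.pos
assert np.allclose(loaded_buffer.observations[:10], buffer.observations[:10])
```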
Hello @geyang, as mentioned in the doc (the multi-env with off-policy algo example), you probably need to update the `gradient_steps` variable to match the number of envs.
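A minimal sketch of what that doc example recommends (hyperparameters are illustrative): with n parallel envs, each collection phase gathers n transitions, so `gradient_steps` should be scaled with the number of envs to keep the ratio of gradient updates to environment steps unchanged.

```python
from stable_baselines3 import SAC
from stable_baselines3.common.env_util import make_vec_env

n_envs = 4
env = make_vec_env("Pendulum-v1", n_envs=n_envs)

model = SAC(
    "MlpPolicy",
    env,
    train_freq=1,           # collect one step per env between updates
    gradient_steps=n_envs,  # one gradient step per collected transition
)
model.learn(total_timesteps=10_000)
```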
> Having a single class that handles both cases would avoid having to make the following distinctions: I agree, it would be awesome if we had one class that handle...
Thanks for creating the PR =) This is in fact the cleaner way to do it that I had in mind, but had no time to invest in... What is missing?...
> the results obtained are different depending on the number of workers, which is not desirable (see curve). You need to update `gradient_steps` (cf. doc, same as https://github.com/DLR-RM/stable-baselines3/issues/699). > I...
> Currently working on a new implementation of the HerReplayBuffer. You mean for both offline and online sampling? The main reason for this implementation is efficiency. btw, in the worst...
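For context, a toy sketch of the relabeling at the heart of HER's online sampling (the "future" strategy); this is illustrative only, not the SB3 implementation:

```python
import numpy as np

rng = np.random.default_rng(0)


def relabel_future(achieved_goals, transition_indices):
    """'future' HER strategy: for each sampled transition t, replace the
    desired goal with a goal actually achieved at some step t' >= t of
    the same episode."""
    ep_len = len(achieved_goals)
    # Draw one future index per sampled transition (low broadcasts per element).
    future_indices = rng.integers(transition_indices, ep_len)
    return achieved_goals[future_indices]


# Toy episode: 1-D achieved goals over 5 steps.
achieved = np.array([0.0, 0.1, 0.3, 0.6, 1.0])
sampled_t = np.array([0, 2, 4])
print(relabel_future(achieved, sampled_t))  # goals drawn from steps >= each t
```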