Antonin RAFFIN

Results: 880 comments by Antonin RAFFIN

Hello, > Am I correct in understanding that when running models learned with gSDE, if the user wants the same non-deterministic behaviour as at the end of learning, the user...

Hello, have you considered callbacks as an alternative (see the documentation and its section on TensorBoard)? They should allow you to log every n steps or every k iterations.

Adding a callback for step-based logging to the collection provided by SB3 would be a good addition, I think. Logging more often shouldn't be a problem (or solvable...

> Could you elaborate on your vision for this? Have a simple `LogEveryNSteps` callback that calls `self.logger.dump()` every n steps (n calls to `env.step()`, it might correspond to more than...
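The idea in the comment above can be sketched without any dependency. This is only an illustration of the counting logic, not SB3 code: `RecordingLogger`, `run_demo`, and the `on_step()` method name are hypothetical stand-ins. In SB3 itself, such a class would presumably subclass `BaseCallback`, implement `_on_step()`, and call `self.logger.dump()` on the model's logger.

```python
class RecordingLogger:
    """Hypothetical stand-in for a logger exposing record() and dump()."""

    def __init__(self):
        self.pending = {}  # values recorded since the last dump
        self.dumps = []    # list of (step, values) pairs, one per dump

    def record(self, key, value):
        self.pending[key] = value

    def dump(self, step):
        # Flush everything recorded so far and remember at which step.
        self.dumps.append((step, dict(self.pending)))
        self.pending.clear()


class LogEveryNSteps:
    """Dump the logger every `n_steps` calls to on_step() (sketch only)."""

    def __init__(self, logger, n_steps):
        if n_steps <= 0:
            raise ValueError("n_steps must be a positive integer")
        self.logger = logger
        self.n_steps = n_steps
        self.n_calls = 0  # number of times on_step() has been called

    def on_step(self):
        self.n_calls += 1
        if self.n_calls % self.n_steps == 0:
            self.logger.dump(step=self.n_calls)
        # Assumed convention: returning True means "keep training".
        return True


def run_demo(total_steps=10, n_steps=4):
    """Simulate a training loop and return the steps at which a dump happened."""
    logger = RecordingLogger()
    callback = LogEveryNSteps(logger, n_steps)
    for step in range(1, total_steps + 1):
        logger.record("demo/step", step)
        callback.on_step()
    return [dump_step for dump_step, _ in logger.dumps]
```

With `total_steps=10` and `n_steps=4`, the logger is dumped at calls 4 and 8 only, independently of any other logging schedule.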

Hello, I guess the issue comes from `evaluate_policy()`, where we should call `render()` before the call to `predict()`. For the final one, I'm not sure, as with VecEnv the environments are reset automatically.
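To illustrate why the last state is hard to render with automatic resets, here is a toy sketch. `CountdownEnv`, `AutoResetVecEnv`, and `rollout` are hypothetical classes written for this example, not SB3's implementation; they only mimic the behaviour where a finished environment is reset inside `step()` and the last observation is kept under `info["terminal_observation"]`.

```python
class CountdownEnv:
    """Toy environment (hypothetical): the state counts up until `horizon`."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        self.state += 1
        done = self.state >= self.horizon
        return self.state, 0.0, done, {}

    def render(self):
        return f"state={self.state}"


class AutoResetVecEnv:
    """Toy vectorized wrapper that resets each env as soon as it is done."""

    def __init__(self, envs):
        self.envs = envs

    def reset(self):
        return [env.reset() for env in self.envs]

    def step(self, actions):
        obs_list, rewards, dones, infos = [], [], [], []
        for env, action in zip(self.envs, actions):
            obs, reward, done, info = env.step(action)
            if done:
                # Keep the true last observation, then reset immediately.
                info = dict(info, terminal_observation=obs)
                obs = env.reset()
            obs_list.append(obs)
            rewards.append(reward)
            dones.append(done)
            infos.append(info)
        return obs_list, rewards, dones, infos

    def render(self):
        return [env.render() for env in self.envs]


def rollout(n_steps=3):
    """Render before each action, as suggested in the comment above."""
    vec_env = AutoResetVecEnv([CountdownEnv(horizon=3)])
    obs = vec_env.reset()
    frames, dones, infos = [], [False], [{}]
    for _ in range(n_steps):
        frames.append(vec_env.render()[0])  # render, then act
        obs, _, dones, infos = vec_env.step([0])
    return frames, obs, dones, infos
```

In this sketch, rendering before each step captures the states 0, 1, and 2, but the terminal state 3 is never rendered: by the time `step()` returns, the environment has already been reset, and the terminal observation is only available through the info dictionary.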

> I'll have to create a check in the step function to make sure that step is never called with a done environment > Side note, there's another discussion to...

> evaluate_policy has the capacity to support multiple envs, but currently hard codes just one. What makes you think that? > Pseudocode so if you want to do something like...

> Ah, I guess a user could pass a VecEnv that is already instantiated with multiple environments. I missed this in the documentation. Yes. > Are you proposing a EvalGymWrapper...

> If I'm requesting a new feature, I have proposed alternatives

Partial duplicate of https://github.com/DLR-RM/stable-baselines3/issues/1568#issuecomment-1600595147 and https://github.com/DLR-RM/stable-baselines3/issues/229. In short: a `VecEnvWrapper` would indeed be a good idea, but only after gymnasium 1.0 is released and fully tested. Would you be willing...