Antonin RAFFIN
> I didn't know you could do that with callbacks. Yes, callbacks & wrappers are quite powerful... > It would be interesting if it was documented. Well, I didn't have...
Hello, I'm actively using the callback. The important thing to check is comparing training runs with the same number of gradient updates, and also comparing how long it takes to...
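For instance, a minimal sketch of such a timing check with a custom callback (the callback name, env id and hyperparameters are just placeholders, not part of the discussion above):

```python
import time

from stable_baselines3 import SAC
from stable_baselines3.common.callbacks import BaseCallback


class TimingCallback(BaseCallback):
    """Hypothetical callback: record wall-clock time for a run, so runs with
    the same number of gradient updates can be compared fairly."""

    def _on_training_start(self) -> None:
        self.start_time = time.time()

    def _on_step(self) -> bool:
        # Returning True keeps training going
        return True

    def _on_training_end(self) -> None:
        elapsed = time.time() - self.start_time
        print(f"{self.num_timesteps} env steps in {elapsed:.1f}s")


model = SAC("MlpPolicy", "Pendulum-v1", gradient_steps=1, verbose=0)
model.learn(total_timesteps=5_000, callback=TimingCallback())
```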
Hello, Why not, but I would say you already have one: >`action = agent.act(state)` It is called `predict()`: `action, _ = agent.predict(state)` >`agent.record(state, action, next_state, reward, done, info)` As mentioned...
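Something like this minimal sketch (assuming the classic Gym API; the env and the number of steps are placeholders):

```python
import gym

from stable_baselines3 import PPO

env = gym.make("CartPole-v1")
agent = PPO("MlpPolicy", env, verbose=0)

obs = env.reset()
for _ in range(100):
    # Equivalent of `action = agent.act(state)` from the question
    action, _states = agent.predict(obs, deterministic=True)
    obs, reward, done, info = env.step(action)
    if done:
        obs = env.reset()
```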
> Are these available in SB2 as well? Yes for `predict()` (cf. the doc). For the rest, more or less; it is a bit messier. The `train()` corresponds to the...
Several updates regarding this issue: > Agreed, this question came up in the SB repository quite often. Another related thing we could do is to make getting action probabilities/values a bit easier...
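For reference, a rough sketch of how one can already get at those quantities (assuming the `get_distribution()` and `predict_values()` helpers that recent SB3 versions expose on `ActorCriticPolicy`; env and model are placeholders):

```python
import torch as th

from stable_baselines3 import PPO
from stable_baselines3.common.utils import obs_as_tensor

model = PPO("MlpPolicy", "CartPole-v1", verbose=0)
obs = model.env.reset()

with th.no_grad():
    obs_tensor = obs_as_tensor(obs, model.policy.device)
    # Action distribution for the current observation
    dist = model.policy.get_distribution(obs_tensor)
    action_probs = dist.distribution.probs  # categorical action probabilities
    # Value estimate from the critic
    value = model.policy.predict_values(obs_tensor)
```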
Hello, > This goes against the recommendations of Revisiting the Arcade Learning Environment (https://arxiv.org/pdf/1709.06009.pdf). Yes, I'm aware of that. We kept it to be able to compare results against SB2....
> The current AtariWrapper by default has `terminate_on_life_loss` set to True. This goes against the recommendations of Revisiting the Arcade Learning Environment (https://arxiv.org/pdf/1709.06009.pdf). I believe this should be set to...
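For reference, a sketch of how to override that default (note: the keyword in the SB3 wrapper is spelled `terminal_on_life_loss`, and `make_atari_env` forwards `wrapper_kwargs` to `AtariWrapper`; this assumes the Atari extras are installed):

```python
import gym

from stable_baselines3.common.atari_wrappers import AtariWrapper
from stable_baselines3.common.env_util import make_atari_env

# Wrapping a single env directly
env = AtariWrapper(gym.make("BreakoutNoFrameskip-v4"), terminal_on_life_loss=False)

# Or through the vectorized helper
vec_env = make_atari_env(
    "BreakoutNoFrameskip-v4",
    n_envs=4,
    wrapper_kwargs=dict(terminal_on_life_loss=False),
)
```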
Related: https://github.com/hill-a/stable-baselines/issues/463 I need to think more about it, but for now I would prefer that users define custom policies and train methods (related to #55 though) rather than changing...
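As an illustration of the custom-policy route, a minimal sketch (SB3-style, with arbitrary architecture values):

```python
from stable_baselines3 import PPO

# Customize the policy network via policy_kwargs instead of patching the library
policy_kwargs = dict(net_arch=[64, 64])
model = PPO("MlpPolicy", "CartPole-v1", policy_kwargs=policy_kwargs, verbose=0)
model.learn(total_timesteps=10_000)
```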
Hello, > Observation after `env.reset()` should be the same, i.e. Image 1 should be equal to Image 2 Why? Calling `reset()` means starting a new episode, so you have the...
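A quick illustration (assuming a standard Gym env with a randomized initial state and the classic reset API that returns only the observation):

```python
import gym

env = gym.make("CartPole-v1")
obs_1 = env.reset()
obs_2 = env.reset()
# Each reset() starts a fresh episode: with a random initial state,
# the two observations will generally differ.
print(obs_1)
print(obs_2)
```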