Antonin RAFFIN

880 comments by Antonin RAFFIN

related: https://github.com/DLR-RM/stable-baselines3/issues/786 and https://github.com/DLR-RM/stable-baselines3/issues/1384

Hello, thanks for the proposal. Achieving full feature parity is not my priority for now (as SBX is meant to be experimental), but I would welcome such a PR =)

https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/blob/25b43266e08ebe258061ac69688d94144799de75/sb3_contrib/trpo/trpo.py#L282. Not sure about the second question, but probably yes.

Hello, related to https://github.com/DLR-RM/stable-baselines3/issues/1059 and probably a duplicate of https://github.com/DLR-RM/stable-baselines3/issues/457. It is because of how the algorithms work. In short:
- PPO/A2C and derivatives collect `n_steps * n_envs` transitions of experience before performing...
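To make the update cadence concrete, here is a minimal sketch of the arithmetic behind that statement (the `n_steps`/`n_envs`/`total_timesteps` values below are illustrative, not taken from the issue):

```python
# PPO/A2C gather a full rollout of n_steps transitions *per environment*
# before each gradient update, so one rollout buffer holds
# n_steps * n_envs transitions.
n_steps = 2048   # SB3's default n_steps for PPO
n_envs = 4       # number of parallel environments (illustrative)

rollout_size = n_steps * n_envs
print(rollout_size)  # 8192 transitions collected before each update

# With a fixed training budget, the number of policy updates is therefore:
total_timesteps = 100_000
n_updates = total_timesteps // rollout_size
print(n_updates)  # 12 full rollouts fit in the budget
```

This is why changing `n_envs` without adjusting `n_steps` changes how often (and on how much data) the policy is updated.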

Hello, it is indeed missing the fix we did a while ago for all other algos: https://github.com/DLR-RM/stable-baselines3/blob/f0382a25bdbed62dcc31875d3e540ad95c1575a5/stable_baselines3/common/base_class.py#L404. I would welcome a PR that fixes this issue =)

Hello, what policy are you using? Please fill in the issue template completely.

I meant to say "policy architecture". It seems that you are not using a CNN if you are using the default hyperparameters, which explains your results.

I would recommend reading the Stable-Baselines documentation and looking at the [rl zoo](https://github.com/araffin/rl-baselines-zoo): there you will find plenty of examples of RL with images.

Hello,

> Did you run into any OOM errors from graph growth when you were running these scripts or do you have any insights?

I've never experienced any OOM (I was...

> The results are very unstable; sometimes it seems to work, but after a few more episodes it forgets the learning. Do you have any suggestions for this task?...