Antonin RAFFIN comments

Results: 880 comments by Antonin RAFFIN

Update Gymnasium to v1.0.0

> I don't know how SB3 works internally, would you be able to get one of your devs to update that?

Currently, there is only one active dev (me...), Quentin...

PPO doesn't work with MultiDiscrete observation space

Note: the env checker must be updated to warn users that we don't support multi-dimensional MultiDiscrete spaces and to propose a fix (the one from @qgallouedec).

ValueError: could not broadcast input array from shape (23,) into shape (27,)

You may give https://github.com/DLR-RM/stable-baselines3/pull/1837 a try then.

[Bug]: evaluate_policy called multiple times for vectorized environments

Hello, what is your use case/expected behavior? The for loop also decomposes the info per env: https://github.com/DLR-RM/stable-baselines3/blob/35eccaf04fa011128f02eaecac6caab535686459/stable_baselines3/common/evaluation.py#L99-L106

[Bug]: evaluate_policy called multiple times for vectorized environments

There is the local variable `i`, which holds the index of the env currently being processed.
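To illustrate the point about the per-env loop, here is a minimal sketch of a callback that reads the env index from the locals it receives. It assumes the callback is invoked as `callback(locals(), globals())` once per env inside that loop, and that `i` and `reward` are among the locals. `run_per_env_loop` and `PerEnvRecorder` are hypothetical names used only for this example; the loop is a stand-in, not the SB3 implementation.

```python
from typing import Any, Callable, Dict, List


def run_per_env_loop(
    rewards: List[float],
    dones: List[bool],
    infos: List[Dict[str, Any]],
    callback: Callable[[Dict[str, Any], Dict[str, Any]], None],
) -> None:
    """Hypothetical stand-in for one step of the per-env loop."""
    n_envs = len(rewards)
    for i in range(n_envs):
        # Decompose the vectorized outputs for env number i
        reward = rewards[i]
        done = dones[i]
        info = infos[i]
        # The callback is called once per env, hence several times per step
        callback(locals(), globals())


class PerEnvRecorder:
    """Callback that groups rewards by env index using the local variable i."""

    def __init__(self) -> None:
        self.rewards_per_env: Dict[int, List[float]] = {}

    def __call__(self, locals_: Dict[str, Any], globals_: Dict[str, Any]) -> None:
        env_idx = locals_["i"]
        self.rewards_per_env.setdefault(env_idx, []).append(locals_["reward"])


recorder = PerEnvRecorder()
# Two steps with two envs each
run_per_env_loop([1.0, 0.5], [False, True], [{}, {}], recorder)
run_per_env_loop([2.0, 0.0], [False, False], [{}, {}], recorder)
print(recorder.rewards_per_env)
```

With the real `evaluate_policy`, the same callable would be passed through its `callback` argument; if something should only run once per step, the callback can return early unless `locals_["i"] == 0`.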

[Bug]: evaluate_policy called multiple times for vectorized environments

> A documentation of locals and globals would probably help to find that! :)

Feel free to open a PR that updates the doc ;)

Fix memory leak in base_class.py

> Loading the data causes a memory leak through the ep_info_buffer variable.

Do you have a minimal example to reproduce/track this behavior? Also, how big is the leak?

Fix memory leak in base_class.py

> Is there anything I should still do for the merge?

From your side, nothing for now ;) What is missing is from my side. I need to take some time...

Fix memory leak in base_class.py

I took the time to look at your example closely, but I don't understand `model.ep_info_buffer.extend([torch.ones(10000,device="cuda:2")])`: this buffer is not supposed to contain any torch variable. Btw, PPO is usually faster when...
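For context, a minimal sketch of what such a buffer normally holds, assuming it is a bounded `deque` filled with small dicts of plain Python numbers (episode return `r`, length `l`, elapsed time `t`). The window size of 100 and the fake episode values are assumptions made for the example.

```python
from collections import deque

# Assumed window size for the statistics buffer
stats_window_size = 100
ep_info_buffer = deque(maxlen=stats_window_size)

# Simulate 250 finished episodes: only plain Python numbers are stored,
# no tensors and nothing that lives on a GPU
for episode in range(250):
    ep_info_buffer.append({"r": float(episode), "l": 10, "t": 0.1 * episode})

# Old entries are dropped automatically, so the buffer stays bounded
print(len(ep_info_buffer))

# Typical use: average the return over the most recent episodes
mean_reward = sum(ep_info["r"] for ep_info in ep_info_buffer) / len(ep_info_buffer)
print(mean_reward)
```

Under these assumptions the buffer never holds more than `stats_window_size` small dicts, which is why a large leak through it would be unexpected.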

SubprocVecEnv Sets Out-of-Range Seeds for My Environments (ScenarioNet Environment)

Hello, I think there is a confusion between the seed, which is used for the pseudo-random number generator, and scenarios.
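The distinction can be sketched as follows: the seed only initializes the pseudo-random number generator, and the scenario is then drawn from that generator, so any integer seed stays valid. `ScenarioEnvSketch`, `NUM_SCENARIOS`, and the `base_seed + rank` scheme are all hypothetical and do not reflect ScenarioNet's actual API.

```python
import random
from typing import Optional

NUM_SCENARIOS = 10  # hypothetical number of available scenarios


class ScenarioEnvSketch:
    """Hypothetical env that keeps the PRNG seed and the scenario index separate."""

    def __init__(self, num_scenarios: int) -> None:
        self.num_scenarios = num_scenarios
        self.rng = random.Random()
        self.scenario_index = 0

    def reset(self, seed: Optional[int] = None) -> int:
        if seed is not None:
            # The seed only (re)initializes the pseudo-random generator
            self.rng = random.Random(seed)
        # The scenario is sampled from the generator, so it is always in range
        self.scenario_index = self.rng.randrange(self.num_scenarios)
        return self.scenario_index


# Assumption: each worker gets a large base seed offset by its rank
base_seed = 4_000_000_000
envs = [ScenarioEnvSketch(NUM_SCENARIOS) for _ in range(4)]
scenarios = [env.reset(seed=base_seed + rank) for rank, env in enumerate(envs)]
print(scenarios)
```

If a specific scenario must be selected, passing it through a dedicated argument or option rather than through the seed keeps the two concepts apart.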