stable-baselines3
[Bug]: Procgen
🐛 Bug
Running the Procgen example from the documentation fails as soon as the PPO model is constructed, with:
AssertionError: The algorithm only supports (<class 'gymnasium.spaces.box.Box'>, <class 'gymnasium.spaces.discrete.Discrete'>, <class 'gymnasium.spaces.multi_discrete.MultiDiscrete'>, <class 'gymnasium.spaces.multi_binary.MultiBinary'>) as action spaces but Discrete(15) was provided
To Reproduce
from procgen import ProcgenEnv
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import VecExtractDictObs, VecMonitor
# ProcgenEnv is already vectorized
venv = ProcgenEnv(num_envs=2, env_name="starpilot")
# To use only part of the observation:
# venv = VecExtractDictObs(venv, "rgb")
# Wrap with a VecMonitor to collect stats and avoid errors
venv = VecMonitor(venv=venv)
model = PPO("MultiInputPolicy", venv, verbose=1)
model.learn(10_000)
Copied from: https://stable-baselines3.readthedocs.io/en/master/guide/examples.html#sb3-and-procgenenv
Relevant log output / Error message
Traceback (most recent call last):
  File "/ssd/robos/xb/sb3_procgen_bug.py", line 15, in <module>
    model = PPO("MultiInputPolicy", venv, verbose=1)
  File "/ssd/robos/xb/venv10/lib/python3.10/site-packages/stable_baselines3/ppo/ppo.py", line 109, in __init__
    super().__init__(
  File "/ssd/robos/xb/venv10/lib/python3.10/site-packages/stable_baselines3/common/on_policy_algorithm.py", line 85, in __init__
    super().__init__(
  File "/ssd/robos/xb/venv10/lib/python3.10/site-packages/stable_baselines3/common/base_class.py", line 180, in __init__
    assert isinstance(self.action_space, supported_action_spaces), (
AssertionError: The algorithm only supports (<class 'gymnasium.spaces.box.Box'>, <class 'gymnasium.spaces.discrete.Discrete'>, <class 'gymnasium.spaces.multi_discrete.MultiDiscrete'>, <class 'gymnasium.spaces.multi_binary.MultiBinary'>) as action spaces but Discrete(15) was provided
Exception ignored in: <function CEnv.__del__ at 0x7f215900c670>
System Info
- OS: Linux-6.5.0-14-generic-x86_64-with-glibc2.35 # 14~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC
- Python: 3.10.12
- Stable-Baselines3: 2.2.1
- PyTorch: 2.2.0+cu121
- GPU Enabled: True
- Numpy: 1.26.3
- Cloudpickle: 3.0.0
- Gymnasium: 0.29.1
- OpenAI Gym: 0.26.2
Checklist
- [X] My issue does not relate to a custom gym environment. (Use the custom gym env template instead)
- [X] I have checked that there is no similar issue in the repo
- [X] I have read the documentation
- [X] I have provided a minimal and working example to reproduce the bug
- [X] I've used the markdown code blocks for both code and stack traces.
Duplicate of https://github.com/DLR-RM/stable-baselines3/issues/1712
I would appreciate a PR that updates the documentation ;)
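For anyone puzzled by why the message rejects Discrete(15) while listing Discrete as supported: the check in the traceback is an isinstance test against the gymnasium classes, so a space whose class is merely named Discrete but defined in another package (presumably the legacy gym package, which also appears in the System Info above) will not pass. The snippet below is only a minimal, library-free sketch of that mechanism. The stand-in classes and the helpers make_discrete_class, check_action_space and convert_discrete are hypothetical names made up for illustration; they are not part of SB3, gym, gymnasium or Procgen.

```python
def make_discrete_class(module_name):
    """Build a stand-in `Discrete` class that claims to live in `module_name`.

    Hypothetical helper: it only mimics two unrelated classes sharing a name.
    """

    class Discrete:
        def __init__(self, n):
            self.n = n

        def __repr__(self):
            return f"Discrete({self.n})"

    Discrete.__module__ = module_name
    return Discrete


# Two distinct classes with the same name and the same repr (stand-ins only).
GymnasiumDiscrete = make_discrete_class("gymnasium.spaces.discrete")
LegacyGymDiscrete = make_discrete_class("gym.spaces.discrete")

# Mirrors the shape of the failing check: isinstance against a tuple of classes.
supported_action_spaces = (GymnasiumDiscrete,)


def check_action_space(space, supported):
    """Return True if `space` is an instance of one of the supported classes."""
    return isinstance(space, supported)


def convert_discrete(space):
    """Rebuild the space with the supported class (assumed fix direction)."""
    return GymnasiumDiscrete(space.n)


# What the environment is assumed to expose: a legacy-style space.
action_space = LegacyGymDiscrete(15)

legacy_ok = check_action_space(action_space, supported_action_spaces)
converted_ok = check_action_space(
    convert_discrete(action_space), supported_action_spaces
)
```

Here legacy_ok is False even though repr(action_space) reads Discrete(15), while converted_ok is True. If the legacy-gym assumption holds, the real fix would be along the same lines (exposing gymnasium spaces from the vectorized environment before passing it to PPO); see the linked issue for what is actually recommended.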