stable-baselines3 Scaling Environment

🐛 Bug

check_env result

> Traceback (most recent call last):
>   File "D:\Thesis_\Test\PPonew.py", line 461, in <module>
>     check_env(env)
>   File "C:\Users\Cr7th\AppData\Local\Programs\Python\Python311\Lib\site-packages\stable_baselines3\common\env_checker.py", line 409, in check_env
>     assert isinstance(
> AssertionError: Your environment must inherit from the gymnasium.Env class cf. https://gymnasium.farama.org/api/env/

I have trained and tested a custom Boid flocking environment in OpenAI Gym. With 3 Boids it works. However, when I test the trained policy with more Boids than that, e.g. 10, it gives the following error.

Code example

Download Code: https://drive.google.com/drive/folders/1c0t-7D5RWumtLY9Bh9kht6P3RAQ4mq0V?usp=sharing
Download Model: https://drive.google.com/file/d/1wrBZ6mSrcaxrWERgvUYA_vLDaA0JL08f/view?usp=drive_link
Place the model in the same folder as the code.
cd into the folder containing the code and run the script, as in the log below (python PPonew.py).


class FlockingEnv(gym.Env):
    def __init__(self):
        super(FlockingEnv, self).__init__()
        self.episode=0
        self.CTDE=False
        self.current_timestep = 0 
        self.reward_log = []
        self.counter=0

        self.agents = [Agent(position) for position in self.read_agent_locations()]
        
        min_action = np.array([-5, -5] * len(self.agents), dtype=np.float32)
        max_action = np.array([5, 5] * len(self.agents), dtype=np.float32)
        self.action_space = spaces.Box(low=min_action, high=max_action, dtype=np.float32)

        min_obs = np.array([[-np.inf, -np.inf, -2.5, -2.5]] * len(self.agents), dtype=np.float32)
        max_obs = np.array([[np.inf, np.inf, 2.5, 2.5]] * len(self.agents), dtype=np.float32)
        self.observation_space = spaces.Box(low=min_obs, high=max_obs, dtype=np.float32)


    def step(self, actions):
        # Add Gaussian noise to actions
        noisy_actions = actions + np.random.normal(loc=0, scale=0.01, size=actions.shape)
        
        #Clip actions to action space bounds
        noisy_actions = np.clip(noisy_actions, self.action_space.low, self.action_space.high)

        # if(self.current_timestep % 400000 == 0):
        #     print(self.current_timestep)
        #     self.counter = self.counter + 1
        #     print("Counter", self.counter)

        self.current_timestep+=1
        reward=0
        done=False
        info={}

        observations = self.simulate_agents(actions)
        reward, out_of_flock = self.calculate_reward()
        
        #Validate this
        if (self.CTDE==False):
            # Terminal Conditions
            for agent in self.agents:
                if((self.check_collision(agent)) or (out_of_flock==True)):
                    done=True
                    env.reset()


        # if self.CTDE:

        #     log_path = os.path.join(Files['Flocking'], 'Testing', 'Rewards', 'Components', f"Episode{episode}")
        #     log_path = os.path.join(log_path, "Reward_Total_log.json")

        
        #     with open(log_path, 'a') as f:
        #         json.dump((round(reward, 2)), f, indent=4)
        #         f.write('\n')

        
        self.current_timestep = self.current_timestep + 1

        return observations, reward, done, info

    def close(self):
        print("Environment is closed. Cleanup complete.")
        
        #Does velocity make a difference
        #Observation Space
 
    def reset(self):
        # seed_everything(SimulationVariables["Seed"])
        env.seed(SimulationVariables["Seed"])
        self.agents = [Agent(position) for position in self.read_agent_locations()]
        for agent in self.agents:
            agent.acceleration = np.round(np.random.uniform(-SimulationVariables["AccelerationInit"], SimulationVariables["AccelerationInit"], size=2), 2)
            agent.velocity = agent.acceleration * SimulationVariables["dt"]               
        observation = self.get_observation()
        return observation
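One thing that stands out in `step` above is the call to `env.reset()` on a global `env` when a terminal condition is hit. Under the usual Gym convention, `step` only reports that the episode ended and the calling loop performs the reset. A minimal sketch of that pattern follows, using a hypothetical toy `CounterEnv` that is not part of the posted code and deliberately does not depend on Gym:

```python
class CounterEnv:
    """Hypothetical toy environment: the episode ends after `horizon` steps."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0
        self.resets = 0

    def reset(self):
        self.t = 0
        self.resets += 1
        return self.t  # observation

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        # Report termination only; the environment does not reset itself here.
        return self.t, 1.0, done, {}


def run_episodes(env, n_episodes):
    """The caller owns the reset: once per episode, before stepping."""
    lengths = []
    for _ in range(n_episodes):
        env.reset()
        done, steps = False, 0
        while not done:
            _, _, done, _ = env.step(action=None)
            steps += 1
        lengths.append(steps)
    return lengths


env = CounterEnv(horizon=3)
episode_lengths = run_episodes(env, n_episodes=2)
```

Keeping the reset in the caller also avoids the environment depending on a module-level variable named `env`.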

Relevant log output / Error message

> PS D:\Test> python PPonew.py
> 2024-04-23 12:21:14.425966: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
> 2024-04-23 12:21:15.190142: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
> C:\Users\Cr7th\AppData\Local\Programs\Python\Python311\Lib\site-packages\stable_baselines3\common\save_util.py:166: UserWarning: Could not deserialize object clip_range. Consider using `custom_objects` argument to replace this object.
> Exception: Can't get attribute '_make_function' on <module 'cloudpickle.cloudpickle' from 'C:\\Users\\Cr7th\\AppData\\Local\\Programs\\Python\\Python311\\Lib\\site-packages\\cloudpickle\\cloudpickle.py'>
>   warnings.warn(
> C:\Users\Cr7th\AppData\Local\Programs\Python\Python311\Lib\site-packages\stable_baselines3\common\save_util.py:166: UserWarning: Could not deserialize object lr_schedule. Consider using `custom_objects` argument to replace this object.
> Exception: Can't get attribute '_make_function' on <module 'cloudpickle.cloudpickle' from 'C:\\Users\\Cr7th\\AppData\\Local\\Programs\\Python\\Python311\\Lib\\site-packages\\cloudpickle\\cloudpickle.py'>
>   warnings.warn(
>   0%|                                                                                                                                                                                                           | 0/5 [00:00<?, ?it/s]Episode: 0
>   0%|                                                                                                                                                                                                           | 0/5 [00:00<?, ?it/s] 
> Traceback (most recent call last):
>   File "D:\Thesis_\Test\PPonew.py", line 499, in <module>
>     action, state = model.predict(obs)
>                     ^^^^^^^^^^^^^^^^^^
>   File "C:\Users\Cr7th\AppData\Local\Programs\Python\Python311\Lib\site-packages\stable_baselines3\common\base_class.py", line 553, in predict
>     return self.policy.predict(observation, state, episode_start, deterministic)
>            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "C:\Users\Cr7th\AppData\Local\Programs\Python\Python311\Lib\site-packages\stable_baselines3\common\policies.py", line 363, in predict
>     obs_tensor, vectorized_env = self.obs_to_tensor(observation)
>                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "C:\Users\Cr7th\AppData\Local\Programs\Python\Python311\Lib\site-packages\stable_baselines3\common\policies.py", line 270, in obs_to_tensor
>     vectorized_env = is_vectorized_observation(observation, self.observation_space)
>                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "C:\Users\Cr7th\AppData\Local\Programs\Python\Python311\Lib\site-packages\stable_baselines3\common\utils.py", line 399, in is_vectorized_observation
>     return is_vec_obs_func(observation, observation_space)  # type: ignore[operator]
>            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "C:\Users\Cr7th\AppData\Local\Programs\Python\Python311\Lib\site-packages\stable_baselines3\common\utils.py", line 266, in is_vectorized_box_observation
>     raise ValueError(

> ValueError: Error: Unexpected observation shape (10, 4) for Box environment, please use (3, 4) or (n_env, 3, 4) for the observation shape.
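The `ValueError` is consistent with the observation space being fixed at training time: the saved policy expects `(3, 4)`, while the 10-Boid environment returns `(10, 4)`. Below is a simplified sketch of that kind of check, written in plain Python on shape tuples. It illustrates the logic suggested by the error message and is not the Stable-Baselines3 source; `is_vectorized_box_obs` is a hypothetical name.

```python
def is_vectorized_box_obs(obs_shape, space_shape):
    """Return False for a single observation, True for a batch of them.

    Raises ValueError when the shape matches neither layout.
    """
    obs_shape, space_shape = tuple(obs_shape), tuple(space_shape)
    if obs_shape == space_shape:
        return False  # one observation, e.g. (3, 4)
    if obs_shape[1:] == space_shape:
        return True  # batched observations, e.g. (n_env, 3, 4)
    raise ValueError(
        f"Unexpected observation shape {obs_shape}; "
        f"expected {space_shape} or (n_env, *{space_shape})"
    )


trained_shape = (3, 4)  # 3 Boids x 4 features, as in the trained model

single = is_vectorized_box_obs((3, 4), trained_shape)
batched = is_vectorized_box_obs((2, 3, 4), trained_shape)

try:
    is_vectorized_box_obs((10, 4), trained_shape)  # 10 Boids at test time
    error = None
except ValueError as exc:
    error = str(exc)
```

Under this reading, `(10, 4)` is rejected because its trailing dimensions are `(4,)`, not `(3, 4)`, so it is neither a single observation nor a batch.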

System Info

- OS: Windows-10-10.0.22631-SP0 10.0.22631
- Python: 3.11.4
- Stable-Baselines3: 2.2.1
- PyTorch: 2.0.0+cpu
- GPU Enabled: False
- Numpy: 1.23.5
- Cloudpickle: 1.2.2
- Gymnasium: 0.29.1
- OpenAI Gym: 0.15.7

Checklist

[X] I have checked that there is no similar issue in the repo
[X] I have read the documentation
[X] I have provided a minimal and working example to reproduce the bug
[X] I have checked my env using the env checker
[X] I've used the markdown code blocks for both code and stack traces.

Apr 23 '24 09:04 Hamza-101

Please have a careful look at https://github.com/DLR-RM/stable-baselines3/issues/982#issuecomment-1197044014

> AssertionError: Your environment must inherit from the gymnasium.Env

Please fix any issue found by the env checker before posting an issue about custom env.
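One likely reading of the assertion, assuming `gym` in the posted code is the legacy OpenAI Gym package (0.15.7 in the system info): `gym.Env` and `gymnasium.Env` are separate classes, so inheriting from one does not satisfy an `isinstance` check against the other. The sketch below uses hypothetical stand-in classes rather than the real packages:

```python
class LegacyEnv:
    """Stand-in for gym.Env (hypothetical, for illustration only)."""


class ModernEnv:
    """Stand-in for gymnasium.Env (hypothetical, for illustration only)."""


class FlockingEnvLegacy(LegacyEnv):
    """Inherits from the legacy base class, like the posted environment."""


class FlockingEnvModern(ModernEnv):
    """Same environment, rebased onto the class the checker looks for."""


def passes_checker(env):
    """Mimics the first assertion of an env checker: an isinstance test."""
    return isinstance(env, ModernEnv)


legacy_ok = passes_checker(FlockingEnvLegacy())
modern_ok = passes_checker(FlockingEnvModern())
```

In the real code the equivalent change would be to subclass `gymnasium.Env`, and to update `reset` to return `(observation, info)` and `step` to return `(observation, reward, terminated, truncated, info)`.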

Apr 23 '24 10:04 araffin

The issue template asked for minimal viable working code, which is why I gave the steps. The details box also asked for these 4 functions, which is why I attached them.

My situation is similar. There is no error in the inheritance itself; the checker just wants me to switch to Gymnasium, which I can't do because it wouldn't install. I tried multiple times on different operating systems.

So it can be ignored.

Apr 23 '24 11:04 Hamza-101

What do you mean "it wouldn't install"?

> minimal viable working code which is why I gave the steps

It is far from being minimal. Providing an MRE would help.

Apr 27 '24 13:04 qgallouedec

> What do you mean "it wouldn't install"?

box2d wouldn't install whenever I tried to set up Gymnasium. I tried many times, manually and otherwise.

> It is far from being minimal. Providing an MRE would help.

This is the most "minimal" code I could provide while keeping it working.

Apr 27 '24 17:04 Hamza-101

Gymnasium and box2d are both maintained. If you can't install one of them, open an issue on its GitHub repository to sort this out. Plus, it seems like you've installed it:

> Gymnasium: 0.29.1

Please make sure you understand what minimal means: https://github.com/DLR-RM/stable-baselines3/issues/982#issuecomment-1197044014 and https://stackoverflow.com/help/minimal-reproducible-example. I should be able to copy-paste your code and run it. I don't have PPonew.py, for example.

Apr 27 '24 17:04 qgallouedec

Despite that, it won't work. I'll try that too. I know what minimal means, but I have provided the files, and this is the least amount of code needed to run it. I tried my best to keep it small.

Apr 27 '24 18:04 Hamza-101
