
TypeError: float() argument must be a string or a number, not 'dict'

gioannides opened this issue on Oct 13, 2022

Error:

Step #0.00 (0ms ?*RT. ?UPS, TraCI: 112ms, vehicles TOT 0 ACT 0 BUF 0)                    
Using cpu device
Wrapping the env with a `Monitor` wrapper
Wrapping the env in a DummyVecEnv.
 Retrying in 1 seconds
Traceback (most recent call last):
  File "ppo-trial.py", line 30, in <module>
    model.learn(total_timesteps=3600)
  File "/home/anaconda3/envs/sumo_rl_pip/lib/python3.7/site-packages/stable_baselines3/ppo/ppo.py", line 310, in learn
    reset_num_timesteps=reset_num_timesteps,
  File "/home/anaconda3/envs/sumo_rl_pip/lib/python3.7/site-packages/stable_baselines3/common/on_policy_algorithm.py", line 230, in learn
    total_timesteps, eval_env, callback, eval_freq, n_eval_episodes, eval_log_path, reset_num_timesteps, tb_log_name
  File "/home/anaconda3/envs/sumo_rl_pip/lib/python3.7/site-packages/stable_baselines3/common/base_class.py", line 421, in _setup_learn
    self._last_obs = self.env.reset()  # pytype: disable=annotation-type-mismatch
  File "/home/anaconda3/envs/sumo_rl_pip/lib/python3.7/site-packages/stable_baselines3/common/vec_env/dummy_vec_env.py", line 62, in reset
    self._save_obs(env_idx, obs)
  File "/home/anaconda3/envs/sumo_rl_pip/lib/python3.7/site-packages/stable_baselines3/common/vec_env/dummy_vec_env.py", line 92, in _save_obs
    self.buf_obs[key][env_idx] = obs
TypeError: float() argument must be a string or a number, not 'dict'
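
For context on where the dict comes from: with `single_agent=False`, `SumoEnvironment.reset()` returns one observation per traffic signal, keyed by signal ID, while SB3's `DummyVecEnv` allocates a flat float buffer and writes the observation into it. A minimal sketch of that mismatch follows; the signal ID 't' and the observation length are made up for illustration:

import numpy as np

# In multi-agent mode, reset() returns a dict of per-signal observations,
# e.g. {'t': array([...])} -- 't' is a hypothetical traffic-signal ID.
multi_agent_obs = {'t': np.zeros(11, dtype=np.float32)}

# DummyVecEnv._save_obs effectively does buf_obs[env_idx] = obs on a float
# buffer, which forces a float() conversion of the dict and raises the same
# kind of TypeError as above.
buf_obs = np.zeros((1, 11), dtype=np.float32)
buf_obs[0] = multi_agent_obs  # TypeError: float() argument must be ... not 'dict'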

Code:

import gym
from stable_baselines3.ppo import PPO
import os
import sys
if 'SUMO_HOME' in os.environ:
    tools = os.path.join(os.environ['SUMO_HOME'], 'tools')
    sys.path.append(tools)
else:
    sys.exit("Please declare the environment variable 'SUMO_HOME'")
from sumo_rl import SumoEnvironment
import traci


if __name__ == '__main__':

    env = SumoEnvironment(
        net_file='./terminal-.net.xml',
        route_file='./terminal-.rou.xml',
        out_csv_name='ppo-',
        single_agent=False,  # multi-agent mode: reset() returns a dict of per-signal observations
        use_gui=True,
        waiting_time_memory=2000,
        num_seconds=3600,
    )

    model = PPO(
        env=env,
        policy="MlpPolicy",
        learning_rate=0.001,
        verbose=1
    )
    model.learn(total_timesteps=3600)

Network file: https://drive.google.com/file/d/1D5I75aG1s5wjX1FGgFvAq7IRawJD4BJR/view?usp=sharing

The network file is the same as in the previous issue I opened. I am trying to migrate to deep RL, so I am using Stable Baselines 3.
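
For anyone hitting the same error: SB3's algorithms only support single-agent Gym environments, so the dict observation from the multi-agent API cannot be consumed directly. A sketch of a likely workaround, assuming the network contains a single traffic signal (I have not checked the linked net file), is to switch to the single-agent interface:

import os
import sys

if 'SUMO_HOME' in os.environ:
    sys.path.append(os.path.join(os.environ['SUMO_HOME'], 'tools'))
else:
    sys.exit("Please declare the environment variable 'SUMO_HOME'")

from stable_baselines3.ppo import PPO
from sumo_rl import SumoEnvironment

env = SumoEnvironment(
    net_file='./terminal-.net.xml',
    route_file='./terminal-.rou.xml',
    out_csv_name='ppo-',
    single_agent=True,  # Gym-style API: reset() returns a single observation, not a dict
    use_gui=True,
    waiting_time_memory=2000,
    num_seconds=3600,
)

model = PPO(env=env, policy='MlpPolicy', learning_rate=0.001, verbose=1)
model.learn(total_timesteps=3600)

As far as I can tell, with several signals in the network `single_agent=True` only controls the first one; controlling all of them would need the multi-agent (PettingZoo) interface instead, which SB3 does not support out of the box.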
