
Facing issues while running `python -m tmrl --test`

Open Harish-ioc opened this issue 8 months ago • 6 comments

C:\Windows\System32>python -m tmrl --test
INFO:numexpr.utils:Note: NumExpr detected 16 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
INFO:numexpr.utils:NumExpr defaulting to 8 threads.
INFO:root:Namespace(server=False, trainer=False, worker=False, test=True, benchmark=False, record_reward=False, check_env=False, no_wandb=False, config={})
INFO:root:11/06/23 16:05:52 server IP: 127.0.0.1
C:\Users\haris\anaconda3\Lib\site-packages\gymnasium\core.py:311: UserWarning: WARN: env.default_action to get variables from other wrappers is deprecated and will be removed in v1.0, to get this variable you can do `env.unwrapped.default_action` for environment variables or `env.get_wrapper_attr('default_action')` that will search the reminding wrappers.
  logger.warn(
Exception in thread Thread-2 (__client_thread):
Traceback (most recent call last):
  File "C:\Users\haris\anaconda3\Lib\threading.py", line 1038, in _bootstrap_inner
    self.run()
  File "C:\Users\haris\anaconda3\Lib\threading.py", line 975, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\haris\anaconda3\Lib\site-packages\tmrl\custom\utils\tools.py", line 41, in __client_thread
    s.connect((self._host, self._port))
ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\haris\anaconda3\Lib\site-packages\tmrl\__main__.py", line 82, in <module>
    main(arguments)
  File "C:\Users\haris\anaconda3\Lib\site-packages\tmrl\__main__.py", line 41, in main
    rw.run_episodes(10000)
  File "C:\Users\haris\anaconda3\Lib\site-packages\tmrl\networking.py", line 648, in run_episodes
    self.run_episode(max_samples_per_episode, train=train)
  File "C:\Users\haris\anaconda3\Lib\site-packages\tmrl\networking.py", line 663, in run_episode
    obs, info = self.reset(collect_samples=False)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\haris\anaconda3\Lib\site-packages\tmrl\networking.py", line 561, in reset
    new_obs, info = self.env.reset()
                    ^^^^^^^^^^^^^^^^
  File "C:\Users\haris\anaconda3\Lib\site-packages\gymnasium\core.py", line 467, in reset
    return self.env.reset(seed=seed, options=options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\haris\anaconda3\Lib\site-packages\gymnasium\core.py", line 515, in reset
    obs, info = self.env.reset(seed=seed, options=options)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\haris\anaconda3\Lib\site-packages\gymnasium\wrappers\order_enforcing.py", line 61, in reset
    return self.env.reset(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\haris\anaconda3\Lib\site-packages\rtgym\envs\real_time_env.py", line 514, in reset
    elt, info = self.interface.reset(seed=seed, options=options)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\haris\anaconda3\Lib\site-packages\tmrl\custom\custom_gym_interfaces.py", line 157, in reset
    data, img = self.grab_data_and_img()
                ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\haris\anaconda3\Lib\site-packages\tmrl\custom\custom_gym_interfaces.py", line 134, in grab_data_and_img
    data = self.client.retrieve_data()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\haris\anaconda3\Lib\site-packages\tmrl\custom\utils\tools.py", line 72, in retrieve_data
    assert t_now - t_start < timeout, f"OpenPlanet stopped sending data since more than {timeout}s."
           ^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: OpenPlanet stopped sending data since more than 10.0s.

Harish-ioc avatar Nov 06 '23 11:11 Harish-ioc

Hi, it looks like OpenPlanet stopped communicating with tmrl for more than 10 seconds for some reason. When this happens, the environment raises an exception to avoid corrupting the replay buffer with meaningless samples in case OpenPlanet eventually responds after more than 10 seconds.
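To illustrate the kind of guard described above, here is a minimal sketch of a polling loop that gives up after a timeout. It is not tmrl's actual code: `wait_for_data` and `poll` are hypothetical names, and only the 10-second default is taken from the traceback.

```python
import time


def wait_for_data(poll, timeout=10.0, sleep_s=0.01, clock=time.monotonic):
    """Poll until data arrives; fail loudly once `timeout` seconds have passed.

    `poll` is a hypothetical callable that returns the latest data, or None
    if nothing has been received yet.
    """
    t_start = clock()
    while True:
        data = poll()
        if data is not None:
            return data
        # Give up instead of waiting forever, so that a late response
        # cannot be mistaken for a valid real-time sample.
        if clock() - t_start >= timeout:
            raise TimeoutError(f"No data received for more than {timeout}s.")
        time.sleep(sleep_s)
```

Failing fast here is a design choice: a sample that arrives long after it was requested no longer matches the action that was taken, so dropping the episode is safer than storing it.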

In which situation did you encounter this exception?

yannbouteiller avatar Nov 11 '23 05:11 yannbouteiller

Closing for inactivity. Please feel free to reopen if you encounter a similar issue.

yannbouteiller avatar Dec 08 '23 15:12 yannbouteiller

Hey, I have the exact same issue. I tried running both Trackmania and cmd as administrator.

C:\Windows\system32>python -m tmrl --test
INFO:root:03/16/24 18:50:41 server IP: 127.0.0.1
Exception in thread Thread-2:
Traceback (most recent call last):
  File "C:\Program Files\Python38\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\Program Files\Python38\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\Jojo\AppData\Roaming\Python\Python38\site-packages\tmrl\custom\utils\tools.py", line 41, in __client_thread
    s.connect((self._host, self._port))
ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it
Traceback (most recent call last):
  File "C:\Program Files\Python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\Jojo\AppData\Roaming\Python\Python38\site-packages\tmrl\__main__.py", line 84, in <module>
    main(arguments)
  File "C:\Users\Jojo\AppData\Roaming\Python\Python38\site-packages\tmrl\__main__.py", line 43, in main
    rw.run_episodes(10000)
  File "C:\Users\Jojo\AppData\Roaming\Python\Python38\site-packages\tmrl\networking.py", line 670, in run_episodes
    self.run_episode(max_samples_per_episode, train=train)
  File "C:\Users\Jojo\AppData\Roaming\Python\Python38\site-packages\tmrl\networking.py", line 688, in run_episode
    obs, info = self.reset(collect_samples=False)
  File "C:\Users\Jojo\AppData\Roaming\Python\Python38\site-packages\tmrl\networking.py", line 571, in reset
    new_obs, info = self.env.reset()
  File "C:\Users\Jojo\AppData\Roaming\Python\Python38\site-packages\gymnasium\core.py", line 467, in reset
    return self.env.reset(seed=seed, options=options)
  File "C:\Users\Jojo\AppData\Roaming\Python\Python38\site-packages\gymnasium\wrappers\order_enforcing.py", line 61, in reset
    return self.env.reset(**kwargs)
  File "C:\Users\Jojo\AppData\Roaming\Python\Python38\site-packages\rtgym\envs\real_time_env.py", line 514, in reset
    elt, info = self.interface.reset(seed=seed, options=options)
  File "C:\Users\Jojo\AppData\Roaming\Python\Python38\site-packages\tmrl\custom\custom_gym_interfaces.py", line 148, in reset
    data, img = self.grab_data_and_img()
  File "C:\Users\Jojo\AppData\Roaming\Python\Python38\site-packages\tmrl\custom\custom_gym_interfaces.py", line 125, in grab_data_and_img
    data = self.client.retrieve_data()
  File "C:\Users\Jojo\AppData\Roaming\Python\Python38\site-packages\tmrl\custom\utils\tools.py", line 72, in retrieve_data
    assert t_now - t_start < timeout, f"OpenPlanet stopped sending data since more than {timeout}s."
AssertionError: OpenPlanet stopped sending data since more than 10.0s.

PorkDevMode avatar Mar 16 '24 23:03 PorkDevMode

Hi @PorkDevMode, does this happen after a while, or are you unable to run the AI at all?

I see this in your traceback:

ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it

This seems to indicate that the localhost TCP connection with OpenPlanet could not be established.

yannbouteiller avatar Mar 17 '24 14:03 yannbouteiller

Nope, it just happens every time.

PorkDevMode avatar Mar 17 '24 18:03 PorkDevMode

Did you double-check that the OpenPlanet script is running properly? If it is, there is probably another app using port 9000. Sadly, at the moment there is no way to customize this port other than editing the OpenPlanet script directly.
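One quick way to check the connection side is to try opening a TCP connection to 127.0.0.1:9000 from Python while the game is running. The sketch below uses only the standard library; `is_listening` is a hypothetical helper name, not part of tmrl.

```python
import socket


def is_listening(host="127.0.0.1", port=9000, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError (nothing is listening) and timeouts.
        return False
```

Calling `is_listening()` with Trackmania open and the OpenPlanet script loaded should return `True`. A result of `False` matches the `WinError 10061` in the traceback: nothing is accepting connections on that port. Note that `True` only shows that some process accepted the connection, not that it is the OpenPlanet script.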

yannbouteiller avatar Mar 17 '24 23:03 yannbouteiller