BrokenPipeError after torch_mp.Process(target=learner_worker, ..) on Windows

Open Geniukx opened this issue 1 year ago • 0 comments

This is my first time raising an issue on GitHub, so if anything here is inappropriate, please point it out directly :)

Platform: Windows 11
Python version: 3.7.16

I ran python train.py --config configs/openai/d4pg/walker2d_d4pg.yml with num_steps_train set to 5_000. Everything works until the training step reaches 5000 and the console prints "Exit learner." After that, thread-related errors occur, as shown in the output below.

Could this be caused by an incompatibility with Windows, or with this Python version? Thanks a lot :)

Training step  4000
Agent: 1  episode 900
Agent: 3  episode 900
Agent: 0  episode 1000
Agent: 0  episode 500
Training step  5000
Exit learner.
Exception in thread Thread-1:
Traceback (most recent call last):
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\multiprocessing\connection.py", line 302, in _recv_bytes
    overlapped=True)
BrokenPipeError: [WinError 109] The pipe has been ended.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\threading.py", line 926, in _bootstrap_inner
    self.run()
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\site-packages\tensorboardX\event_file_writer.py", line 202, in run
    data = self._queue.get(True, queue_wait_duration)
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\multiprocessing\queues.py", line 108, in get
    res = self._recv_bytes()
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\multiprocessing\connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\multiprocessing\connection.py", line 321, in _recv_bytes
    raise EOFError
EOFError

Agent 0 done.
Exception in thread Thread-1:
Traceback (most recent call last):
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\multiprocessing\connection.py", line 302, in _recv_bytes
    overlapped=True)
BrokenPipeError: [WinError 109] The pipe has been ended.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\threading.py", line 926, in _bootstrap_inner
    self.run()
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\site-packages\tensorboardX\event_file_writer.py", line 202, in run
    data = self._queue.get(True, queue_wait_duration)
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\multiprocessing\queues.py", line 108, in get
    res = self._recv_bytes()
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\multiprocessing\connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\multiprocessing\connection.py", line 321, in _recv_bytes
    raise EOFError
EOFError

Stop sampler worker.
Agent 0 done.
Exception in thread Thread-1:
Traceback (most recent call last):
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\multiprocessing\connection.py", line 302, in _recv_bytes
    overlapped=True)
BrokenPipeError: [WinError 109] The pipe has been ended.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\threading.py", line 926, in _bootstrap_inner
    self.run()
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\site-packages\tensorboardX\event_file_writer.py", line 202, in run
    data = self._queue.get(True, queue_wait_duration)
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\multiprocessing\queues.py", line 108, in get
    res = self._recv_bytes()
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\multiprocessing\connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\multiprocessing\connection.py", line 321, in _recv_bytes
    raise EOFError
EOFError

Exception in thread Thread-1:
Traceback (most recent call last):
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\multiprocessing\connection.py", line 302, in _recv_bytes
    overlapped=True)
BrokenPipeError: [WinError 109] The pipe has been ended.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\threading.py", line 926, in _bootstrap_inner
    self.run()
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\site-packages\tensorboardX\event_file_writer.py", line 202, in run
    data = self._queue.get(True, queue_wait_duration)
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\multiprocessing\queues.py", line 108, in get
    res = self._recv_bytes()
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\multiprocessing\connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\multiprocessing\connection.py", line 321, in _recv_bytes
    raise EOFError
EOFError

Agent 3 done.
Exception in thread Thread-1:
Traceback (most recent call last):
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\multiprocessing\connection.py", line 302, in _recv_bytes
    overlapped=True)
BrokenPipeError: [WinError 109] The pipe has been ended.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\threading.py", line 926, in _bootstrap_inner
    self.run()
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\site-packages\tensorboardX\event_file_writer.py", line 202, in run
    data = self._queue.get(True, queue_wait_duration)
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\multiprocessing\queues.py", line 108, in get
    res = self._recv_bytes()
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\multiprocessing\connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\multiprocessing\connection.py", line 321, in _recv_bytes
    raise EOFError
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\site-packages\tensorboardX\event_file_writer.py", line 202, in run
    data = self._queue.get(True, queue_wait_duration)
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\multiprocessing\queues.py", line 108, in get
    res = self._recv_bytes()
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\multiprocessing\connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "D:\Development\Anaconda\envs\d4pg_3.7\lib\multiprocessing\connection.py", line 321, in _recv_bytes
    raise EOFError
EOFError
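
For context, the EOFError at the bottom of each traceback is what the standard multiprocessing connection raises when a reader waits on a pipe whose other end has been closed. The sketch below reproduces only that generic behaviour. It does not use oprl or tensorboardX, it is not a diagnosis of this issue, and the function name read_after_writer_closed is hypothetical.

```python
import multiprocessing as mp


def read_after_writer_closed():
    """Return the name of the exception raised when reading from a pipe
    whose write end has already been closed (hypothetical helper)."""
    # One-way pipe: `reader` can only receive, `writer` can only send.
    reader, writer = mp.Pipe(duplex=False)
    writer.close()  # simulate the sending side going away
    try:
        reader.recv_bytes()  # nothing can ever arrive on this pipe
    except EOFError as exc:
        return type(exc).__name__
    finally:
        reader.close()
    return None


if __name__ == "__main__":
    print(read_after_writer_closed())
```

If the same thing happens here, the tensorboardX writer thread would still be blocked in queue.get() when the pipe behind its queue is closed during shutdown. That is an assumption, not something the log confirms.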

Geniukx, May 05 '23 16:05