pensieve-pytorch

Failure with multiprocessing

Open zhh210 opened this issue 2 years ago • 4 comments

Is there a known issue with multiprocessing under Python 2.7? Running with --model_type=2 keeps failing with the following error:

Train ep:100,time use :86s

Testing model restored.
Process Process-1:
Traceback (most recent call last):
  File "/home/ec2-user/anaconda3/envs/pytorch_p27/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
    self.run()
  File "/home/ec2-user/anaconda3/envs/pytorch_p27/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "pensieve_torch.py", line 133, in central_agent
    s_batch, a_batch, r_batch, terminal, info = exp_queues[i].get()
  File "/home/ec2-user/anaconda3/envs/pytorch_p27/lib/python2.7/multiprocessing/queues.py", line 117, in get
    res = self._recv()
  File "/home/ec2-user/anaconda3/envs/pytorch_p27/lib/python2.7/site-packages/torch/multiprocessing/queue.py", line 22, in recv
    return pickle.loads(buf)
  File "/home/ec2-user/anaconda3/envs/pytorch_p27/lib/python2.7/pickle.py", line 1388, in loads
    return Unpickler(file).load()
  File "/home/ec2-user/anaconda3/envs/pytorch_p27/lib/python2.7/pickle.py", line 864, in load
    dispatch[key](self)
  File "/home/ec2-user/anaconda3/envs/pytorch_p27/lib/python2.7/pickle.py", line 1139, in load_reduce
    value = func(*args)
  File "/home/ec2-user/anaconda3/envs/pytorch_p27/lib/python2.7/site-packages/torch/multiprocessing/reductions.py", line 287, in rebuild_storage_fd
    fd = multiprocessing.reduction.rebuild_handle(df)
  File "/home/ec2-user/anaconda3/envs/pytorch_p27/lib/python2.7/multiprocessing/reduction.py", line 155, in rebuild_handle
    conn = Client(address, authkey=current_process().authkey)
  File "/home/ec2-user/anaconda3/envs/pytorch_p27/lib/python2.7/multiprocessing/connection.py", line 169, in Client
    c = SocketClient(address)
  File "/home/ec2-user/anaconda3/envs/pytorch_p27/lib/python2.7/multiprocessing/connection.py", line 308, in SocketClient
    s.connect(address)
  File "/home/ec2-user/anaconda3/envs/pytorch_p27/lib/python2.7/socket.py", line 228, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 2] No such file or directory

zhh210, Mar 08 '22 19:03