crnn.pytorch
ConnectionResetError: [Errno 104] Connection reset by peer
Exception ignored in: <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7f03df4789e8>>
Traceback (most recent call last):
  File "/root/anaconda3/envs/jiangxiluning-train3.5/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 399, in __del__
    self._shutdown_workers()
  File "/root/anaconda3/envs/jiangxiluning-train3.5/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 378, in _shutdown_workers
    self.worker_result_queue.get()
  File "/root/anaconda3/envs/jiangxiluning-train3.5/lib/python3.5/multiprocessing/queues.py", line 337, in get
    return ForkingPickler.loads(res)
  File "/root/anaconda3/envs/jiangxiluning-train3.5/lib/python3.5/site-packages/torch/multiprocessing/reductions.py", line 151, in rebuild_storage_fd
    fd = df.detach()
  File "/root/anaconda3/envs/jiangxiluning-train3.5/lib/python3.5/multiprocessing/resource_sharer.py", line 58, in detach
    return reduction.recv_handle(conn)
  File "/root/anaconda3/envs/jiangxiluning-train3.5/lib/python3.5/multiprocessing/reduction.py", line 181, in recv_handle
    return recvfds(s, 1)[0]
  File "/root/anaconda3/envs/jiangxiluning-train3.5/lib/python3.5/multiprocessing/reduction.py", line 152, in recvfds
    msg, ancdata, flags, addr = sock.recvmsg(1, socket.CMSG_LEN(bytes_size))
ConnectionResetError: [Errno 104] Connection reset by peer
epoch:3,step:65,Test loss:0.0013998962240293622,accuracy:0.0,train loss:0.0014686216600239277
The step 66,last lost:0.0014593754895031452, current: 0.0013998962240293622,save model!
Have you solved this problem? I have the same problem now.
the same problem +1
the same problem +1!
I changed the default of the --workers parameter on line 22 of train.py to 0, and the error is no longer reported during training. Change
parser.add_argument('--workers', type=int, help='number of data loading workers', default=2)
to
parser.add_argument('--workers', type=int, help='number of data loading workers', default=0)
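For anyone who would rather not edit train.py, here is a minimal sketch of the same workaround using only argparse. It assumes the script forwards the parsed --workers value to the DataLoader's num_workers argument; the helper names build_parser and loader_kwargs are hypothetical and not part of the repository. With num_workers=0, PyTorch loads batches in the main process, so no worker processes are started and the inter-process queue seen in the traceback is not used.

```python
import argparse


def build_parser():
    """Hypothetical stand-in for the argument parser in train.py."""
    parser = argparse.ArgumentParser()
    # default=0 means "load data in the main process" once the value
    # reaches torch.utils.data.DataLoader(num_workers=...).
    parser.add_argument('--workers', type=int,
                        help='number of data loading workers', default=0)
    return parser


def loader_kwargs(argv=None):
    """Return the keyword arguments we assume are passed to DataLoader."""
    opt = build_parser().parse_args(argv)
    # Assumption: train.py uses the value as DataLoader(num_workers=opt.workers).
    return {'num_workers': int(opt.workers)}


if __name__ == '__main__':
    # No flag given: falls back to the default of 0 workers.
    print(loader_kwargs([]))
    # An explicit flag still overrides the default.
    print(loader_kwargs(['--workers', '2']))
```

Because --workers is an ordinary command-line option, passing --workers 0 when launching the original script should have the same effect as changing the default. Loading in the main process can be slower, so this trades throughput for stability.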