Connection Refused Error

Open rw1995 opened this issue 5 years ago • 3 comments

Excuse me, can you help me? When I follow your steps to run the demo (python train_RFB.py -d VOC -v RFB_vgg -s 300), I get an error:

Loading base network...
Initializing weights...
Loading Dataset...
Training RFB_vgg on VOC0712
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=844 error=11 : invalid argument
Traceback (most recent call last):
  File "train_RFB.py", line 257, in <module>
    train()
  File "train_RFB.py", line 220, in train
    out = net(images)
  File "/home/csy/anaconda3/envs/pytorch0.4.0/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/csy/RFBNet/models/RFB_Net_vgg.py", line 186, in forward
    x = self.base[k](x)
  File "/home/csy/anaconda3/envs/pytorch0.4.0/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/csy/anaconda3/envs/pytorch0.4.0/lib/python3.5/site-packages/torch/nn/modules/conv.py", line 301, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuda runtime error (11) : invalid argument at /pytorch/aten/src/THC/THCGeneral.cpp:844
Exception ignored in: <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7f260fb67470>>
Traceback (most recent call last):
  File "/home/csy/anaconda3/envs/pytorch0.4.0/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 349, in __del__
    self._shutdown_workers()
  File "/home/csy/anaconda3/envs/pytorch0.4.0/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 328, in _shutdown_workers
    self.worker_result_queue.get()
  File "/home/csy/anaconda3/envs/pytorch0.4.0/lib/python3.5/multiprocessing/queues.py", line 345, in get
    return ForkingPickler.loads(res)
  File "/home/csy/anaconda3/envs/pytorch0.4.0/lib/python3.5/site-packages/torch/multiprocessing/reductions.py", line 70, in rebuild_storage_fd
    fd = df.detach()
  File "/home/csy/anaconda3/envs/pytorch0.4.0/lib/python3.5/multiprocessing/resource_sharer.py", line 57, in detach
    with _resource_sharer.get_connection(self._id) as conn:
  File "/home/csy/anaconda3/envs/pytorch0.4.0/lib/python3.5/multiprocessing/resource_sharer.py", line 87, in get_connection
    c = Client(address, authkey=process.current_process().authkey)
  File "/home/csy/anaconda3/envs/pytorch0.4.0/lib/python3.5/multiprocessing/connection.py", line 487, in Client
    c = SocketClient(address)
  File "/home/csy/anaconda3/envs/pytorch0.4.0/lib/python3.5/multiprocessing/connection.py", line 614, in SocketClient
    s.connect(address)
ConnectionRefusedError: [Errno 111] Connection refused

What is wrong? I am using Ubuntu 16.04 + CUDA 9 + cuDNN v7 + Anaconda3 + Python3+

rw1995, Dec 27 '18 05:12

@ruinmessi

rw1995, Dec 27 '18 05:12

I ran into this problem in my experiments too. Do you have a solution? @rw1995

linshoa, Jun 16 '19 09:06

@linshoa Hi, I ran into this problem too. Did you solve it? Thanks.

Damon2019, Aug 29 '19 12:08