PytorchSSD

[Errno 111] Connection refused

Open nagitam opened this issue 6 years ago • 22 comments

Hi, I am getting this error:

RuntimeError: The shape of the mask [8, 32756] at index 0 does not match the shape of the indexed tensor [262048, 1] at index 0
Exception ignored in: <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7f801b6ccf28>>

I am using PyTorch 0.4.0 and Python 3.5 (Anaconda 4.5.9). Any help? Thanks.

nagitam avatar Sep 24 '18 23:09 nagitam

Please refer to the code on the 0.4 branch; I am working on PyTorch 0.4.1.

lzx1413 avatar Sep 25 '18 06:09 lzx1413

Hi! When I train the SSD VGG model, I get this error:

Traceback (most recent call last):
  File "train_test.py", line 424, in <module>
    train()
  File "train_test.py", line 289, in train
    loss_l, loss_c = criterion(out, priors, targets)
  File "/home/lxt/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/lxt/study/PytorchSSD/layers/modules/multibox_loss.py", line 97, in forward
    _,loss_idx = loss_c.sort(1, descending=True)
RuntimeError: merge_sort: failed to synchronize: an illegal memory access was encountered

@lzx1413

lxtGH avatar Sep 29 '18 14:09 lxtGH

Thank you!

nagitam avatar Sep 29 '18 15:09 nagitam

@lxtGH This happens because the loss is NaN. You can ignore it, or reduce the learning rate (lr_rate) to avoid it.
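A minimal sketch of the "reduce the learning rate" suggestion: the helper below checks whether a scalar loss value is finite and, if it is not, returns a smaller learning rate. The names `next_learning_rate` and `decay`, and the choice to halve the rate, are assumptions made for illustration and are not part of PytorchSSD. In a training loop the scalar would come from something like `loss.item()`.

```python
import math


def next_learning_rate(lr, loss_value, decay=0.5):
    """Return a reduced learning rate when the loss is NaN or infinite.

    Hypothetical helper: `decay` and the halving policy are assumptions,
    not values taken from the PytorchSSD code base.
    """
    if math.isfinite(loss_value):
        return lr          # loss looks healthy, keep the current rate
    return lr * decay      # NaN/inf loss: back off before the next step


lr = 4e-3
for loss_value in (2.7, float("nan"), float("inf")):
    lr = next_learning_rate(lr, loss_value)
print(lr)
```

Here the rate is left alone for the finite loss and halved once for each non-finite one. This only reacts after a bad loss has been observed; it does not by itself prevent the error raised inside the loss module.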

lzx1413 avatar Oct 02 '18 09:10 lzx1413

I am using PyTorch 0.3.1, CUDA 8.0, Ubuntu 16.

Traceback (most recent call last):
  File "/home/cv2018/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 333, in __del__
    self._shutdown_workers()
  File "/home/cv2018/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 319, in _shutdown_workers
    self.data_queue.get()
  File "/home/cv2018/anaconda3/lib/python3.6/multiprocessing/queues.py", line 337, in get
    return _ForkingPickler.loads(res)
  File "/home/cv2018/anaconda3/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 70, in rebuild_storage_fd
    fd = df.detach()
  File "/home/cv2018/anaconda3/lib/python3.6/multiprocessing/resource_sharer.py", line 57, in detach
    with _resource_sharer.get_connection(self._id) as conn:
  File "/home/cv2018/anaconda3/lib/python3.6/multiprocessing/resource_sharer.py", line 87, in get_connection
    c = Client(address, authkey=process.current_process().authkey)
  File "/home/cv2018/anaconda3/lib/python3.6/multiprocessing/connection.py", line 487, in Client
    c = SocketClient(address)
  File "/home/cv2018/anaconda3/lib/python3.6/multiprocessing/connection.py", line 614, in SocketClient
    s.connect(address)
ConnectionRefusedError: [Errno 111] Connection refused

Eileen2014 avatar Oct 12 '18 00:10 Eileen2014

@lzx1413 Hello sir, did you run the demo correctly on PyTorch 0.4.1? I tried running the demo on PyTorch 0.4.0, 0.4.1, and 0.3.1, and all of them gave the error [Errno 111] Connection refused. Can you help me with that? Thanks a lot.

moyan007 avatar Nov 22 '18 11:11 moyan007

Hello, I ran into the same problem. What causes it? Is it Ubuntu or the PyTorch version?

weycui avatar Dec 10 '18 01:12 weycui

It is the PyTorch version.

moyan007 avatar Dec 10 '18 02:12 moyan007

Could you tell me which version combination works better? I am currently using PyTorch 0.4 with Python 3.6.

weycui avatar Dec 10 '18 02:12 weycui

As far as I remember, 0.3.1 runs as-is, and 0.4+ also runs after some changes. The changes should be in the multiBoxLoss part.

moyan007 avatar Dec 10 '18 02:12 moyan007

OK, thanks. I will read the source code more carefully.

weycui avatar Dec 10 '18 02:12 weycui

Hello, have you solved this problem? Could you explain it to me?

InvictusY avatar Dec 11 '18 10:12 InvictusY

Not yet.

weycui avatar Dec 11 '18 10:12 weycui

Sorry to bother you, but I still have not solved this connection error. How did you solve it?

weycui avatar Dec 18 '18 02:12 weycui

RuntimeError: The shape of the mask [8, 32756] at index 0 does not match the shape of the indexed tensor [262048, 1] at index 0

This problem can be solved as follows:

1. Change
     loss_c[pos] = 0
   to:
     loss_c = loss_c.view(num, -1)
     loss_c[pos] = 0

2. Change
     N = num_pos.data.sum()
   to:
     N = num_pos.data.sum().double()
     loss_l = loss_l.double()
     loss_c = loss_c.double()
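A short sketch of why the reshape in the first change lines the two shapes up, using only the numbers from the error message: 8 * 32756 = 262048, so the indexed tensor appears to hold the same elements as the mask, flattened to [batch * num_priors, 1]. The function name `shape_after_view` is hypothetical; it only mimics the shape arithmetic of `.view(num, -1)` in plain Python and does not touch any tensors.

```python
# Shapes copied from the error message in this thread.
mask_shape = (8, 32756)       # pos: [batch, num_priors]
indexed_shape = (262048, 1)   # loss_c before the reshape: [batch * num_priors, 1]


def shape_after_view(shape, num):
    """Shape that `.view(num, -1)` would give for a tensor of `shape`.

    Hypothetical helper for illustration only; it works on shape tuples.
    """
    total = 1
    for dim in shape:
        total *= dim
    if total % num != 0:
        raise ValueError("element count is not divisible by num")
    return (num, total // num)


num = mask_shape[0]  # batch size
fixed_shape = shape_after_view(indexed_shape, num)
print(fixed_shape)                # (8, 32756)
print(fixed_shape == mask_shape)  # True
```

After the reshape, the boolean mask and the indexed tensor have the same shape, which is what mask indexing requires.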

miaoshuyu avatar Dec 19 '18 14:12 miaoshuyu

Hello, I have the same problem. With torch 0.4.0 I get the [Errno 111] Connection refused error. With torch 0.3.1 it runs, but when training SSD I get this error: value cannot be converted to type float without overflow: inf. Looking forward to your reply.

cherishfive avatar Mar 19 '19 01:03 cherishfive

Traceback (most recent call last):
  File "train_test.py", line 455, in <module>
    train()
  File "train_test.py", line 321, in train
    loss_l, loss_c = criterion(out, priors, targets)
  File "/home/yangyuze/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/yangyuze/PytorchSSD/layers/modules/multibox_loss.py", line 96, in forward
    loss_c[pos] = 0 # filter out pos boxes for now
RuntimeError: The shape of the mask [8, 32756] at index 0 does not match the shape of the indexed tensor [262048, 1] at index 0

@lzx1413 Hello author, I ran into this problem. What causes it? Could you offer some suggestions for solving it?

YYZ-rose avatar Mar 24 '19 12:03 YYZ-rose

This can be solved by following miaoshuyu's comment, or by updating PyTorch to 0.4.1 or higher.

oceanbomber avatar Mar 25 '19 08:03 oceanbomber

It seems that switching to PyTorch 0.4.1 works, while I got problems on PyTorch 0.3.1 and PyTorch 1.0. On 1.1, the error is "RuntimeError: merge_sort: failed to synchronize: an illegal memory access was encountered".

ZiqiChai avatar Mar 29 '19 05:03 ZiqiChai

Tested and confirmed working on PyTorch 1.0:

  1. Change loss_c[pos] = 0 to: loss_c[pos.view(-1,1)] = 0

  2. Change N = num_pos.data.sum() to: N = max(num_pos.data.sum().float(),1)
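A minimal sketch of what the clamp in the second change guards against, assuming (as is typical for SSD losses, though not shown in this thread) that N is the number of positive matches and is used as the divisor for both losses. With zero positives, an unguarded division has no finite result. The function `normalize_losses` is hypothetical and uses plain floats instead of tensors.

```python
def normalize_losses(loss_l, loss_c, num_pos_sum):
    """Divide both losses by the positive count, clamped to at least 1.

    Hypothetical helper; plain floats stand in for the loss tensors.
    """
    # Mirrors: N = max(num_pos.data.sum().float(), 1)
    n = max(float(num_pos_sum), 1.0)
    return loss_l / n, loss_c / n


print(normalize_losses(6.0, 9.0, 3))  # (2.0, 3.0)
print(normalize_losses(6.0, 9.0, 0))  # (6.0, 9.0): divisor clamped to 1
```

With the clamp, a batch that contains no positive matches leaves the losses unscaled instead of dividing by zero.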

LZZhideonbush avatar May 16 '19 12:05 LZZhideonbush

Tested and valid on PyTorch 0.4.0!

RuntimeError: The shape of the mask [8, 32756] at index 0 does not match the shape of the indexed tensor [262048, 1] at index 0

This problem can be solved as follows. Open the file ~/PytorchSSD-master/layers/modules/multibox_loss.py, then:

1. Replace
     loss_c[pos] = 0
   with:
     loss_c = loss_c.view(num, -1)
     loss_c[pos] = 0

2. Replace
     N = num_pos.data.sum()
   with:
     N = num_pos.data.sum().double()
     loss_l = loss_l.double()
     loss_c = loss_c.double()

feiXueQingXin avatar Dec 01 '19 14:12 feiXueQingXin

Regarding the ConnectionRefusedError: [Errno 111] Connection refused traceback above (PyTorch 0.3.1, CUDA 8.0, Ubuntu 16): has your problem been solved?

Alice-k98 avatar May 22 '20 17:05 Alice-k98