ssd.pytorch
RuntimeError: The shape of the mask [32, 8732] at index 0 does not match the shape of the indexed tensor [279424, 1] at index 0
rps@rps:~/桌面/ssd.pytorch$ python3 train.py
/home/rps/桌面/ssd.pytorch/ssd.py:34: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
self.priors = Variable(self.priorbox.forward(), volatile=True)
/home/rps/桌面/ssd.pytorch/layers/modules/l2norm.py:17: UserWarning: nn.init.constant is now deprecated in favor of nn.init.constant_.
init.constant(self.weight,self.gamma)
Loading base network...
Initializing weights...
train.py:214: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
init.xavier_uniform(param)
Loading the dataset...
Training SSD on: VOC0712
Using the specified args:
Namespace(basenet='vgg16_reducedfc.pth', batch_size=32, cuda=True, dataset='VOC', dataset_root='/home/rps/data/VOCdevkit/', gamma=0.1, lr=0.001, momentum=0.9, num_workers=4, resume=None, save_folder='weights/', start_iter=0, visdom=False, weight_decay=0.0005)
train.py:169: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
targets = [Variable(ann.cuda(), volatile=True) for ann in targets]
Traceback (most recent call last):
File "train.py", line 255, in
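The deprecation warnings in the log above name their own replacements. A minimal sketch of those replacements, assuming a recent PyTorch install; the tensors and the small conv layer are stand-ins chosen for illustration, not the repository's actual objects:

```python
import torch
import torch.nn as nn
import torch.nn.init as init

# Stand-in for the L2Norm weight: init.constant is replaced by init.constant_.
weight = nn.Parameter(torch.empty(8))
init.constant_(weight, 20.0)

# Stand-in layer: init.xavier_uniform is replaced by init.xavier_uniform_.
conv = nn.Conv2d(3, 4, kernel_size=3)
init.xavier_uniform_(conv.weight)

# Variable(..., volatile=True) is replaced by a torch.no_grad() block:
# nothing computed inside the block is tracked for autograd.
with torch.no_grad():
    out = weight * 2
```

These changes only silence the warnings; they are not related to the mask shape error itself.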
Can anyone help, please?
I have the same error, using PyTorch 0.4 + Python 3.5.
With Python 3.5 and PyTorch 0.3.0 there is no problem.
I have the same error. If I switch lines 96 and 97 in multibox_loss.py so that they read
loss_c = loss_c.view(num, -1)
loss_c[pos] = 0
this error disappears. But then another error appears:
File "/home/.../ssd.pytorch/layers/modules/multibox_loss.py", line 115, in forward
loss_l /= N
RuntimeError: Expected object of type torch.cuda.FloatTensor but found type torch.cuda.LongTensor for argument #3 'other'
The tensor types do not match. How can I fix it?
@xscjun change line:
N = num_pos.data.sum()
to:
N = num_pos.data.sum().double()
loss_l = loss_l.double()
loss_c = loss_c.double()
this should work
Has anyone solved this problem? Please help, thanks.
pos has shape torch.Size([32, 8732]), while loss_c has shape torch.Size([279424, 1]). I added one line before the masking step:
loss_c = loss_c.view(pos.size()[0], pos.size()[1]) #add line
loss_c[pos] = 0 # filter out pos boxes for now
loss_c = loss_c.view(num, -1)
Then it worked.
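The shape mismatch and the reshape fix can be reproduced on small tensors. This is a minimal sketch, assuming a recent PyTorch install; the sizes 4 and 6 are stand-ins for the batch size 32 and the 8732 priors, and the values are made up:

```python
import torch

num, num_priors = 4, 6  # stand-ins for 32 and 8732

# Boolean mask of positive boxes, shape [num, num_priors].
pos = torch.zeros(num, num_priors, dtype=torch.bool)
pos[0, 1] = True
pos[2, 3] = True

# Per-box confidence loss, still flattened to [num * num_priors, 1].
loss_c = torch.ones(num * num_priors, 1)

# Indexing the flat tensor with the 2-D mask fails, as in the issue title.
try:
    loss_c[pos] = 0
    mismatch_raised = False
except (IndexError, RuntimeError):
    mismatch_raised = True

# Reshape first so the mask and the tensor agree, then mask.
loss_c = loss_c.view(num, -1)
loss_c[pos] = 0  # filter out pos boxes for now
```

Reshaping before masking gives loss_c the same [num, num_priors] shape as pos, which is what the indexing requires.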
I have the same error. How did you finally solve it?
I have the same error, so how did you finally figure it out?
What file should be updated?
Change the data type of N to FloatTensor.
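A minimal sketch of that cast on small stand-in tensors, assuming a recent PyTorch install. It uses .float() where other replies in this thread use .double(); that is a choice made here for illustration, and the values are made up. The division by N is kept, as in the loss_l /= N line quoted from multibox_loss.py:

```python
import torch

# Stand-in mask of positive boxes: 3 positives across a batch of 2.
pos = torch.tensor([[True, False, True],
                    [False, False, True]])

# Counting positives yields an integer (Long) tensor.
num_pos = pos.long().sum(1, keepdim=True)

# Cast the count to a floating type before dividing the float losses by it.
N = num_pos.sum().float()

loss_l = torch.tensor(6.0)  # hypothetical summed localization loss
loss_c = torch.tensor(9.0)  # hypothetical summed confidence loss
loss_l = loss_l / N
loss_c = loss_c / N
```

Recent PyTorch versions promote integer and float operands automatically, so the explicit cast matters mainly on the older versions discussed in this thread.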
You may try updating the file /home/.../ssd.pytorch/layers/modules/multibox_loss.py and adding the one line that @LZP4GitHub described above.
@usherbob Using Python 3.6 + PyTorch 0.4.1, I added "loss_c = loss_c.view(pos.size()[0], pos.size()[1]) #add line", but now I have another issue:
RuntimeError: copy_if failed to synchronize: device-side assert triggered
Finally, I succeeded.
step1: switch the two lines 97,98 so that they read:
loss_c = loss_c.view(num, -1)
loss_c[pos] = 0 # filter out pos boxes for now
step2: change line 144
N = num_pos.data.sum()
to
N = num_pos.data.sum().double()
loss_l = loss_l.double()
loss_c = loss_c.double()
I made these changes, but there was still a RuntimeError:
RuntimeError: device-side assert triggered
How can I fix it? Looking forward to your reply. Thank you!
Changing the order of lines 97 and 98 throws a new error for me:
Traceback (most recent call last):
File "train.py", line 254, in <module>
train()
File "train.py", line 182, in train
loc_loss += loss_l.data[0]
IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number
Any suggestions?
PS: I also tried converting the loss to double as mentioned above and still got the same error!
### Solved
Apparently 'loss_l.data[0]' should be replaced with 'loss_l.item()'. This replacement applies to every loss_x.data[0] in the file!
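A minimal sketch of that replacement, assuming a recent PyTorch install; the loss values are made up, and the accumulator names loc_loss and conf_loss are assumed to mirror the ones in train.py:

```python
import torch

# Stand-ins for the scalar (0-dim) losses returned by the criterion.
loss_l = torch.tensor(1.5)
loss_c = torch.tensor(2.25)

# Indexing a 0-dim tensor fails, as in the traceback above.
try:
    loss_l.data[0]
    index_error = False
except IndexError:
    index_error = True

# .item() converts a 0-dim tensor to a plain Python number.
loc_loss = 0.0
conf_loss = 0.0
loc_loss += loss_l.item()
conf_loss += loss_c.item()
```

Because .item() returns a Python float, the running totals stay plain numbers and do not keep tensors alive between iterations.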
Great, but there is a small bug: it is line 114, not line 144.
If your PyTorch version is 0.4.1, you can make the following changes:
step1: switch the two lines 97,98 so that they read:
loss_c = loss_c.view(num, -1)
loss_c[pos] = 0 # filter out pos boxes for now
step2: change line 114
N = num_pos.data.sum()
to
N = num_pos.data.sum().double()
loss_l = loss_l.double()
loss_c = loss_c.double()
But if your PyTorch version is 1.0.1, that change does not help.
I solved the problem for PyTorch version 1.0.1. The solution consists of steps 1-3 below. Steps 1 and 2 change multibox_loss.py; step 3 changes train.py.
step1: switch the two lines 97,98 so that they read:
loss_c = loss_c.view(num, -1)
loss_c[pos] = 0 # filter out pos boxes for now
step2: change line 114
N = num_pos.data.sum()
to
N = num_pos.data.sum().double()
loss_l = loss_l.double()
loss_c = loss_c.double()
step3: change lines 188,189,193,196:
loss_l.data[0] >> loss_l.data
loss_c.data[0] >> loss_c.data
loss.data[0] >> loss.data
The loss is increasing, as shown below:
timer: 2.2050 sec.
iter 0 || Loss: 153.4730 || timer: 1.8316 sec.
iter 10 || Loss: 48.9679 || timer: 1.8920 sec.
iter 20 || Loss: 191.8098 || timer: 2.0969 sec.
iter 30 || Loss: 110.8081 || timer: 1.8849 sec.
iter 40 || Loss: 106.9749 || timer: 1.9373 sec.
iter 50 || Loss: 134.3674 || timer: 2.0012 sec.
.
.
Please help me solve this issue.
Thanks, that was useful for me, but for step 3 the lines are 183, 184, 188 and 191 (5 items): loss_x.data[0] >> loss_x.data or loss.data[0] >> loss.data
Wouldn't loss_x.data[0] >> loss_x.item() be better?
@TianSong1991 Thanks a lot. PyTorch 1.0 + Python 3.5: success!
Much obliged!
But the loss is nan.
@TianSong1991 Thanks a lot. PyTorch 1.0 + Python 3.5: success! But the loss is nan.
I have the same problem. Why is the loss nan?
Hi, why isn't loss_l divided by N?
Same problem here.
I used @LZP4GitHub's solution and it is working fine, but I don't understand the difference between that solution and this one: https://github.com/amdegroot/ssd.pytorch/pull/322
I have the same error, using PyTorch 1.1 + Python 3.6:
loss_c[pos] = 0 # filter out pos boxes for now
IndexError: The shape of the mask [32, 8732] at index 0 does not match the shape of the indexed tensor [279424, 1] at index 0
Pytorch version:
>>> import torch
>>> print(torch.__version__)
1.1.0
Python version:
Python 3.6.7 (default, Oct 22 2018, 11:32:17)
[GCC 8.2.0] on linux
multibox_loss.py:
Switch the two lines 97,98:
loss_c = loss_c.view(num, -1)
loss_c[pos] = 0 # filter out pos boxes for now
Change line 114:
N = num_pos.data.sum() -> N = num_pos.data.sum().double()
and change the following two lines to:
loss_l = loss_l.double()
loss_c = loss_c.double()
train.py:
loss_l.data[0] >> loss_l.data
loss_c.data[0] >> loss_c.data
loss.data[0] >> loss.data
And here is my output:
timer: 11.9583 sec.
iter 0 || Loss: 11728.9388 || timer: 0.2955 sec.
iter 10 || Loss: nan || timer: 0.2843 sec.
iter 20 || Loss: nan || timer: 0.2890 sec.
iter 30 || Loss: nan || timer: 0.2934 sec.
iter 40 || Loss: nan || timer: 0.2865 sec.
iter 50 || Loss: nan || timer: 0.2855 sec.
iter 60 || Loss: nan || timer: 0.2889 sec.
iter 70 || Loss: nan || timer: 0.2857 sec.
iter 80 || Loss: nan || timer: 0.2843 sec.
iter 90 || Loss: nan || timer: 0.2835 sec.
iter 100 || Loss: nan || timer: 0.2846 sec.
iter 110 || Loss: nan || timer: 0.2946 sec.
iter 120 || Loss: nan || timer: 0.2860 sec.
iter 130 || Loss: nan || timer: 0.2846 sec.
iter 140 || Loss: nan || timer: 0.2962 sec.
iter 150 || Loss: nan || timer: 0.2989 sec.
iter 160 || Loss: nan || timer: 0.2857 sec.
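When the loss turns into nan as in the output above, it can help to find the first iteration where the value stops being finite, so the inputs at that point can be inspected. This is a debugging sketch only, not a fix for the nan itself; the function name first_non_finite and the sample history are hypothetical:

```python
import math

def first_non_finite(losses):
    """Return the index of the first NaN/inf loss value, or None if all are finite."""
    for i, value in enumerate(losses):
        if not math.isfinite(value):
            return i
    return None

# Hypothetical per-iteration loss values, e.g. collected with loss.item().
history = [11728.9388, float("nan"), float("nan")]
bad_index = first_non_finite(history)
```

In a training loop, the same check could be applied to each loss value right after it is computed, stopping before the optimizer step once a non-finite value appears.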