
Zero-dimensional tensor concatenation problem

leobxpan opened this issue 7 years ago · 12 comments

Hi there,

Thank you for the code!

While training the ResNet50 model on the market1501 dataset, I got the following RuntimeError:

Traceback (most recent call last):
  File "examples/triplet_loss.py", line 232, in <module>
    main(parser.parse_args())
  File "examples/triplet_loss.py", line 151, in main
    trainer.train(epoch, train_loader, optimizer)
  File "/home/bxpan/.local/lib/python3.5/site-packages/open_reid-0.2.0-py3.5.egg/reid/trainers.py", line 33, in train
  File "/home/bxpan/.local/lib/python3.5/site-packages/open_reid-0.2.0-py3.5.egg/reid/trainers.py", line 83, in _forward
  File "/home/bxpan/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/bxpan/.local/lib/python3.5/site-packages/open_reid-0.2.0-py3.5.egg/reid/loss/triplet.py", line 26, in forward
RuntimeError: zero-dimensional tensor (at position 0) cannot be concatenated

The problem turns out to happen at this specific line in triplet.py: dist_ap = torch.cat(dist_ap). I printed out dist_ap: it is a Python list of zero-dimensional tensors (each prints a size of torch.Size([])); since I used a batch size of 64, the list has length 64:

[tensor(0.2895, device='cuda:0'), tensor(0.3334, device='cuda:0'), tensor(0.3334, device='cuda:0'), tensor(0.3175, device='cuda:0'), tensor(0.3078, device='cuda:0'), tensor(0.3078, device='cuda:0'), tensor(0.3045, device='cuda:0'), tensor(0.3045, device='cuda:0'), tensor(0.2636, device='cuda:0'), tensor(0.2630, device='cuda:0'), tensor(0.2497, device='cuda:0'), tensor(0.2636, device='cuda:0'), tensor(0.2967, device='cuda:0'), tensor(0.2657, device='cuda:0'), tensor(0.2967, device='cuda:0'), tensor(0.2936, device='cuda:0'), tensor(0.3517, device='cuda:0'), tensor(0.2939, device='cuda:0'), tensor(0.3517, device='cuda:0'), tensor(0.3185, device='cuda:0'), tensor(0.3318, device='cuda:0'), tensor(0.3357, device='cuda:0'), tensor(0.3260, device='cuda:0'), tensor(0.3357, device='cuda:0'), tensor(0.2928, device='cuda:0'), tensor(0.2906, device='cuda:0'), tensor(0.2928, device='cuda:0'), tensor(0.2906, device='cuda:0'), tensor(0.1992, device='cuda:0'), tensor(0.2086, device='cuda:0'), tensor(0.2086, device='cuda:0'), tensor(0.2040, device='cuda:0'), tensor(0.2742, device='cuda:0'), tensor(0.2836, device='cuda:0'), tensor(0.3117, device='cuda:0'), tensor(0.3117, device='cuda:0'), tensor(0.2838, device='cuda:0'), tensor(0.2686, device='cuda:0'), tensor(0.2435, device='cuda:0'), tensor(0.2838, device='cuda:0'), tensor(0.3124, device='cuda:0'), tensor(0.3268, device='cuda:0'), tensor(0.3304, device='cuda:0'), tensor(0.3304, device='cuda:0'), tensor(0.2591, device='cuda:0'), tensor(0.2671, device='cuda:0'), tensor(0.2825, device='cuda:0'), tensor(0.2825, device='cuda:0'), tensor(0.3309, device='cuda:0'), tensor(0.2836, device='cuda:0'), tensor(0.3126, device='cuda:0'), tensor(0.3309, device='cuda:0'), tensor(0.3232, device='cuda:0'), tensor(0.3493, device='cuda:0'), tensor(0.3493, device='cuda:0'), tensor(0.3379, device='cuda:0'), tensor(0.3044, device='cuda:0'), tensor(0.3173, device='cuda:0'), tensor(0.3173, device='cuda:0'), tensor(0.3009, device='cuda:0'), tensor(0.2941, device='cuda:0'), tensor(0.3048, device='cuda:0'), tensor(0.3048, device='cuda:0'), tensor(0.2704, device='cuda:0')]

The tensor values themselves look fine, but the concatenation fails. Any idea what the problem is?

Thank you very much.

Boxiao

leobxpan avatar May 08 '18 14:05 leobxpan

Hi,

I got the same problem recently. I think it is connected to a newer version of pytorch.

What worked for me is replacing torch.cat with torch.stack (a minimal sketch below), but I am not entirely sure whether this fix has side effects.
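
For illustration, a minimal sketch with made-up values standing in for the dist_ap list from triplet.py:

import torch

# a list of zero-dimensional tensors, as produced on PyTorch >= 0.4
dist_ap = [torch.tensor(0.29), torch.tensor(0.33), torch.tensor(0.31)]

# torch.cat(dist_ap)  # RuntimeError: zero-dimensional tensor (at position 0) cannot be concatenated
dist_ap = torch.stack(dist_ap)  # stacks the scalars into a 1-D tensor
print(dist_ap)  # tensor([0.2900, 0.3300, 0.3100])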

Regards Frank

frhf avatar May 10 '18 19:05 frhf

Hi Frank,

Sorry for the late response. I tried your solution and it works. I'm not sure whether torch.cat() could concatenate zero-dimensional tensors in earlier versions of PyTorch; if it could, this is likely down to a version change (a sketch of my understanding is below). It would be best to invite the author to check this @Cysu
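
Here is my reading of the 0.4 change, sketched with plain tensors (my understanding only, not confirmed by the author): reductions and integer indexing now return zero-dimensional tensors, which torch.cat refuses, while torch.stack adds the missing dimension itself:

import torch

x = torch.randn(4)
print(x.max().shape)  # torch.Size([]) on PyTorch >= 0.4, i.e. zero-dimensional
print(x[0].shape)     # torch.Size([]) as well

parts = [x.max(), x.min()]
# torch.cat(parts)         # RuntimeError: zero-dimensional tensor cannot be concatenated
print(torch.stack(parts))  # works: a 1-D tensor of shape (2,)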

Boxiao

leobxpan avatar May 16 '18 01:05 leobxpan

Yeah, I came across the same issue and my PyTorch version is 0.4. Hope the author @Cysu can look into this.

lijianhackthon avatar Jun 06 '18 13:06 lijianhackthon

Actually, this issue is fixed for me on 0.4.1.

diaoenmao avatar Aug 08 '18 15:08 diaoenmao

@dem123456789 It works fine with PyTorch 0.3.0. I saw this error on PyTorch 0.4.0, so I upgraded to 0.4.1; the problem still exists.

insikk avatar Sep 10 '18 03:09 insikk

I'm having a similar problem, where I can't concatenate the elements in a list of zero-dimensional tensors:

import torch
from torch.autograd import Variable


def basic_fun(x_cloned):
    res = []
    for i in range(len(x_cloned)):
        # indexing a 1-D tensor gives a zero-dimensional tensor on PyTorch >= 0.4
        res.append(x_cloned[i] * x_cloned[i])
    print(res)
    return torch.cat(res)  # raises: torch.cat cannot concatenate 0-dim tensors


def get_grad(inp, grad_var):
    A = basic_fun(inp)
    A.backward()
    return grad_var.grad


x = Variable(torch.FloatTensor([1, 2]), requires_grad=True)
x_cloned = x.clone()
print(get_grad(x_cloned, x))

Here are my terminal logs:

[tensor(1., grad_fn=<ThMulBackward>), tensor(4., grad_fn=<ThMulBackward>)]
Traceback (most recent call last):
  File "<path>/playground.py", line 23, in <module>
    print(get_grad(x_cloned, x))
  File "<path>/playground.py", line 16, in get_grad
    A = basic_fun(inp)
  File "<path>/playground.py", line 12, in basic_fun
    return torch.cat(res)
RuntimeError: zero-dimensional tensor (at position 0) cannot be concatenated

mhyousefi avatar Sep 15 '18 06:09 mhyousefi

Alright, so apparently I need to use torch.stack(res, dim=0) instead. It produces: tensor([1., 4.], grad_fn=<StackBackward>)

mhyousefi avatar Sep 15 '18 07:09 mhyousefi

I'm having a similar problem, where I can't concatenate the elements in a list of zero-dimensional tensors:

def basic_fun(x_cloned):
    res = []
    for i in range(len(x_cloned)):
        res.append(x_cloned[i] * x_cloned[i])
    print(res)
    return torch.cat(res)

By indexing items of a one-dimensional tensor with a plain integer you get zero-dimensional tensors, which cannot be concatenated. To get one-dimensional tensors instead, you can index with x_cloned[i, None]; a sketch follows below.
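
A quick sketch of that fix applied to the loop above (same names as the earlier snippet):

import torch

x_cloned = torch.tensor([1.0, 2.0])
print(x_cloned[1].shape)        # torch.Size([])  - zero-dimensional
print(x_cloned[1, None].shape)  # torch.Size([1]) - one-dimensional

res = [x_cloned[i, None] * x_cloned[i, None] for i in range(len(x_cloned))]
print(torch.cat(res))  # tensor([1., 4.])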

Side note: I am not sure what you are doing in production, but element-wise multiplication in PyTorch is easily done using the * operator:

def basic_fun(x_cloned):
    return x_cloned * x_cloned

Florian1990 avatar Oct 15 '18 15:10 Florian1990

Another option is to use unsqueeze to turn a 0-dim tensor into a 1-dim tensor: res.append((x_cloned[i] * x_cloned[i]).unsqueeze(0))
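
Applied to the earlier example, for instance:

import torch

x_cloned = torch.tensor([1.0, 2.0])
res = []
for i in range(len(x_cloned)):
    # unsqueeze(0) turns the 0-dim product into a 1-dim tensor of size 1
    res.append((x_cloned[i] * x_cloned[i]).unsqueeze(0))
print(torch.cat(res))  # tensor([1., 4.])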

GR4HAM avatar Jan 22 '19 11:01 GR4HAM

Well, it seems that earlier versions supported that operation, but from 0.4 on you need to unsqueeze the tensors in the list before concatenating. You can do that with:

for i in range(len(lst)): lst[i] = torch.unsqueeze(lst[i], dim=-1)

So the list should look like this: [tensor([0.2895], device='cuda:0'), tensor([0.3895], device='cuda:0'), ...]

themis0888 avatar Feb 07 '19 11:02 themis0888

I can run the code with PyTorch 0.4.1. You need to change triplet.py (reid/loss/triplet.py). [screenshot of the edit]
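
Roughly, the edit replaces torch.cat with torch.stack at the line from the traceback. A sketch, not the verbatim file (the dist_an line is an assumption that mirrors dist_ap):

# reid/loss/triplet.py, sketch of the change: build the 1-D distance tensors
# with torch.stack instead of torch.cat, since the lists hold
# zero-dimensional tensors on PyTorch >= 0.4
dist_ap = torch.stack(dist_ap)
dist_an = torch.stack(dist_an)  # assumed to mirror the dist_ap line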

wujunyi627 avatar Mar 10 '19 02:03 wujunyi627

What worked for me is replacing torch.cat with torch.stack, but I am not entirely sure whether this fix has side effects.

It's useful.

xinyi97 avatar Mar 30 '22 08:03 xinyi97