
RuntimeError: The size of tensor a (784) must match the size of tensor b (28) at non-singleton dimension 3

raniasaidi opened this issue on May 29, 2020 · 6 comments

Hi, hope you are doing well. When I run python train.py, I get this error:

model [VIGANModel] was created
create web directory ./checkpoints/experiment_name/web...
Start training
step 1
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1569: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
  warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/loss.py:432: UserWarning: Using a target size (torch.Size([1, 1, 28, 28])) that is different to the input size (torch.Size([1, 784])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
  return F.mse_loss(input, target, reduction=self.reduction)
Traceback (most recent call last):
  File "train.py", line 34, in <module>
    model.optimize_parameters_pretrain_AE()
  File "/content/VIGAN/models/VIGAN_model.py", line 282, in optimize_parameters_pretrain_AE
    self.backward_AE_pretrain()
  File "/content/VIGAN/models/VIGAN_model.py", line 184, in backward_AE_pretrain
    self.loss_AE_pre = self.criterionAE(AErealA, self.real_A) + self.criterionAE(AErealB, self.real_A)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/loss.py", line 432, in forward
    return F.mse_loss(input, target, reduction=self.reduction)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 2542, in mse_loss
    expanded_input, expanded_target = torch.broadcast_tensors(input, target)
  File "/usr/local/lib/python3.6/dist-packages/torch/functional.py", line 62, in broadcast_tensors
    return _VF.broadcast_tensors(tensors)
RuntimeError: The size of tensor a (784) must match the size of tensor b (28) at non-singleton dimension 3

Is there any solution for this? Thank you.

raniasaidi · May 29, 2020

Hi,

Thanks for your question. I haven't faced this problem before. Based on the log, it seems the two tensors passed to the loss have different sizes on this line:

"self.loss_AE_pre = self.criterionAE(AErealA, self.real_A) + self.criterionAE(AErealB, self.real_A)"

You can debug the code and check the variables on this line. Hope it will be useful for you.
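
For example, a minimal debugging sketch (the shapes below come from the warning in the log above; the flattening workaround assumes both autoencoder outputs are flat [N, 784] tensors, which the log suggests but I haven't verified against the repo):

    # Inside backward_AE_pretrain, before computing the loss:
    print(AErealA.shape)      # e.g. torch.Size([1, 784])
    print(self.real_A.shape)  # e.g. torch.Size([1, 1, 28, 28])

    # One possible workaround: flatten the image target so both sides are [N, 784]
    target_A = self.real_A.view(self.real_A.size(0), -1)
    self.loss_AE_pre = self.criterionAE(AErealA, target_A) + self.criterionAE(AErealB, target_A)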

Best.

chaoshangcs · May 29, 2020

Hi @chaoshangcs, hope you are doing well. I ran the project in Google Colab and I got the same issue as @raniasaidi. The error occurs on the line you mentioned and is probably due to the autoencoder loss function. Since you said you have not faced this problem before, I think it's a PyTorch version issue (I had a CUDA version issue as well, which I fixed). Google Colab currently ships torch 1.5.0+cu101 and torchvision 0.6.0+cu101. Can you tell us which torch and torchvision versions you used for this project, please? Thanks in advance!

amirosDev · Jun 01, 2020

Hi @amirosDev, nice to e-meet you. Thanks for suggesting a possible way to solve this problem. We followed the code (https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix) provided by Junyan, so the PyTorch version should be 0.4.1+. The CycleGAN project provides its requirements here: https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix/blob/master/requirements.txt. Hope that helps. As you guessed, since PyTorch has been updated several times since then, that might be causing the problem @raniasaidi mentioned. Thanks again for your help.
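
For reference, pinning versions of that vintage in a Colab cell might look like the sketch below (the exact pins are my reading of "0.4.1+"; check the linked requirements.txt for the authoritative list):

    # '!' runs a shell command in a Colab cell. These pins follow the
    # CycleGAN requirements mentioned above -- verify against the linked file.
    !pip install torch==0.4.1 torchvision==0.2.1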

chaoshangcs · Jun 01, 2020

Hi again @chaoshangcs, thanks a lot for your answer. As I said previously, it's a problem with the torch and torchvision versions. I have now fixed it and run the code on Colab with PyTorch 0.1.12.post2 (since it's an old version, you have to install it manually; follow the instructions at https://pytorch.org/get-started/previous-versions/#old-pytorch-linux-binaries-compiled-with-cuda-75 for Linux users) and torchvision 0.1.9.
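
For anyone reproducing this, the install in a Colab cell might look like the sketch below (the wheel URL follows the naming pattern on the linked previous-versions page; double-check it there against your Python and CUDA versions before running):

    # '!' runs a shell command in Colab. The CUDA-7.5 / Python-3.6 wheel name
    # below follows the pattern listed on the previous-versions page linked
    # above -- verify the exact URL there before installing.
    !pip install https://download.pytorch.org/whl/cu75/torch-0.1.12.post2-cp36-cp36m-linux_x86_64.whl
    !pip install torchvision==0.1.9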

amirosDev · Jun 08, 2020

@chaoshangcs, as I understand VIGAN, its purpose is to represent data in a common space and to impute missing views. When I run the code, I cannot find the common space (the matrix of features produced by the encoder of the DAE) anywhere in the output (I just get some images and .pth files). My question is: with the current code, can we extract the features of the input images? And what do the .pth files contain? Thanks in advance!

amirosDev · Jun 08, 2020

Hi @amirosDev, thanks for your question. First, it's great to hear that you solved the previous problem. Second, if my understanding of your question is right, we can look at Figure 2 in the paper. In the beginning, you can think of two spaces representing the two views. "G1" is the mapper that transfers the representation from view X (or space X) to view Y (or space Y), while "G2" is the mapper from view Y to view X. In addition, the DAE does have a step that extracts features from the two views. You can look at the code in /models/networks.py, where we define the AutoEncoder class. The 'x' in the forward function is the feature representation extracted from the two views. Hope it will be useful. Thanks!
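
If it helps, here is a minimal sketch of that idea (layer sizes and the encode helper are illustrative, not the repo's exact definitions; see models/networks.py for the real class):

    import torch.nn as nn

    class AutoEncoder(nn.Module):
        # Illustrative stand-in for the AutoEncoder in VIGAN/models/networks.py
        def __init__(self, in_dim=784, hidden_dim=64):
            super(AutoEncoder, self).__init__()
            self.encoder = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
            self.decoder = nn.Sequential(nn.Linear(hidden_dim, in_dim), nn.Sigmoid())

        def forward(self, input):
            x = self.encoder(input)  # 'x' is the common-space feature representation
            return self.decoder(x)

        def encode(self, input):
            # Hypothetical helper: call this on a trained model to dump the
            # common-space features instead of the reconstruction
            return self.encoder(input)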

chaoshangcs · Jun 08, 2020