
Dimension mismatch

Open · KyleHuo opened this issue 1 year ago · 1 comment

Thank you for sharing this nice work. I am trying to reproduce the results with the provided code, but I ran into a dimension-mismatch error. I am new to this field and to the dataset, so I cannot resolve it myself at the moment. I hope someone can help me.

```
starting epoch 0
Traceback (most recent call last):
  File "prediction/train.py", line 241, in <module>
    trainer.run()
  File "prediction/train.py", line 52, in run
    self.train(start_epoch)
  File "prediction/train.py", line 107, in train
    self.train_epoch(epoch)
  File "prediction/train.py", line 160, in train_epoch
    losses = self.model.loss(inputs, output)
  File "/home/hsz/Documents/jupyter/planning/video-gcp/gcp/prediction/models/tree/tree.py", line 73, in loss
    losses = super().loss(inputs, outputs, log_error_arr)
  File "/home/hsz/Documents/jupyter/planning/video-gcp/gcp/prediction/models/base_gcp.py", line 286, in loss
    weights=inputs.pad_mask[:, :reg_len][:, :, None])
  File "/home/hsz/Documents/jupyter/planning/video-gcp/blox/torch/losses.py", line 28, in __call__
    error = self.compute(*args, **kwargs) * weights
  File "/home/hsz/Documents/jupyter/planning/video-gcp/blox/torch/losses.py", line 57, in compute
    l2_loss = torch.nn.MSELoss(reduction='none')(estimates, targets)
  File "/home/hsz/anaconda3/envs/gcp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/hsz/anaconda3/envs/gcp/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 530, in forward
    return F.mse_loss(input, target, reduction=self.reduction)
  File "/home/hsz/anaconda3/envs/gcp/lib/python3.7/site-packages/torch/nn/functional.py", line 3279, in mse_loss
    expanded_input, expanded_target = torch.broadcast_tensors(input, target)
  File "/home/hsz/anaconda3/envs/gcp/lib/python3.7/site-packages/torch/functional.py", line 73, in broadcast_tensors
    return _VF.broadcast_tensors(tensors)  # type: ignore[attr-defined]
RuntimeError: The size of tensor a (255) must match the size of tensor b (200) at non-singleton dimension 1
```
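For reference, the failing call can be reproduced in isolation; the shapes below are guesses based on the error message, not the repo's actual tensor layouts:

```python
import torch

# MSELoss(reduction='none') still broadcasts its two inputs, so a 255-step
# prediction cannot be compared against a 200-step target along dimension 1.
estimates = torch.randn(4, 255, 3, 64, 64)  # e.g. one image per tree node (assumed shape)
targets = torch.randn(4, 200, 3, 64, 64)    # e.g. ground-truth frames (assumed shape)
try:
    torch.nn.MSELoss(reduction='none')(estimates, targets)
except RuntimeError as e:
    print(e)  # The size of tensor a (255) must match the size of
              # tensor b (200) at non-singleton dimension 1
```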

Also, I am using torch 1.12.1+cu113, since the version specified in the repo is too old; I don't know whether that is relevant. Thanks.

KyleHuo · May 13 '23, 03:05

Hi Kyle, it looks like the target selection for the tree prediction is failing. My guess is that 200 is the number of images in the ground-truth sequence and 255 is the number of images produced by the tree model. There is logic that selects which of the predicted images the loss is applied to, and it is somehow failing, possibly because of the different torch version. Unfortunately I don't have anything else to add right now, but you could try running the sequential (non-tree) model and checking whether that works.
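For illustration, here is a rough sketch of what that selection step could look like; the function name, the matching indices, and all shapes are hypothetical stand-ins, not video-gcp's actual code:

```python
import torch

def select_matched_predictions(tree_outputs, match_idx):
    """Pick one predicted frame per ground-truth timestep (hypothetical helper).

    tree_outputs: [batch, n_nodes, C, H, W] images, one per tree node.
    match_idx:    [batch, seq_len] long tensor mapping each ground-truth
                  timestep to the tree node that should predict it.
    """
    idx = match_idx[:, :, None, None, None].expand(
        -1, -1, *tree_outputs.shape[2:])
    return torch.gather(tree_outputs, 1, idx)

# A full binary tree of depth 8 has 2**8 - 1 = 255 nodes, which matches the
# 255 in the error message; the ground-truth sequence has 200 frames.
tree_outputs = torch.randn(4, 255, 3, 64, 64)
match_idx = torch.randint(0, 255, (4, 200))  # placeholder node-to-frame matching
selected = select_matched_predictions(tree_outputs, match_idx)
print(selected.shape)  # torch.Size([4, 200, 3, 64, 64])
```

If a selection step like this (or its equivalent in the repo) returned all 255 node predictions instead of the 200 matched ones, the masked MSE loss would fail with exactly the broadcast error above.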

orybkin · May 13 '23, 18:05