DeFeat-Net

A Random RuntimeError : Tensor of 0 Elements

Open • zzzh1y opened this issue 2 years ago • 1 comment

Hi, thanks for sharing!

I used the Monodepth2 training framework you recommended to train on my own dataset. Training runs normally and produces reasonable depth maps. However, a strange tensor shape error occurs at random: it is not tied to a specific point such as the first iteration or the end of an epoch, and can appear at any time. This makes it hard to train for multiple epochs. Do you have any clues?

The error is as follows:

Traceback (most recent call last):
  File "/mnt/Project/DeFeatNet_storz/train.py", line 18, in <module>
    trainer.train()
  File "/mnt/Project/DeFeatNet_storz/trainer.py", line 202, in train
    self.run_epoch()
  File "/mnt/Project/DeFeatNet_storz/trainer.py", line 218, in run_epoch
    losses = self.process_batch(inputs)
  File "/mnt/Project/DeFeatNet_storz/trainer.py", line 275, in process_batch
    losses = self.deFeatLoss(target_disps, target_features, support_features, poses,
  File "/root/miniconda3/envs/myconda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/Project/DeFeatNet_storz/losses/defeat_loss.py", line 64, in forward
    l_feat = [self.hca_loss(torch.cat((pred_feats, sf), dim=2), corr)
  File "/mnt/Project/DeFeatNet_storz/losses/defeat_loss.py", line 64, in <listcomp>
    l_feat = [self.hca_loss(torch.cat((pred_feats, sf), dim=2), corr)
  File "/root/miniconda3/envs/myconda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/Project/DeFeatNet_storz/losses/hierarchical_context_aggregation.py", line 78, in forward
    return sum(loss(feat, labels) for feat, loss in zip(feature_chunks, self._losses))
  File "/mnt/Project/DeFeatNet_storz/losses/hierarchical_context_aggregation.py", line 78, in <genexpr>
    return sum(loss(feat, labels) for feat, loss in zip(feature_chunks, self._losses))
  File "/root/miniconda3/envs/myconda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/Project/DeFeatNet_storz/losses/pixelwise_contrastive.py", line 59, in forward
    loss = self._positive_loss(source, target, source_kpts, target_kpts)[0]
  File "/mnt/Project/DeFeatNet_storz/losses/pixelwise_contrastive.py", line 69, in _positive_loss
    dist = self._calc_distance(source, target, source_kpts, target_kpts)
  File "/mnt/Project/DeFeatNet_storz/losses/pixelwise_contrastive.py", line 64, in _calc_distance
    source_descriptors = ops.extract_kpt_vectors(source, source_kpts).permute([0, 2, 1])
  File "/mnt/Project/DeFeatNet_storz/utils/ops.py", line 75, in extract_kpt_vectors
    return tensor[b_num, :, tmp_idx[:, 1], tmp_idx[:, 0]].reshape([batch_size, num_kpts, -1])
RuntimeError: cannot reshape tensor of 0 elements into shape [8, 0, -1] because the unspecified dimension size -1 can be any value and is ambiguous

zzzh1y, Aug 17 '22 04:08

Hi,

Thanks for the interest!

It looks like no correspondences are being found between the images, since num_kpts is 0. Looking at get_correspondences, this might happen if all coordinates are out of bounds or if all pixels in the image are removed by automasking.

Maybe it is worth visualizing the depth map at the state where it fails, or adding a check to ignore the contrastive loss if no correspondences are found.

Kind regards, Jaime

jspenmar, Aug 19 '22 08:08
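
The check suggested in the reply above could look roughly like the following. This is only a minimal sketch, not code from DeFeat-Net: `loss_or_zero` and `toy_loss` are hypothetical names, and the `(batch, num_kpts, 4)` layout used for `corr` is an assumption made for illustration. It first reproduces the reshape failure from the traceback, then shows a wrapper that returns a zero loss whenever the correspondence tensor is empty.

```python
import torch


def loss_or_zero(loss_fn, features: torch.Tensor, corr: torch.Tensor) -> torch.Tensor:
    """Call loss_fn(features, corr), or return a zero loss when corr is empty."""
    if corr.numel() == 0:
        # No correspondences survived filtering: contribute nothing to the total loss.
        return features.new_zeros(())
    return loss_fn(features, corr)


# The failure mode from the traceback: with zero elements, -1 cannot be inferred.
try:
    torch.empty(0).reshape(8, 0, -1)
    reshape_failed = False
except RuntimeError:
    reshape_failed = True


# Toy stand-in for the contrastive loss (hypothetical, not DeFeat-Net's hca_loss).
def toy_loss(features: torch.Tensor, corr: torch.Tensor) -> torch.Tensor:
    return features.mean() + corr.float().mean()


features = torch.ones(8, 4, 6, 6)
empty_corr = torch.zeros(8, 0, 4, dtype=torch.long)  # assumed layout: (batch, num_kpts, 4)
some_corr = torch.ones(8, 3, 4, dtype=torch.long)

skipped = loss_or_zero(toy_loss, features, empty_corr)   # guard triggers, loss_fn not called
computed = loss_or_zero(toy_loss, features, some_corr)   # loss_fn called as usual
```

In the traceback, the natural place for such a guard would be around the `self.hca_loss(...)` call in `defeat_loss.py`, though where exactly it belongs depends on how `corr` is built. Note that the zero returned here carries no gradient, so it is only safe when it is added to other loss terms that do. Logging how often the guard triggers may also help tell whether the empty batches come from out-of-bounds coordinates or from automasking.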