Hi, thanks for sharing your work!
I applied the Monodepth2 training framework you recommended to train on my own dataset. Training proceeds normally and produces reasonable depth maps. However, a weird tensor shape mismatch error happens at random -- it does not occur at a specific point such as the first iteration or the end of an epoch, it just appears at any time. This makes it hard to train for multiple epochs. Do you have any clues?
The error is as follows:
Traceback (most recent call last):
File "/mnt/Project/DeFeatNet_storz/train.py", line 18, in
trainer.train()
File "/mnt/Project/DeFeatNet_storz/trainer.py", line 202, in train
self.run_epoch()
File "/mnt/Project/DeFeatNet_storz/trainer.py", line 218, in run_epoch
losses = self.process_batch(inputs)
File "/mnt/Project/DeFeatNet_storz/trainer.py", line 275, in process_batch
losses = self.deFeatLoss(target_disps, target_features, support_features, poses,
File "/root/miniconda3/envs/myconda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/Project/DeFeatNet_storz/losses/defeat_loss.py", line 64, in forward
l_feat = [self.hca_loss(torch.cat((pred_feats, sf), dim=2), corr)
File "/mnt/Project/DeFeatNet_storz/losses/defeat_loss.py", line 64, in
l_feat = [self.hca_loss(torch.cat((pred_feats, sf), dim=2), corr)
File "/root/miniconda3/envs/myconda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/Project/DeFeatNet_storz/losses/hierarchical_context_aggregation.py", line 78, in forward
return sum(loss(feat, labels) for feat, loss in zip(feature_chunks, self._losses))
File "/mnt/Project/DeFeatNet_storz/losses/hierarchical_context_aggregation.py", line 78, in
return sum(loss(feat, labels) for feat, loss in zip(feature_chunks, self._losses))
File "/root/miniconda3/envs/myconda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/Project/DeFeatNet_storz/losses/pixelwise_contrastive.py", line 59, in forward
loss = self._positive_loss(source, target, source_kpts, target_kpts)[0]
File "/mnt/Project/DeFeatNet_storz/losses/pixelwise_contrastive.py", line 69, in _positive_loss
dist = self._calc_distance(source, target, source_kpts, target_kpts)
File "/mnt/Project/DeFeatNet_storz/losses/pixelwise_contrastive.py", line 64, in _calc_distance
source_descriptors = ops.extract_kpt_vectors(source, source_kpts).permute([0, 2, 1])
File "/mnt/Project/DeFeatNet_storz/utils/ops.py", line 75, in extract_kpt_vectors
return tensor[b_num, :, tmp_idx[:, 1], tmp_idx[:, 0]].reshape([batch_size, num_kpts, -1])
RuntimeError: cannot reshape tensor of 0 elements into shape [8, 0, -1] because the unspecified dimension size -1 can be any value and is ambiguous
Hi,
Thanks for the interest!
It seems like it is not finding any correspondences between the images (since num_kpts is 0).
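For reference, the reshape error itself is just PyTorch refusing to infer a -1 dimension on an empty tensor, which is why it only shows up when the keypoint set is empty. A minimal repro (shapes are illustrative, not the actual ones in extract_kpt_vectors):

```python
import torch

# With zero keypoints the gathered descriptor tensor has 0 elements,
# so the -1 dimension cannot be inferred and PyTorch raises.
descriptors = torch.empty(8, 0, 16)   # (batch, num_kpts=0, channels)
descriptors.reshape([8, 0, -1])       # RuntimeError: cannot reshape tensor of 0 elements ...
```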
Looking at get_correspondences, this might happen if all coordinates are out of bounds or if all pixels in the image are removed by automasking.
Maybe it is worth visualizing the depth map at the step where it fails, or adding a check to skip the contrastive loss when no correspondences are found (rough sketch below).
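Something along these lines (just a sketch with made-up names, adapt it to wherever you call the pixelwise contrastive loss) should keep training going and also let you save the failing sample for later inspection:

```python
import torch

def guarded_contrastive_loss(loss_fn, source, target, source_kpts, target_kpts,
                             disp=None, step=None):
    """Skip the contrastive term when no correspondences survive filtering.

    loss_fn: the pixelwise contrastive loss module/function (placeholder name).
    source_kpts / target_kpts: (B, N, 2) keypoint coordinates; N can be 0.
    disp, step: optional, only used to dump the failing sample for inspection.
    """
    if source_kpts.shape[1] == 0 or target_kpts.shape[1] == 0:
        if disp is not None and step is not None:
            # Save the disparity of the sample that produced zero correspondences
            # so it can be visualized afterwards.
            torch.save(disp.detach().cpu(), f"no_corr_step_{step}.pt")
        # Return a zero that is still connected to the graph so backward() is safe.
        return source.sum() * 0.0
    return loss_fn(source, target, source_kpts, target_kpts)
```

The same check could instead go directly inside extract_kpt_vectors, but skipping at the loss level keeps the rest of the pipeline untouched.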
Kind regards,
Jaime