torch.linalg.inv

Open wtishere opened this issue 1 year ago • 6 comments

Thanks for your excellent work! May I ask for a possible solution to the problem shown below? Thank you so much!

Traceback (most recent call last):
  File "experiments/dkm/train_DKMv3_outdoor.py", line 259, in <module>
    train(args)
  File "experiments/dkm/train_DKMv3_outdoor.py", line 250, in train
    wandb.log(megadense_benchmark.benchmark(model))
  File "/mnt/data-disk-1/home/cpii.local/wtwang/IM/codes/DKM/dkm/benchmarks/megadepth_dense_benchmark.py", line 72, in benchmark
    matches, certainty = model.match(im1, im2, batched=True)
  File "/mnt/data-disk-1/home/cpii.local/wtwang/IM/codes/DKM/dkm/models/dkm.py", line 695, in match
    dense_corresps = self.forward(batch, batched = True)
  File "/mnt/data-disk-1/home/cpii.local/wtwang/IM/codes/DKM/dkm/models/dkm.py", line 631, in forward
    dense_corresps = self.decoder(f_q_pyramid, f_s_pyramid)
  File "/mnt/data-disk-1/home/cpii.local/wtwang/miniconda3/envs/im/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/data-disk-1/home/cpii.local/wtwang/IM/codes/DKM/dkm/models/dkm.py", line 494, in forward
    new_stuff = self.gps[new_scale](f1_s, f2_s, dense_flow=dense_flow)
  File "/mnt/data-disk-1/home/cpii.local/wtwang/miniconda3/envs/im/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/data-disk-1/home/cpii.local/wtwang/IM/codes/DKM/dkm/models/dkm.py", line 360, in forward
    K_yy_inv = torch.linalg.inv(K_yy + sigma_noise)
torch._C._LinAlgError: linalg.inv: (Batch element 0): The diagonal element 512 is zero, the inversion could not be completed because the input matrix is singular.

wtishere, Oct 10 '23 08:10

I've never had this happen before. It should mean that the features from the encoder are extremely correlated. Is it a weird image pair?
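As a rough illustration of how strongly correlated features can produce this error, here is a minimal sketch. It assumes a plain inner-product kernel as a stand-in (not necessarily the kernel DKM uses), and the names `features` and `K` are made up for the example. Two identical feature vectors give two identical rows in the kernel matrix, so the factorization behind the inversion runs into a zero diagonal element.

```python
import torch

# Hypothetical toy features: rows 0 and 1 are identical (perfectly correlated).
features = torch.tensor(
    [[1.0, 0.0],
     [1.0, 0.0],
     [0.0, 1.0]],
    dtype=torch.float64,
)

# Inner-product kernel, used here only as a stand-in for illustration.
# Rows 0 and 1 of K are identical, so K is singular.
K = features @ features.T

# inv_ex reports failure through `info` instead of raising an exception.
_, info = torch.linalg.inv_ex(K)
print("info:", info.item())  # a nonzero value means the inversion failed
```

Calling `torch.linalg.inv(K)` on the same matrix would raise an error instead of returning a status code.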

Parskatt, Oct 10 '23 08:10

Thanks for your reply. I don't know whether the data contains a weird image pair. I used the MegaDepth dataset and followed your steps to set up the data structure. Do you have any idea how to solve this problem?

wtishere, Oct 11 '23 03:10

This seems to happen during the benchmark. You should be able to see the names of the images being sent in. If so, I can check whether I'm able to reproduce the issue.

Otherwise I'm not sure how to help.
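
One way to find the offending pair is to wrap the matching call and record which image names were being processed when it failed. This is only a sketch: `find_failing_pairs` and `match_fn` are hypothetical names, and `match_fn` stands in for whatever call does the matching (for example `model.match` in the traceback above). It also assumes the error derives from `RuntimeError`.

```python
from typing import Callable, Iterable, List, Tuple


def find_failing_pairs(
    pairs: Iterable[Tuple[str, str]],
    match_fn: Callable[[str, str], object],
) -> List[Tuple[str, str, str]]:
    """Run match_fn on every pair and collect the pairs that raise."""
    failures = []
    for im1_path, im2_path in pairs:
        try:
            match_fn(im1_path, im2_path)
        except RuntimeError as err:  # assumption: the linalg error is a RuntimeError
            print(f"Failed on pair: {im1_path} / {im2_path}: {err}")
            failures.append((im1_path, im2_path, str(err)))
    return failures


# Stand-in matcher for demonstration only; it fails on one made-up file name.
def fake_match(im1_path: str, im2_path: str) -> None:
    if "bad" in im1_path:
        raise RuntimeError("linalg.inv: the input matrix is singular")


pairs = [("a.jpg", "b.jpg"), ("bad.jpg", "c.jpg")]
failures = find_failing_pairs(pairs, fake_match)
```

The recorded names can then be posted in the issue so that the same pair can be run in isolation.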

Parskatt, Oct 11 '23 05:10

I have encountered the same error, also when training on MegaDepth. But I noticed something new: the loss value became 0 at some step, and then the error was raised. I think the zero loss leads to the error, but I do not know why the loss suddenly became zero. The weirdest thing is that I ran the code twice. The first run finished without any error, but in the second run (exactly the same code and devices) the loss became 0 and this error was raised: torch._C._LinAlgError: torch.linalg.inv: (Batch element 0): The diagonal element 1 is zero, the inversion could not be completed because the input matrix is singular.

MantangGuo, Jan 31 '24 14:01

If it's the MegaDepth training set, we don't use seeds, so it might be different image pairs etc. You might reduce the risk of this happening by increasing the diagonal term that we add here: https://github.com/Parskatt/DKM/blob/d1cba4bf1c96de1497aba031f709013404c08d5f/dkm/models/dkm.py#L355
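
A minimal sketch of what increasing the diagonal term could look like, assuming the inversion has the form `torch.linalg.inv(K_yy + sigma_noise)` as in the traceback above. The helper name `regularized_inverse` and the value `0.1` are illustrative assumptions, not the values used in DKM.

```python
import torch


def regularized_inverse(K_yy: torch.Tensor, sigma_noise: float = 0.1) -> torch.Tensor:
    """Invert K_yy + sigma_noise * I for a batch of square matrices.

    sigma_noise is a hypothetical scalar here; a larger value moves the
    matrix further away from being singular.
    """
    n = K_yy.shape[-1]
    eye = torch.eye(n, dtype=K_yy.dtype, device=K_yy.device)
    return torch.linalg.inv(K_yy + sigma_noise * eye)


# A singular toy "kernel matrix": every row is identical.
K_yy = torch.ones(1, 4, 4, dtype=torch.float64)

# With the diagonal term added, the inversion goes through.
K_yy_inv = regularized_inverse(K_yy, sigma_noise=0.1)

# Sanity check: (K_yy + 0.1 * I) @ K_yy_inv should be close to the identity.
eye = torch.eye(4, dtype=torch.float64)
residual = (K_yy + 0.1 * eye) @ K_yy_inv - eye
print("max residual:", residual.abs().max().item())
```

A larger diagonal term makes the matrix better conditioned, but it also changes the result of the regression, so it is a trade-off rather than a guaranteed fix.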

Parskatt, Jan 31 '24 14:01

Thanks for your quick reply. I will try to increase the diagonal term to reduce the risk.


MantangGuo, Jan 31 '24 15:01