Domain-Adaptation-Regression

error

Open ghost opened this issue 3 years ago • 2 comments

During training, I got the following error. The tensor below is the source feature produced by the model:

tensor([[nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        ...,
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan]], device='cuda:0', grad_fn=<ViewBackward>)

Traceback (most recent call last):
  File "train_rsd.py", line 212, in <module>
    rsd_loss = RSD(feature_s,feature_t)
  File "train_rsd.py", line 133, in RSD
    u_s, s_s, v_s = torch.svd(Feature_s.t())
RuntimeError: svd_cuda: For batch 0: U(37,37) is zero, singular U.
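Since the features are already all NaN by the time they reach torch.svd, the SVD error is probably a symptom and not the root cause. A minimal sketch of a check that could be placed just before the SVD call to report the problem earlier is shown here. The helper name check_finite and the 37 x 256 stand-in feature shape are assumptions made for illustration; they are not taken from train_rsd.py.

```python
import torch


def check_finite(name, tensor):
    """Raise a descriptive error if `tensor` contains NaN or Inf values.

    Hypothetical helper, not part of train_rsd.py.
    """
    finite = torch.isfinite(tensor)
    if not bool(finite.all()):
        n_bad = int((~finite).sum())
        raise FloatingPointError(
            f"{name} has {n_bad} non-finite values out of {tensor.numel()}"
        )
    return tensor


# Stand-in for the source features; a real run would use the model output.
feature_s = torch.randn(37, 256)
check_finite("feature_s", feature_s)        # passes silently for finite input
u_s, s_s, v_s = torch.svd(feature_s.t())    # same call as in the traceback

# A single NaN is enough to trigger the check.
bad = feature_s.clone()
bad[0, 0] = float("nan")
try:
    check_finite("feature_s", bad)
    message = ""
except FloatingPointError as err:
    message = str(err)
print(message)
```

If the check fires on the very first NaN batch, the iteration before it is the one to inspect. Enabling torch.autograd.set_detect_anomaly(True) for a short debugging run may also help locate the backward operation that first produces NaN, at the cost of slower training.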

ghost • Aug 25 '21 09:08

Did you solve this problem?

ZhaoZhibin • Mar 23 '22 12:03

I think it is caused by unstable gradients through the SVD, but I am not sure how to avoid it. See the warnings in the torch.linalg.svd documentation: https://pytorch.org/docs/stable/generated/torch.linalg.svd.html#torch.linalg.svd
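The linked documentation warns that gradients computed through U or Vh are only finite when the singular values are distinct, and become numerically unstable when two singular values are close. One common mitigation, sketched below, is to clip gradients and to skip any update whose loss or gradients are non-finite, so that a single bad batch cannot turn the weights (and then every feature) into NaN. This is a generic workaround under stated assumptions, not the repository's fix: safe_step, the nn.Linear stand-in model, and the singular-value loss are all hypothetical.

```python
import torch
from torch import nn


def safe_step(model, optimizer, loss, max_norm=1.0):
    """Apply one update; skip it if the loss or any gradient is non-finite.

    Hypothetical helper. Returns True if the optimizer step was taken.
    """
    optimizer.zero_grad()
    if not torch.isfinite(loss):
        return False
    loss.backward()
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    if any(not torch.isfinite(g).all() for g in grads):
        optimizer.zero_grad()  # discard the corrupted gradients
        return False
    # Bound the update size so one unstable SVD gradient cannot blow up weights.
    nn.utils.clip_grad_norm_(model.parameters(), max_norm)
    optimizer.step()
    return True


# Toy setup standing in for the real feature extractor.
torch.manual_seed(0)
model = nn.Linear(8, 4)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(16, 8)

# A loss built from singular values of the features.
_, s, _ = torch.linalg.svd(model(x), full_matrices=False)
ok = safe_step(model, optimizer, s.sum())

# A NaN loss is skipped and leaves the weights untouched.
before = model.weight.detach().clone()
skipped = safe_step(model, optimizer, model(x).sum() * float("nan"))
unchanged = torch.equal(before, model.weight.detach())
```

Lowering the learning rate or the weight on the SVD-based loss term may also help, but whether either is enough depends on the data and model.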

xuxu116 • Apr 01 '22 12:04