torch._C._LinAlgError: linalg.svd: (Batch element 0): The algorithm failed to converge because the input matrix is ill-conditioned or has too many repeated singular values (error code: 15).
C:\Users\LocalAdmin\anaconda3\envs\lightlyyolo\python.exe D:\Charis\SSL-yolo8\lightly-master\examples\pytorch\mmcr_yolo.py
WARNING ⚠️ no model scale passed. Assuming scale='n'.
class_name is: MMCR
save_path is: D:\Charis\SSL-yolo8\lightly-master\runs\MMCR
Starting Training
epoch: 00, loss: -2415920191337764664519950336.00000
after training
tensor([ -0.7926, -2.2815, -0.7858, -14.8213, -16.7507], device='cuda:0')
tensor([ -0.7926, -2.2815, -0.7858, -14.8213, -16.7507], device='cuda:0')
tensor([ -0.7926, -2.2815, -0.7858, -14.8213, -16.7507], device='cuda:0')
tensor([-0.4687, -0.7416, -0.3247, -4.7035, -5.2732], device='cuda:0')
after saving training + has backbone.load_state_dict
tensor([-0.4687, -0.7416, -0.3247, -4.7035, -5.2732], device='cuda:0')
tensor([-0.4687, -0.7416, -0.3247, -4.7035, -5.2732], device='cuda:0')
tensor([-0.4687, -0.7416, -0.3247, -4.7035, -5.2732], device='cuda:0')
tensor([-0.4687, -0.7416, -0.3247, -4.7035, -5.2732], device='cuda:0')
save full_path is: D:\Charis\SSL-yolo8\lightly-master\runs\MMCR\MMCR_coca_alldcm_MMCRTransform.pth
Saving model for MMCR_coca_alldcm_MMCRTransform.pth at Epoch 1
Finding optimal model params. Loss is dropping from -2415920191337764664519950336.0000 to -2415920191337764664519950336.0000
D:\Charis\SSL-yolo8\lightly-master\lightly\loss\mmcr_loss.py:60: UserWarning: torch.linalg.svd: During SVD computation with the selected cusolver driver, batches 0, 1, 2, 3, 4, and other 123 batches failed to converge. A more accurate method will be used to compute the SVD as a fallback. Check doc at https://pytorch.org/docs/stable/generated/torch.linalg.svd.html (Triggered internally at C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\cuda\linalg\BatchLinearAlgebraLib.cpp:703.)
_, S_z, _ = svd(z)
Traceback (most recent call last):
File "D:\Charis\SSL-yolo8\lightly-master\examples\pytorch\mmcr_yolo.py", line 158, in <module>
Process finished with exit code 1
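Hi, sorry for the late reply. It looks like the magnitude of your loss is far too large (-2415920191337764664519950336.00000, roughly -2.4e27 after the first epoch). With values on that scale, the embeddings passed to `svd(z)` in `mmcr_loss.py` are probably ill-conditioned as well, which would explain the convergence failure. Try decreasing the learning rate, and check your gradient values (clip them if necessary).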
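As a rough illustration of the suggestion above, here is a minimal sketch of a training step with a smaller learning rate and gradient-norm clipping. The model, the loss, the learning rate (`1e-3`), and the threshold (`max_grad_norm = 1.0`) are all placeholder assumptions, not values taken from your script or from the MMCR example; only `torch.nn.utils.clip_grad_norm_` and `torch.isfinite` are actual PyTorch calls.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-ins: a tiny model and a placeholder loss,
# not the MMCR/YOLO setup from the log above.
model = nn.Linear(8, 4)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)  # assumed, smaller learning rate
max_grad_norm = 1.0  # assumed clipping threshold; tune for your setup


def training_step(batch: torch.Tensor) -> tuple[float, float]:
    """Run one optimizer step with clipping; return (loss, gradient norm before clipping)."""
    optimizer.zero_grad()
    loss = model(batch).pow(2).mean()  # placeholder loss
    # Catches NaN/inf only; a finite but huge loss still has to be spotted in the logs.
    if not torch.isfinite(loss):
        raise RuntimeError(f"non-finite loss: {loss.item()}")
    loss.backward()
    # Rescales all gradients in place so their total norm is at most max_grad_norm,
    # and returns the total norm measured before clipping.
    total_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    optimizer.step()
    return loss.item(), float(total_norm)


# Deliberately large inputs so that the unclipped gradient norm exceeds the threshold.
loss_value, grad_norm = training_step(torch.randn(16, 8) * 100.0)
print(f"loss: {loss_value:.5f}, grad norm before clipping: {grad_norm:.5f}")
```

Logging the pre-clipping norm each step is a cheap way to see whether the gradients are exploding before the loss does. If that norm is consistently far above the threshold, lowering the learning rate is likely the more important fix, with clipping as a safety net.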