UltraLight-VM-UNet
train error
Excellent work! I installed the corresponding packages and attempted to train on a different dataset, resizing the images to 256x256 blocks, but then ran into the following issue. Could you please tell me what the problem is here:
#----------Creating logger----------#
#----------GPU init----------#
#----------Preparing dataset----------#
#----------Prepareing Models----------#
SC_Att_Bridge was used
#----------Prepareing loss, opt, sch and amp----------#
#----------Set other params----------#
#----------Training----------#
torch.Size([8, 3, 256, 256])
x: torch.Size([8, 24, 32, 32])
x1: torch.Size([8, 1024, 6])
Traceback (most recent call last):
  File "train.py", line 189, in <module>
    ...
Invoked with: tensor([[[-3.6921e-02, -2.2584e-02, -2.5043e-02, ..., -7.2660e-02, -7.2660e-02, -7.9289e-02]]], device='cuda:0', requires_grad=True), tensor([[-0.2176, -0.1239, -0.0767, -0.1056], ..., [ 0.0422, -0.3562, -0.0239, -0.1291]], device='cuda:0', requires_grad=True), Parameter containing: tensor([-0.0774, 0.0933, 0.1647, ..., 0.0697, -0.0496], device='cuda:0', requires_grad=True), None, None, None, True
Hi, based on your error message and the questions asked previously, this is most likely a data-preparation issue: the data were not written correctly into the '.npy' files. We recommend preprocessing your data according to the 'Prepare your own dataset' section. Alternatively, first try to reproduce the results on the ISIC2017 dataset (2000 images); that will let you rule out a mismatch with your environment or hardware.
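For reference, the preprocessing described above can be sketched roughly as follows. This is a minimal sketch, not the repo's actual script: the function names, the nearest-neighbour resize, and the array layout are all assumptions; adapt them to whatever loader.py actually expects before saving with np.save.

```python
import numpy as np

def nn_resize(img, size):
    """Nearest-neighbour resize of an (H, W[, C]) array to (size, size[, C])."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size  # source row index for each output row
    cols = np.arange(size) * w // size  # source column index for each output column
    return img[rows][:, cols]

def build_arrays(images, masks, size=256):
    """Stack resized image/mask pairs into the arrays a loader would
    read back from .npy files (layout here is an assumption)."""
    data = np.stack([nn_resize(im, size) for im in images])
    lbls = np.stack([nn_resize(mk, size) for mk in masks])
    return data, lbls

# smoke test with synthetic inputs of mixed original sizes
imgs = [np.zeros((200, 300, 3), np.uint8), np.zeros((512, 512, 3), np.uint8)]
msks = [np.zeros((200, 300), np.uint8), np.zeros((512, 512), np.uint8)]
data, lbls = build_arrays(imgs, msks)
print(data.shape, lbls.shape)  # (2, 256, 256, 3) (2, 256, 256)
```

The resulting arrays would then be written once with np.save (e.g. `np.save('data_train.npy', data)`; the exact file names the repo expects are given in the 'Prepare your own dataset' section).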
Thank you for your feedback. The issue was resolved after I reinstalled mamba_ssm==1.0.1; the previous version was 1.2.0. I have modified loader.py to read a different dataset for my task, but I am currently somewhat puzzled because, after training for multiple epochs on my task, the loss hardly decreases.
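For anyone hitting the same "Invoked with:" argument error, the fix above amounts to pinning the Mamba kernels to the tested version (assuming a pip-managed environment):

```shell
# downgrade from 1.2.0, whose selective-scan signature differs
pip uninstall -y mamba_ssm
pip install mamba_ssm==1.0.1
```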
Try checking the output and the DSC of the final results in the 'Output' folder. Also check whether the masks produced by your modified loader.py are normalized (values in [0, 1] rather than [0, 255]).
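A quick way to run the normalization check suggested above (a standalone sketch; `check_masks` is a hypothetical helper, not part of the repo):

```python
import numpy as np

def check_masks(masks):
    """Report whether mask values lie in [0, 1]; raw 0/255 PNG masks
    must be divided by 255 before they are used as training targets."""
    mn, mx = float(masks.min()), float(masks.max())
    normalized = 0.0 <= mn and mx <= 1.0
    print(f"min={mn}, max={mx}, normalized={normalized}")
    return normalized

raw = np.array([[0, 255], [255, 0]], dtype=np.uint8)  # raw 0/255 mask
check_masks(raw)          # reports normalized=False
check_masks(raw / 255.0)  # reports normalized=True
```

If the check fails, dividing the mask array by 255 in the loader (or in the preprocessing script) is usually the missing step.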
Hello, have you solved the problem of the loss not decreasing? If you use a custom dataset, how should the data be processed so that training works?