H-vmunet
Train
I used the breast cancer dataset BUSI for training and pre-processed the data, but I encountered a problem during training:

#----------Training----------#
Traceback (most recent call last):
  File "train.py", line 188, in <module>
Hi, based on your error message, we believe there was an issue during data preparation that caused a mismatch between the data values and the first convolution layer of the input. You should be able to get help from that issue. The training, validation and test sets need to be divided accurately and formatted correctly before being saved to the '.npy' files. Best regards.
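For example, a quick sanity check of the generated '.npy' files could look like the following sketch (the file names here are placeholders for whatever your preprocessing script saves):

import numpy as np

# Placeholder file names -- substitute the names your preprocessing script writes.
imgs = np.load('data_train.npy')
masks = np.load('mask_train.npy')

# Shapes, dtypes and value ranges should match what the model expects:
# images (N, 256, 256, 3) and masks (N, 256, 256).
print('images:', imgs.shape, imgs.dtype, imgs.min(), imgs.max())
print('masks :', masks.shape, masks.dtype, np.unique(masks))
assert imgs.shape[0] == masks.shape[0], 'image/mask counts must match'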
I handled the data preprocessing, and I printed the size of x right before it is passed into the model: torch.Size([8, 3, 256, 256]). The 647 printed at the beginning is the number of images in my BUSI dataset, and I split it 6:2:2, but I still hit the following problem. I changed num_classes to 2, but the same thing happens when it is set to 1.
628
629
...
646
647
Reading your dataset finished
(tool) panxue@user-NF5280M5:/mnt/data/panxue/PX/H-vmunet-main$ python train.py
Traceback (most recent call last):
  File "train.py", line 1, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'
(tool) panxue@user-NF5280M5:/mnt/data/panxue/PX/H-vmunet-main$ conda activate vmunet
(vmunet) panxue@user-NF5280M5:/mnt/data/panxue/PX/H-vmunet-main$ python train.py
#----------Creating logger----------#
#----------GPU init----------#
#----------Preparing dataset----------#
#----------Prepareing Models----------#
[H_SS2D] 2 order with dims= [8, 16] scale=0.3333
[H_SS2D] 2 order with dims= [8, 16] scale=0.3333
[H_SS2D] 3 order with dims= [8, 16, 32] scale=0.3333
[H_SS2D] 3 order with dims= [8, 16, 32] scale=0.3333
[H_SS2D] 4 order with dims= [8, 16, 32, 64] scale=0.3333
[H_SS2D] 4 order with dims= [8, 16, 32, 64] scale=0.3333
[H_SS2D] 5 order with dims= [8, 16, 32, 64, 128] scale=0.3333
[H_SS2D] 5 order with dims= [8, 16, 32, 64, 128] scale=0.3333
SC_Att_Bridge was used
[H_SS2D] 5 order with dims= [16, 32, 64, 128, 256] scale=0.3333
[H_SS2D] 5 order with dims= [16, 32, 64, 128, 256] scale=0.3333
[H_SS2D] 4 order with dims= [16, 32, 64, 128] scale=0.3333
[H_SS2D] 4 order with dims= [16, 32, 64, 128] scale=0.3333
[H_SS2D] 3 order with dims= [16, 32, 64] scale=0.3333
[H_SS2D] 3 order with dims= [16, 32, 64] scale=0.3333
[H_SS2D] 2 order with dims= [16, 32] scale=0.3333
[H_SS2D] 2 order with dims= [16, 32] scale=0.3333
#----------Prepareing loss, opt, sch and amp----------#
#----------Set other params----------#
#----------Training----------#
Traceback (most recent call last):
  File "train.py", line 188, in <module>
    main(config)
  File "train.py", line 131, in main
    train_one_epoch(
  File "/mnt/data/panxue/PX/H-vmunet-main/engine.py", line 38, in train_one_epoch
    out = model(images)
  File "/home/panxue/anaconda3/envs/vmunet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/panxue/anaconda3/envs/vmunet/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 169, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/panxue/anaconda3/envs/vmunet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/data/panxue/PX/H-vmunet-main/models/H_vmunet.py", line 381, in forward
    out = F.gelu(F.max_pool2d(self.ebn1(self.encoder1(x)),2,2))
  File "/home/panxue/anaconda3/envs/vmunet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/panxue/anaconda3/envs/vmunet/lib/python3.8/site-packages/torch/nn/modules/container.py", line 204, in forward
    input = module(input)
  File "/home/panxue/anaconda3/envs/vmunet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/panxue/anaconda3/envs/vmunet/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/panxue/anaconda3/envs/vmunet/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: no valid convolution algorithms available in CuDNN
This is my modified data preprocessing:

import os
import numpy as np
import scipy.misc as sc  # deprecated scipy API, kept because the original preprocessing script uses it

height, width, channels = 256, 256, 3  # values implied by the printed shapes below

train_number = 389
val_number = 129
test_number = 129
all_number = train_number + val_number + test_number  # 647; renamed to avoid shadowing the built-in all()

image_dir = "/mnt/data/panxue/PX/H-vmunet-main/data/images"
# Sort the listing so the train/val/test split is deterministic.
Tr_list = sorted(os.path.join(image_dir, f) for f in os.listdir(image_dir) if f.endswith('.png'))

Data_train_2018 = np.zeros([all_number, height, width, channels])
Label_train_2018 = np.zeros([all_number, height, width])

print('Reading')
print(len(Tr_list))
for idx in range(len(Tr_list)):
    print(idx + 1)
    img = sc.imread(Tr_list[idx])
    img = np.double(sc.imresize(img, [height, width, channels], interp='bilinear', mode='RGB'))
    Data_train_2018[idx, :, :, :] = img

    # Derive the mask path from the image file name.
    b = os.path.splitext(os.path.basename(Tr_list[idx]))[0]
    add = "/mnt/data/panxue/PX/H-vmunet-main/data/masks/" + b + '_mask.png'
    img2 = sc.imread(add)
    img2 = np.double(sc.imresize(img2, [height, width], interp='bilinear'))
    Label_train_2018[idx, :, :] = img2

print('Reading your dataset finished')
print('Train_img shape:', Train_img.shape)
print('Validation_img shape:', Validation_img.shape)
print('Test_img shape:', Test_img.shape)
print('Train_mask shape:', Train_mask.shape)
print('Validation_mask shape:', Validation_mask.shape)
print('Test_mask shape:', Test_mask.shape)

which prints:

Train_img shape: (389, 256, 256, 3)
Validation_img shape: (129, 256, 256, 3)
Test_img shape: (129, 256, 256, 3)
Train_mask shape: (389, 256, 256)
Validation_mask shape: (129, 256, 256)
Test_mask shape: (129, 256, 256)
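For completeness, the 6:2:2 split that produces these arrays is along the following lines (a reconstructed sketch; the np.save file names at the end are placeholders):

# Split by index: 389 train, 129 validation, 129 test (6:2:2 of 647).
Train_img      = Data_train_2018[:train_number]
Validation_img = Data_train_2018[train_number:train_number + val_number]
Test_img       = Data_train_2018[train_number + val_number:]

Train_mask      = Label_train_2018[:train_number]
Validation_mask = Label_train_2018[train_number:train_number + val_number]
Test_mask       = Label_train_2018[train_number + val_number:]

# Placeholder file names -- use whatever names train.py expects to load.
np.save('data_train.npy', Train_img)
np.save('data_val.npy', Validation_img)
np.save('data_test.npy', Test_img)
np.save('mask_train.npy', Train_mask)
np.save('mask_val.npy', Validation_mask)
np.save('mask_test.npy', Test_mask)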
Hi, we followed your steps and preprocessed 647 images (images are 24-bit PNG; masks are 8-bit PNG with 0 for background and 255 for the target), and H-vmunet trained normally. We recommend checking the '.npy' files by generating images from them and inspecting the result. Code to generate images (32-bit PNG) from a '.npy' file:
import matplotlib.pyplot as plt
import numpy as np
import os
from skimage.transform import resize

file_dir = r"train_mask.npy"  # npy file address.
dest_dir = r"./label_train"   # Directory where the generated png images are saved.

def npy_png(file_dir, dest_dir):
    if not os.path.exists(dest_dir):
        os.makedirs(dest_dir)
    con_arr = np.load(file_dir)
    for i in range(con_arr.shape[0]):  # iterate over however many slices the array holds
        arr = con_arr[i, :, :]
        # preserve_range keeps the original 0-255 values instead of rescaling to [0, 1].
        disp_to_img = resize(arr, (256, 256), preserve_range=True).astype('uint8')
        a = format(i, '04d')
        plt.imsave(os.path.join(dest_dir, a + ".png"), disp_to_img, cmap='gray')  # Select 'gray' for masks, 'plasma' for images.
        print('photo', a, 'finished')

if __name__ == "__main__":
    npy_png(file_dir, dest_dir)
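As a quicker numeric check of the same thing, the mask values can be inspected directly; for the 0/255 masks described above, np.unique should report only those two values:

import numpy as np

masks = np.load('train_mask.npy')
print(masks.shape, masks.dtype)
print('unique values:', np.unique(masks))  # expect only 0 and 255 for correct binary masks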
Also, may I ask whether you have been able to run one of the datasets we provide (e.g. the ISIC2017 dataset) properly? That would give you a template for preparing your own dataset, while ruling out system hardware and environment issues.
Thanks for the author's answer. We carefully studied the files you pointed out and found that the problem was in the environment; reinstalling the environment solved it.
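For anyone who hits the same "no valid convolution algorithms available in CuDNN" error: a minimal standalone test such as the sketch below can confirm that a rebuilt environment has a working CUDA/cuDNN stack before re-running the full training.

import torch

# Report the installed torch/CUDA/cuDNN versions.
print(torch.__version__, torch.version.cuda, torch.backends.cudnn.version())
print('cuda available:', torch.cuda.is_available())

# A single convolution on the GPU exercises the same cuDNN code path
# that raised the RuntimeError above.
x = torch.randn(8, 3, 256, 256, device='cuda')
conv = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1).to('cuda')
print(conv(x).shape)  # expected: torch.Size([8, 8, 256, 256])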