
train error

Open pengzinuo opened this issue 10 months ago • 2 comments

Hello, I ran into the following error when running train:

../aten/src/ATen/native/cuda/Loss.cu:92: operator(): block: [32,0,0], thread: [32,0,0] Assertion input_val >= zero && input_val <= one failed.
(the same assertion is repeated for threads [33,0,0] through [62,0,0] of block [32,0,0])
../aten/src/ATen/native/cuda/Loss.cu:92: operator(): block: [32,0,0], thread: [63,0,0] Assertion input_val >= zero && input_val <= one failed.
Traceback (most recent call last):
  File "/root/UltraLight-VM-UNet-main/train.py", line 189, in <module>
    main(config)
  File "/root/UltraLight-VM-UNet-main/train.py", line 132, in main
    train_one_epoch(
  File "/root/UltraLight-VM-UNet-main/engine.py", line 40, in train_one_epoch
    loss.backward()
  File "/usr/local/miniconda3/envs/vmunet/lib/python3.8/site-packages/torch/_tensor.py", line 487, in backward
    torch.autograd.backward(
  File "/usr/local/miniconda3/envs/vmunet/lib/python3.8/site-packages/torch/autograd/__init__.py", line 197, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: Unable to find a valid cuDNN algorithm to run convolution

Is there any way to solve this? I would be very grateful if you could help!

pengzinuo • Apr 21 '24 13:04

Hi, you can check issue 6, which covers a similar question. I hope it helps.

wurenkai • Apr 21 '24 14:04

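This problem is caused by the labels. Usually it means the labels in your dataset are out of range. For example, if your labels are 0, 1, 2, 255, the error is triggered whenever a 255 appears.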
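To check for this before training, something like the sketch below may help. It is not taken from the UltraLight-VM-UNet code: the function names `out_of_range_report` and `binarize_mask`, the 0/255 mask encoding, and the threshold of 127 are all assumptions for illustration. The assertion in the log is a range check on values expected to lie in [0, 1], so the sketch reports any entries outside that range (NaN included) and shows one possible way to remap a binary mask stored as 0/255.

```python
import numpy as np


def out_of_range_report(values, lo=0.0, hi=1.0):
    """Describe entries outside [lo, hi]; NaN is counted as out of range."""
    arr = np.asarray(values, dtype=np.float64)
    nan_mask = np.isnan(arr)
    # Comparisons with NaN evaluate to False, so NaN entries end up flagged.
    bad = ~((arr >= lo) & (arr <= hi))
    return {
        "num_bad": int(bad.sum()),
        "unique_bad": np.unique(arr[bad & ~nan_mask]).tolist(),
        "has_nan": bool(nan_mask.any()),
    }


def binarize_mask(mask, threshold=127):
    """Map a binary mask assumed to be stored as 0/255 to float32 0.0/1.0."""
    return (np.asarray(mask) > threshold).astype(np.float32)


# Hypothetical 2x2 mask stored as 0/255, as many image formats do.
mask = np.array([[0, 255], [255, 0]], dtype=np.uint8)

report = out_of_range_report(mask)   # flags the 255 entries
fixed = binarize_mask(mask)          # values are now 0.0 or 1.0
print(report)
print(out_of_range_report(fixed))
```

Running the same report on the model output just before the loss call can also help tell apart out-of-range labels from predictions that have become NaN. For multi-class labels where 255 marks pixels to ignore, thresholding is not appropriate; the labels would need to be remapped or excluded in whatever way the chosen loss supports.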

jiaweichaojwc • May 07 '24 06:05