PaddleSeg
Training U2NET on my own dataset fails partway through with 'cudaErrorLaunchFailure'. Any help appreciated!
Search before asking
- [X] I have searched the question and found no related answer.
Please ask your question
I am training the U2NET model on a dataset I annotated myself. At training iter==100, CUDA throws the following error:
2023-11-30 21:05:47 [INFO] [TRAIN] epoch: 3, iter: 390/1000, loss: nan, lr: 0.006419, batch_cost: 0.4377, reader_cost: 0.00000, ips: 4.5697 samples/sec | ETA 00:04:26
Error: C:\home\workspace\Paddle\paddle\phi\kernels\gpu\cross_entropy_kernel.cu:1010 Assertion `false` failed. The value of label expected >= 0 and < 2, or == 255, but got 2. Please check label value.
(the same assertion line is printed many more times)
Traceback (most recent call last):
  File "C:\Users\delight\Downloads\PaddleSeg-develop\PaddleSeg-develop\tools\train.py", line 213, in <module>
    main(args)
  File "C:\Users\delight\Downloads\PaddleSeg-develop\PaddleSeg-develop\tools\train.py", line 187, in main
    train(
  File "C:\Users\delight\Downloads\PaddleSeg-develop\PaddleSeg-develop\paddleseg\core\train.py", line 266, in train
    avg_loss += float(loss)
  File "D:\Anaconda\envs\paddle_env\lib\site-packages\paddle\fluid\dygraph\math_op_patch.py", line 117, in float
    return float(np.array(var).flatten()[0])
  File "D:\Anaconda\envs\paddle_env\lib\site-packages\paddle\fluid\dygraph\tensor_patch_methods.py", line 696, in array
    array = self.numpy(False)
OSError: (External) CUDA error(719), unspecified launch failure.
  [Hint: 'cudaErrorLaunchFailure'. An exception occurred on the device while executing a kernel. Common causes include dereferencing an invalid device pointer and accessing out of bounds shared memory. Less common cases can be system specific - more information about these cases can be found in the system specific user guide. This leaves the process in an inconsistent state and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched.] (at ..\paddle\phi\backends\gpu\cuda\cuda_info.cc:267)
Environment: RTX4070, python==3.8, CUDA==11.8, cuDNN==8.9, paddlepaddle==2.5.2
Config file:
_base_: '../_base_/cityscapes.yml'
batch_size: 2
iters: 1000
#iters: 160000
train_dataset:
  type: Dataset
  num_classes: 2
  dataset_root: C:\Users\delight\Downloads\PaddleSeg-develop\PaddleSeg-develop\custom_data\Oil
  train_path: C:\Users\delight\Downloads\PaddleSeg-develop\PaddleSeg-develop\custom_data\Oil\train.txt
  mode: train
val_dataset:
  type: Dataset
  num_classes: 2
  dataset_root: C:\Users\delight\Downloads\PaddleSeg-develop\PaddleSeg-develop\custom_data\Oil
  val_path: C:\Users\delight\Downloads\PaddleSeg-develop\PaddleSeg-develop\custom_data\Oil\val.txt
  mode: val
optimizer:
  type: SGD
  momentum: 0.9
  weight_decay: 4.0e-5
lr_scheduler:
  type: PolynomialDecay
  learning_rate: 0.01
  end_lr: 0
  power: 0.9
model:
  type: U2Net
  num_classes: 2
  pretrained: Null
loss:
  coef: [1, 1, 1, 1, 1, 1, 1]
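Hello, your config sets num_classes to 2, but your label data contains the value 2. With two classes, the labels should only contain 0 and 1 (255 is accepted as the ignore value, as the assertion message shows).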
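To find which annotation files carry the offending value before starting a training run, something along these lines may help. It is only a sketch: the helper names `invalid_label_values` and `scan_file_list` are hypothetical, not part of PaddleSeg, and it assumes Pillow is installed, the masks are single-channel PNGs (mode "L" or "P"), and the list file has one space-separated "image_path label_path" pair per line.

```python
from pathlib import Path


def invalid_label_values(histogram, num_classes, ignore_index=255):
    """Return {pixel_value: pixel_count} for values the loss would reject.

    `histogram[v]` is the number of pixels with value v. A value is valid
    if it is a class id in [0, num_classes) or equals `ignore_index`.
    """
    return {
        value: count
        for value, count in enumerate(histogram)
        if count > 0 and not (value < num_classes or value == ignore_index)
    }


def scan_file_list(list_path, dataset_root, num_classes):
    """Check every label mask named in a list file (hypothetical helper)."""
    from PIL import Image  # assumption: Pillow is available

    bad = {}
    for line in Path(list_path).read_text(encoding="utf-8").splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank lines and lines without a label path
        with Image.open(Path(dataset_root) / parts[1]) as mask:
            if mask.mode not in ("L", "P"):
                # Multi-channel masks need converting to class ids first.
                bad[parts[1]] = f"unexpected image mode {mask.mode}"
                continue
            found = invalid_label_values(mask.histogram(), num_classes)
        if found:
            bad[parts[1]] = found
    return bad


if __name__ == "__main__":
    # Synthetic histogram: values 0, 1 and 255 are fine; value 2 is not.
    demo = [0] * 256
    demo[0], demo[1], demo[2], demo[255] = 10, 5, 3, 1
    print(invalid_label_values(demo, num_classes=2))
```

Calling `scan_file_list` with the `train_path`, `dataset_root` and `num_classes` values from the config returns a dictionary keyed by label path. Any file listed there needs its mask re-exported or remapped so that only 0, 1 and 255 remain.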
This helped me in setting up my own custom dataset, thanks man