pytorch-deeplab-xception

RuntimeError: CUDA error: device-side assert triggered

Open ycc66104116 opened this issue 3 years ago • 4 comments

Hi, I recently tried running DeepLab V3+ on my own dataset, but I got the error below. I think it is related to the number of classes, but I am sure my dataset has 6 classes + 1 (background), and I have changed the number in utils.py and .py. I really don't know how to fix it. Does anyone know what causes this? I would really appreciate any help.

BTW, I previously ran the code successfully with 2 classes, but when I switched to another dataset and changed the number to 7, it raised this error. The .py file is modified from pascal.py: I only changed the class names, the number of classes, and the base directory to match my dataset. Everything else is the same as in pascal.py.

--------------my error message-----------
C:\w\b\windows\pytorch\aten\src\ATen\native\cuda\NLLLoss2d.cu:95: block: [0,0,0], thread: [708,0,0] Assertion t >= 0 && t < n_classes failed.
C:\w\b\windows\pytorch\aten\src\ATen\native\cuda\NLLLoss2d.cu:95: block: [0,0,0], thread: [709,0,0] Assertion t >= 0 && t < n_classes failed.
C:\w\b\windows\pytorch\aten\src\ATen\native\cuda\NLLLoss2d.cu:95: block: [0,0,0], thread: [710,0,0] Assertion t >= 0 && t < n_classes failed.
...
Traceback (most recent call last):
  File "train.py", line 388, in <module>
    main()
  File "train.py", line 374, in main
    trainer.training(epoch)
  File "train.py", line 134, in training
    loss = self.criterion(output, target)
  File "N:\pytorch-deeplab-xception-master\utils\loss.py", line 28, in CrossEntropyLoss
    loss = criterion(logit, target.long())
  File "C:\Users\LOC\anaconda3\envs\envfordeeplab1229\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\LOC\anaconda3\envs\envfordeeplab1229\lib\site-packages\torch\nn\modules\loss.py", line 1152, in forward
    label_smoothing=self.label_smoothing)
  File "C:\Users\LOC\anaconda3\envs\envfordeeplab1229\lib\site-packages\torch\nn\functional.py", line 2846, in cross_entropy
    return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
RuntimeError: CUDA error: device-side assert triggered
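The assertion `t >= 0 && t < n_classes` suggests that at least one pixel in a label image holds a value outside the range the loss accepts. Below is a minimal sketch for checking a mask before training. It uses NumPy rather than the repository's own loader, the helper name `find_invalid_labels` is hypothetical, and it assumes 255 is used as the ignore index (common for Pascal VOC-style masks; PyTorch's own default is -100), so adjust that to whatever your loss is configured with.

```python
import numpy as np

def find_invalid_labels(mask, num_classes, ignore_index=255):
    """Return the sorted label values in `mask` that the loss would reject.

    `ignore_index=255` is an assumption; use the value your loss is built with.
    """
    values = np.unique(np.asarray(mask))
    return [int(v) for v in values
            if v != ignore_index and not (0 <= v < num_classes)]

# Hypothetical 7-class mask (6 classes + background) containing one stray value.
mask = np.array([[0, 1, 2],
                 [6, 7, 255]])
print(find_invalid_labels(mask, num_classes=7))  # prints [7]
```

Running a check like this over every label file should show whether the masks contain raw grayscale or palette values instead of class indices 0 to 6.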

ycc66104116 • Apr 24 '22 04:04

I have run into the same problem as you. Have you solved it yet? A month ago I could run the code without any problem, but this time, after I fetched it with git again, it reported this error, which is strange.

xieyufei-SLAM • May 22 '22 10:05

Yes, I can run the code now, although I haven't tried multiple classes yet. For now my data contains only one kind of target plus background. I converted my label data to class indices, which means the label images contain only two values, 0 and 1 (because there are only 2 classes now).
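Converting label images to class indices, as described above, could look roughly like this. It is only a sketch under assumptions: the helper `remap_to_indices` and the raw values 0 and 255 are made up for illustration, and any pixel value not listed is set to 255 on the assumption that the loss treats 255 as its ignore index.

```python
import numpy as np

def remap_to_indices(mask, raw_values, ignore_value=255):
    """Map each raw pixel value in `raw_values` to its position in that list.

    Pixels whose value is not listed are set to `ignore_value` (an assumption).
    """
    mask = np.asarray(mask)
    out = np.full(mask.shape, ignore_value, dtype=np.uint8)
    for index, raw in enumerate(raw_values):
        out[mask == raw] = index
    return out

# Hypothetical binary mask stored as 0 (background) and 255 (target).
raw = np.array([[0, 255],
                [255, 0]])
labels = remap_to_indices(raw, raw_values=[0, 255])
print(labels.tolist())  # prints [[0, 1], [1, 0]]
```

For a 7-class dataset the same idea applies with seven entries in `raw_values`, so the resulting masks contain only the indices 0 to 6.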

ycc66104116 • May 30 '22 15:05

Quoting xieyufei-SLAM: "I have run into the same problem as you. Have you solved it yet? A month ago I could run the code without any problem, but this time, after I fetched it with git again, it reported this error, which is strange."

Hey, did you fix the problem?

chiba1sonny • Oct 18 '22 09:10

Quoting xieyufei-SLAM: "I have run into the same problem as you. Have you solved it yet? A month ago I could run the code without any problem, but this time, after I fetched it with git again, it reported this error, which is strange."

Hello, did you fix the problem?

123Bruceche • Nov 15 '22 12:11