CUDA device-side assert error

Open varshaneya opened this issue 4 years ago • 5 comments

Model I am using: LayoutLM

During training, I get a CUDA device-side assert error in the forward pass. A screenshot of the error is attached below:

image

During evaluation, I get a CUDA device-side assert error when I try to move data from the CPU to the GPU. A screenshot of the error is attached below:

image

PyTorch version: 1.6.0, CUDA toolkit: 10.1, OS: Ubuntu 16.04, Python: 3.6.10 (Anaconda)

Can somebody help me understand why this error is coming up?

varshaneya avatar Sep 07 '20 12:09 varshaneya

@varshaneya, which GPU do you use?

wolfshow avatar Sep 11 '20 01:09 wolfshow

@wolfshow I use an NVIDIA DGX.

varshaneya avatar Sep 11 '20 09:09 varshaneya

@varshaneya, this error message is quite vague. When I get it, I try running on the CPU, where the error message is clearer.
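A minimal sketch of the CPU approach, not taken from the original training script: the helper name `forward_on_cpu` is hypothetical, and the out-of-range embedding index is only an assumed example of something that can trigger a device-side assert on the GPU. The actual cause in this issue is not known from the thread.

```python
import torch
import torch.nn as nn


def forward_on_cpu(module, *inputs):
    """Run one forward pass on the CPU so failures raise a readable Python error."""
    module_cpu = module.to("cpu")  # note: moves the module's parameters in place
    cpu_inputs = [x.to("cpu") if torch.is_tensor(x) else x for x in inputs]
    with torch.no_grad():
        return module_cpu(*cpu_inputs)


# Illustrative setup (assumption): an embedding table with 10 rows
# and an input that contains the invalid index 10.
embedding = nn.Embedding(num_embeddings=10, embedding_dim=4)
bad_ids = torch.tensor([[1, 2, 10]])

try:
    forward_on_cpu(embedding, bad_ids)
    error_message = None
except (IndexError, RuntimeError) as exc:
    # On the CPU the exception is raised at the failing call itself.
    error_message = str(exc)

print(error_message)
```

If the CPU run points at an embedding lookup, it may be worth checking that token ids, labels, and any position or box coordinates fed to the model stay within the sizes the model was configured with.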

ruifcruz avatar Nov 14 '20 16:11 ruifcruz

Try running the same script with CUDA_LAUNCH_BLOCKING=1 set at the start; it will give us more information on the actual issue.
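One way to apply this from Python is sketched below. The helper names `launch_blocking_env` and `run_with_launch_blocking` and the script name `train.py` are hypothetical. The variable needs to be in place before the process initializes CUDA, so this sketch sets it in the environment of a child process.

```python
import os
import subprocess
import sys


def launch_blocking_env(base_env=None):
    """Return a copy of the environment with CUDA_LAUNCH_BLOCKING=1 set."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_with_launch_blocking(script_args):
    """Run a Python script in a child process with synchronous CUDA kernel launches."""
    return subprocess.run(
        [sys.executable, *script_args],
        env=launch_blocking_env(),
        check=False,  # let the caller inspect the return code and traceback
    )


# Example usage (hypothetical script name):
# result = run_with_launch_blocking(["train.py"])
# print(result.returncode)
```

With synchronous launches, the Python traceback should point at the operation that actually failed rather than at a later, unrelated call such as a CPU-to-GPU copy. Training will run slower, so this is best used only while debugging.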

sreejith3534 avatar Dec 07 '20 07:12 sreejith3534

@varshaneya, were you able to resolve the above issue?

jyotiyadav94 avatar Jun 21 '22 09:06 jyotiyadav94