unilm
CUDA device-side assert error
Model I am using: LayoutLM
During training, I get a CUDA device-side assert error while performing the forward pass. Attached below is the screenshot of the error:
During evaluation, I get a CUDA device-side assert error when I try to move data from the CPU to the GPU. Attached below is the screenshot of the error:
PyTorch version: 1.6.0
CUDA toolkit: 10.1
OS: Ubuntu 16.04
Python: 3.6.10 (Anaconda)
Can somebody help me figure out why this error is occurring?
@varshaneya, which GPU do you use?
@wolfshow I use an NVIDIA DGX.
@varshaneya this error message is quite vague. When I get it, I rerun the script on the CPU, where the error message is much clearer.
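One way to try the CPU suggestion without editing the training code is to hide the GPUs from the process. The sketch below is only an illustration: the helper names `cpu_only_env` and `run_on_cpu` are hypothetical, and it assumes the script picks its device with something like `torch.cuda.is_available()` rather than hard-coding `.cuda()` calls.

```python
import os
import subprocess
import sys


def cpu_only_env(base_env=None):
    """Return a copy of the environment in which no CUDA device is visible."""
    env = dict(os.environ if base_env is None else base_env)
    # An empty value hides every GPU from the CUDA runtime in the child process.
    env["CUDA_VISIBLE_DEVICES"] = ""
    return env


def run_on_cpu(script_args):
    """Run the Python interpreter with the given arguments and GPUs hidden.

    Returns the exit code of the child process.
    """
    result = subprocess.run([sys.executable, *script_args], env=cpu_only_env())
    return result.returncode
```

For example, `run_on_cpu(["train.py"])` would rerun a training script on the CPU, where `train.py` stands in for whatever script produced the assert. If the script moves tensors to the GPU unconditionally, it will fail for a different reason and the device selection has to be changed by hand.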
Try running the same script with CUDA_LAUNCH_BLOCKING=1 set at the start; it will give us more information about the actual issue.
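CUDA kernels normally launch asynchronously, so the Python stack trace for a device-side assert can point at a later, unrelated line (such as a CPU-to-GPU copy). With `CUDA_LAUNCH_BLOCKING=1`, each launch waits for completion and the trace should land on the call that actually failed. A minimal sketch of setting it from inside the script, assuming it runs before anything initialises CUDA (the function name `enable_blocking_launches` is hypothetical):

```python
import os


def enable_blocking_launches():
    """Ask the CUDA runtime to run kernel launches synchronously in this process."""
    # To be safe, set this before importing torch so it is in place
    # when the CUDA context is created.
    os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


enable_blocking_launches()
# import torch  # import the framework only after the variable is set
```

The same effect can be had by prefixing the launch command with `CUDA_LAUNCH_BLOCKING=1` in the shell. Synchronous launches slow training down noticeably, so this setting is meant for debugging runs only.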
@varshaneya, were you able to resolve the above issue?