semantic-segmentation-pytorch

CUDA driver version is insufficient for CUDA runtime version

Open KowalskiWang opened this issue 4 years ago • 3 comments

After I ran ./demo_test.sh, it returned:

[2020-04-22 20:12:02,131 INFO test.py line 172 28455] Loaded configuration file config/ade20k-resnet50dilated-ppm_deepsup.yaml
[2020-04-22 20:12:02,131 INFO test.py line 173 28455] Running with config:
DATASET:
  imgMaxSize: 1000
  imgSizes: (300, 375, 450, 525, 600)
  list_train: ./data/training.odgt
  list_val: ./data/validation.odgt
  num_class: 150
  padding_constant: 8
  random_flip: True
  root_dataset: ./data/
  segm_downsampling_rate: 8
DIR: ade20k-resnet50dilated-ppm_deepsup
MODEL:
  arch_decoder: ppm_deepsup
  arch_encoder: resnet50dilated
  fc_dim: 2048
  weights_decoder:
  weights_encoder:
TEST:
  batch_size: 1
  checkpoint: epoch_20.pth
  result: ./
TRAIN:
  batch_size_per_gpu: 2
  beta1: 0.9
  deep_sup_scale: 0.4
  disp_iter: 20
  epoch_iters: 5000
  fix_bn: False
  lr_decoder: 0.02
  lr_encoder: 0.02
  lr_pow: 0.9
  num_epoch: 20
  optim: SGD
  seed: 304
  start_epoch: 0
  weight_decay: 0.0001
  workers: 16
VAL:
  batch_size: 1
  checkpoint: epoch_20.pth
  visualize: False
THCudaCheck FAIL file=torch/csrc/cuda/Module.cpp line=32 error=35 : CUDA driver version is insufficient for CUDA runtime version
Traceback (most recent call last):
  File "test.py", line 198, in <module>
    main(cfg, args.gpu)
  File "test.py", line 95, in main
    torch.cuda.set_device(gpu)
  File "/home/kai/anaconda3/envs/hrnet/lib/python3.6/site-packages/torch/cuda/__init__.py", line 262, in set_device
    torch._C._cuda_setDevice(device)
RuntimeError: cuda runtime error (35) : CUDA driver version is insufficient for CUDA runtime version at torch/csrc/cuda/Module.cpp:32

My environment: PyTorch 0.4.1, cuda80, Python 3.6. Can anyone help me out?

KowalskiWang • Apr 23 '20 04:04

Running into the same issue:

DATASET:
  imgMaxSize: 1000
  imgSizes: (300, 375, 450, 525, 600)
  list_train: ./data/training.odgt
  list_val: ./data/validation.odgt
  num_class: 150
  padding_constant: 8
  random_flip: True
  root_dataset: ./data/
  segm_downsampling_rate: 8
DIR: ade20k-resnet50dilated-ppm_deepsup
MODEL:
  arch_decoder: ppm_deepsup
  arch_encoder: resnet50dilated
  fc_dim: 2048
  weights_decoder:
  weights_encoder:
TEST:
  batch_size: 1
  checkpoint: epoch_20.pth
  result: ./
TRAIN:
  batch_size_per_gpu: 2
  beta1: 0.9
  deep_sup_scale: 0.4
  disp_iter: 20
  epoch_iters: 5000
  fix_bn: False
  lr_decoder: 0.02
  lr_encoder: 0.02
  lr_pow: 0.9
  num_epoch: 20
  optim: SGD
  seed: 304
  start_epoch: 0
  weight_decay: 0.0001
  workers: 16
VAL:
  batch_size: 1
  checkpoint: epoch_20.pth
  visualize: False
Loading weights for net_encoder
Loading weights for net_decoder
  0%| | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
  File ".\test.py", line 198, in <module>
    main(cfg, args.gpu)
  File ".\test.py", line 128, in main
    test(segmentation_module, loader_test, gpu)
  File ".\test.py", line 79, in test
    scores = scores + pred_tmp / len(cfg.DATASET.imgSizes)
RuntimeError: CUDA out of memory. Tried to allocate 1.16 GiB (GPU 0; 6.00 GiB total capacity; 3.67 GiB already allocated; 716.63 MiB free; 3.69 GiB reserved in total by PyTorch)

shahidammer • Apr 29 '20 10:04

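You need to reinstall the NVIDIA driver and CUDA, and check their compatibility while installing. The CUDA runtime version (the one your PyTorch build uses, cuda80 in your case) must be less than or equal to the highest CUDA version the installed driver supports. You can check this with nvidia-smi: after the driver is installed, recent versions of nvidia-smi show that CUDA version next to Driver Version, and that tells you which CUDA version you can install.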
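To make the check above concrete, here is a minimal sketch (an illustration, not something posted in the thread). It assumes that nvidia-smi prints a "CUDA Version: X.Y" field in its header, which older drivers may not do, and that torch.version.cuda holds the runtime version PyTorch was built with. The helper names parse_version, driver_cuda_version, runtime_is_supported and report are hypothetical.

```python
import re
import shutil
import subprocess


def parse_version(text):
    """Turn '10.2' or '8.0.61' into a comparable (major, minor) tuple."""
    parts = text.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor)


def driver_cuda_version(smi_output):
    """Extract the 'CUDA Version: X.Y' field from nvidia-smi output.

    Returns None when the field is missing (assumed to happen with
    older drivers that do not print it).
    """
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", smi_output)
    return match.group(1) if match else None


def runtime_is_supported(runtime, driver_max):
    """True if the runtime version does not exceed what the driver supports."""
    return parse_version(runtime) <= parse_version(driver_max)


def report():
    """Print both versions on the current machine (needs torch and nvidia-smi)."""
    import torch  # imported here so the helpers above work without PyTorch

    runtime = torch.version.cuda  # None for CPU-only builds
    smi = shutil.which("nvidia-smi")
    if runtime is None or smi is None:
        print("CPU-only PyTorch build or nvidia-smi not found")
        return
    out = subprocess.run([smi], stdout=subprocess.PIPE,
                         universal_newlines=True).stdout
    driver_max = driver_cuda_version(out)
    if driver_max is None:
        print("this nvidia-smi does not report a CUDA version")
        return
    status = "ok" if runtime_is_supported(runtime, driver_max) else "driver too old"
    print("runtime", runtime, "| driver supports up to", driver_max, "|", status)
```

Calling report() on the affected machine should show whether the driver is older than the runtime. If it prints "driver too old", either update the driver or install a PyTorch build for an older CUDA version.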

pranay731 • Jul 23 '20 07:07

This is an out-of-memory error, which is different from the error above. I think a restart would most probably solve your problem.
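If a restart does not help, a rough size estimate may explain why. The sketch below is an illustration, not taken from the thread. It assumes the scores tensor in the traceback is a float32 tensor of shape (1, num_class, height, width), which is an assumption about test.py, and the 1440 x 1440 image size is purely hypothetical. The function name scores_gib is hypothetical as well.

```python
def scores_gib(num_class, height, width, bytes_per_value=4):
    """Approximate size in GiB of a (1, num_class, height, width) tensor.

    bytes_per_value=4 corresponds to float32.
    """
    return num_class * height * width * bytes_per_value / 2 ** 30


# num_class is 150 in the posted config; the image size is a made-up example.
size = scores_gib(num_class=150, height=1440, width=1440)
print("about %.2f GiB per score tensor" % size)
```

Under these assumptions a single score tensor for a large image is already on the order of a gigabyte, so on a 6 GiB GPU it may be worth testing with smaller images as well as restarting.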

pranay731 • Jul 23 '20 07:07