Pointnet2_PyTorch
CUDA kernel failed: invalid device function
Running train or test produces the following error:
Validation sanity check: 0it [00:00, ?it/s]CUDA kernel failed : invalid device function
void furthest_point_sampling_kernel_wrapper(int, int, int, const float*, float*, int*) at L:228 in pointnet2_ops/_ext-src/src/sampling_gpu.cu
I have previously run training and testing successfully on this machine and cannot work out why it now fails; I do not think I have changed anything in the environment.
Output of nvidia-smi:
Sun Jul 5 13:14:30 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.40.04 Driver Version: 418.40.04 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla V100-SXM2... On | 00000000:00:1E.0 Off | 0 |
| N/A 36C P0 24W / 300W | 0MiB / 16130MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
import torch
torch.version.cuda
> 10.1
torch.cuda.is_available()
> True
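For reference, a quick way to see what the installed PyTorch build targets versus what the GPU expects (a minimal sketch; torch.cuda.get_arch_list() only exists in newer PyTorch releases, hence the guard):
import torch
print(torch.version.cuda)                   # CUDA version the PyTorch wheel was built against
print(torch.cuda.get_device_name(0))        # e.g. Tesla V100-SXM2-16GB
print(torch.cuda.get_device_capability(0))  # e.g. (7, 0), i.e. sm_70
if hasattr(torch.cuda, "get_arch_list"):    # not available in older PyTorch releases
    print(torch.cuda.get_arch_list())       # architectures the wheel was compiled for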
Versions ($ conda list):
pointnet2 3.0.0 <pip>
pointnet2-ops 3.0.0 <pip>
python 3.7.7 hcff3b4d_5
pytorch-lightning 0.7.6 <pip>
(I have also tried other versions of pytorch-lightning: 0.7.1 and 0.8.4.)
Any help greatly appreciated.
Please make sure your nvcc version is also CUDA 10.1. You can check with nvcc --version.
Thanks for the reply. I have:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
I guess this is the problem?
Yeah, things should work if you either install the CUDA 10.0 build of PyTorch or get the 10.1 compilation tools.
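To make that consistency check explicit, something along these lines compares the CUDA version PyTorch was built against with the release line nvcc reports (a minimal sketch; capture_output needs Python 3.7+):
import subprocess
import torch
nvcc_out = subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout
nvcc_release = [line for line in nvcc_out.splitlines() if "release" in line]
print("PyTorch built against CUDA:", torch.version.cuda)   # e.g. 10.1
print("nvcc reports:", nvcc_release)                        # must match the same major.minor
# the extension has to be compiled with the same CUDA major.minor as the PyTorch wheel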
I installed the 10.1 compilation tools with $ conda install cudatoolkit-dev -c conda-forge, so I now have:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
I also reinstalled the pointnet2-ops package with
$ pip install --user --force-reinstall --ignore-installed --no-binary :all: pointnet2_ops_lib
but I am still getting the same error.
I have also tried installing the cu100 build of PyTorch using $ conda install pytorch torchvision cudatoolkit=10.0 -c pytorch, but I am still getting the same error.
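Since that pip command used --user, it is also worth confirming that Python is importing the freshly rebuilt extension rather than a stale copy in another site-packages directory (a sketch; the _ext module name is inferred from the _ext-src path in the error message):
import pointnet2_ops
print(pointnet2_ops.__file__)    # which installation is actually on sys.path
from pointnet2_ops import _ext   # the compiled CUDA extension
print(_ext.__file__)             # should point at the freshly built .so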
I met the same problem mentioned here. The solution I took was to reinstall PyTorch, downgrading to version 1.4 with a consistent CUDA version of 10.0 (the same as the version reported by nvcc -V), and then reinstall the pointnet2-ops package. After that, the error was gone.
Python version: 3.6.12, PyTorch version: 1.4.0, CUDA version: 10.0
conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.0 -c pytorch
Hope my solution can help someone encountering this same issue.
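After the reinstall, a quick smoke test along these lines (a sketch, assuming the pointnet2_ops.pointnet2_utils layout from this repo) exercises the kernel that originally failed:
import torch
from pointnet2_ops import pointnet2_utils
xyz = torch.rand(2, 1024, 3, device="cuda")           # (B, N, 3) point cloud
idx = pointnet2_utils.furthest_point_sample(xyz, 64)  # the op that raised "invalid device function"
print(idx.shape)                                      # expected: torch.Size([2, 64])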
I tried this method and it works correctly, thanks! I think this bug may result from the cudatoolkit version: it seems that cudatoolkit=10.0 works but cudatoolkit=10.1 does not.
I have the same problem. Which versions of CUDA, nvcc, and torch should I use?
nvidia-smi
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.07             Driver Version: 535.161.07   CUDA Version: 12.2      |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3090        Off | 00000000:01:00.0  On |                  N/A |
| 32%   29C    P8              27W / 350W |   1199MiB / 24576MiB |      5%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                             |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:02:13_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0
**** please refer to https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html for the Minimum Required Driver Version for CUDA Minor Version Compatibility ****
driver version is 535.161.07
sys.version : 3.9.18 (main, Sep 11 2023, 13:41:44) [GCC 11.2.0]
torch version : 2.3.0.dev20231227
installed cuda version : 12.1
CUDA Compute Capability: 8.6
Microarchitecture Name: Ampere (3090, cuda >= 11.1, driver >= 455.32)
pytorch compiled for : ['sm_50', 'sm_60', 'sm_61', 'sm_70', 'sm_75', 'sm_80', 'sm_86', 'sm_90']
torch.cuda.is_available : True
torch.backends.cudnn.enabled : True
torch.cuda.get_device_properties(device) : _CudaDeviceProperties(name='NVIDIA GeForce RTX 3090', major=8, minor=6, total_memory=24237MB, multi_processor_count=82)
SYSTEM CUDA_PATH: None
LD_LIBRARY_PATH: /root/Workspace/hdl_loc/devel/lib:/root/Workspace/ws_livox/devel/lib:/opt/ros/noetic/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
torch.tensor([1.0, 2.0]).cuda() : tensor([1., 2.], device='cuda:0')
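For an sm_86 card like the 3090, the PyTorch wheel above already ships sm_86 kernels, so the remaining suspect is the pointnet2_ops extension itself: if it was built for an architecture list that stops short of sm_86, its kernels cannot launch on Ampere and you get exactly this "invalid device function" error. A minimal sketch of a rebuild targeted at the installed GPU (assumptions: TORCH_CUDA_ARCH_LIST is honoured at build time by torch.utils.cpp_extension, and the pointnet2_ops_lib/ path matches the pip command used earlier in this thread; if the package's setup.py hardcodes its own arch list, that line has to be edited instead):
import os
import subprocess
import torch
major, minor = torch.cuda.get_device_capability(0)       # (8, 6) on an RTX 3090
os.environ["TORCH_CUDA_ARCH_LIST"] = f"{major}.{minor}"   # build kernels for this GPU only
subprocess.check_call(
    ["pip", "install", "--force-reinstall", "--no-binary", ":all:", "pointnet2_ops_lib/"]
)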