PoinTr inference example with pretrained model: CUDA kernel failed : no kernel image is available for execution on the device

After setting up the required environment, I ran the command python tools/inference.py cfgs/PCN_models/AdaPoinTr.yaml ckpts/AdaPoinTr_PCN.pth --pc_root demo/ --save_vis_img --out_pc_root inference_result/

Then it returns:

2024-08-03 13:13:59,386 - MODEL - INFO - Transformer with config {'NAME': 'AdaPoinTr', 'num_query': 512, 'num_points': 16384, 'center_num': [512, 256], 'global_feature_dim': 1024, 'encoder_type': 'graph', 'decoder_type': 'fc', 'encoder_config': {'embed_dim': 384, 'depth': 6, 'num_heads': 6, 'k': 8, 'n_group': 2, 'mlp_ratio': 2.0, 'block_style_list': ['attn-graph', 'attn', 'attn', 'attn', 'attn', 'attn'], 'combine_style': 'concat'}, 'decoder_config': {'embed_dim': 384, 'depth': 8, 'num_heads': 6, 'k': 8, 'n_group': 2, 'mlp_ratio': 2.0, 'self_attn_block_style_list': ['attn-graph', 'attn', 'attn', 'attn', 'attn', 'attn', 'attn', 'attn'], 'self_attn_combine_style': 'concat', 'cross_attn_block_style_list': ['attn-graph', 'attn', 'attn', 'attn', 'attn', 'attn', 'attn', 'attn'], 'cross_attn_combine_style': 'concat'}}
using group version 2
Loading weights from ckpts/AdaPoinTr_PCN.pth...
ckpts @ 353 epoch( performance = {'F-Score': 0.8446799506656607, 'CDL1': 6.527985830404605, 'CDL2': 0.19307194130320907, 'EMDistance': 0.0})
CUDA kernel failed : no kernel image is available for execution on the device
void furthest_point_sampling_kernel_wrapper(int, int, int, const float*, float*, int*) at L:228 in /home/mi/Pointnet2_PyTorch/pointnet2_ops_lib/pointnet2_ops/_ext-src/src/sampling_gpu.cu

Aug 03 '24 05:08 Kitsch123456

I'm running the code on an RTX 3060 Laptop GPU with gcc 9, torch-1.7.1+cu110, torchaudio-0.7.2, and torchvision-0.13.1+cu113.

Aug 03 '24 05:08 Kitsch123456

I used the same command as you and got the same error. Here’s what I did.

Best Solution (reference here):

Go to the file /Pointnet2_PyTorch/pointnet2_ops_lib/pointnet2_ops/_ext-src/src/sampling_gpu.cu
Comment out all the lines with CUDA_CHECK_ERRORS(); (there are 3 places)
Run python3 setup.py install again in the pointnet2_ops_lib folder

Second Solution (reference here): (I've tried this, but it did not work for me)

Go to the file /Pointnet2_PyTorch/pointnet2_ops_lib/setup.py
Change the line os.environ["TORCH_CUDA_ARCH_LIST"] = "3.7+PTX;5.0;6.0;6.1;6.2;7.0;7.5" to os.environ["TORCH_CUDA_ARCH_LIST"] = "5.0;6.0;6.1;6.2;7.0;7.5;8.0;8.6;8.7;8.9;9.0", or just add the compute capability of your specific GPU (see this list). In my case I use an A100, so it's 8.0.
Run python3 setup.py install again in the pointnet2_ops_lib folder
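
If you would rather query the value than look it up, a minimal sketch like the one below may help. It assumes PyTorch is installed with CUDA support and can see your GPU; format_arch and detect_arch are hypothetical helper names of my own, not part of pointnet2_ops.

```python
def format_arch(capability):
    """Turn a (major, minor) compute capability into an arch-list entry such as '8.6'."""
    major, minor = capability
    return f"{major}.{minor}"


def detect_arch(device_index=0):
    """Return the arch entry for a visible GPU, or None if it cannot be queried."""
    try:
        import torch  # assumption: PyTorch with CUDA support is installed
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    # get_device_capability returns a (major, minor) tuple for the given device
    return format_arch(torch.cuda.get_device_capability(device_index))


if __name__ == "__main__":
    arch = detect_arch()
    if arch is None:
        print("No CUDA device visible to PyTorch")
    else:
        # The printed line is what would go into setup.py
        print(f'os.environ["TORCH_CUDA_ARCH_LIST"] = "{arch}"')
```

The printed entry can then be used in setup.py on its own or appended to the longer list above.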

The problem comes from the pointnet2_ops library, as shown in your output here:

void furthest_point_sampling_kernel_wrapper(int, int, int, const float*, float*, int*) at L:228 in /home/mi/Pointnet2_PyTorch/pointnet2_ops_lib/pointnet2_ops/_ext-src/src/sampling_gpu.cu

The Pointnet2_PyTorch library has not been maintained since July 31, 2021, as the contributor mentioned here.

Aug 17 '24 17:08 Nineyoyoyo

Hello friend, I also ran into this issue while creating a Dockerfile containing AdaPoinTr. In the end, I found that the problem is the one @Nineyoyoyo described in the "Second Solution". "TORCH_CUDA_ARCH_LIST" tells the build which NVIDIA GPU compute capabilities to compile the CUDA kernels for.
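
To make this concrete, here is a simplified sketch of how each entry in the arch list can turn into nvcc -gencode flags. It only approximates what PyTorch's extension builder does (the real logic handles more cases, such as named architectures and validation), and gencode_flags is a hypothetical helper name used for illustration.

```python
def gencode_flags(arch_list):
    """Map an arch list such as '7.5;8.6+PTX' to nvcc -gencode flags (simplified)."""
    flags = []
    for entry in arch_list.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        want_ptx = entry.endswith("+PTX")
        version = entry[: -len("+PTX")] if want_ptx else entry
        num = version.replace(".", "")  # '8.6' -> '86'
        # Native binary (SASS) for exactly this compute capability
        flags.append(f"-gencode=arch=compute_{num},code=sm_{num}")
        if want_ptx:
            # Also embed PTX for this compute capability
            flags.append(f"-gencode=arch=compute_{num},code=compute_{num}")
    return flags


for flag in gencode_flags("7.5;8.6"):
    print(flag)
# -gencode=arch=compute_75,code=sm_75
# -gencode=arch=compute_86,code=sm_86
```

A GPU can only run a native binary built for its own sm_XX target, which is why the entry for your card has to be in the list.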

Pointnet2_PyTorch has not been maintained since July 31, 2021, and in Pointnet2_PyTorch/pointnet2_ops_lib/setup.py, TORCH_CUDA_ARCH_LIST is set to "3.7+PTX;5.0;6.0;6.1;6.2;7.0;7.5", which does not include your RTX 3060 Laptop's compute capability, 8.6. So even though the kernels build, you cannot run them on your 3060!
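
As a rough illustration of that mismatch, the sketch below checks whether an arch list contains a native entry for a given compute capability. It deliberately ignores any +PTX fallback and only looks for an exact match; native_archs and has_native_kernel are hypothetical helper names.

```python
def native_archs(arch_list):
    """Return the set of compute capabilities that get a native binary."""
    return {
        entry.strip().replace("+PTX", "")
        for entry in arch_list.split(";")
        if entry.strip()
    }


def has_native_kernel(arch_list, capability):
    """True if the list has an exact entry for the capability, e.g. '8.6'."""
    return capability in native_archs(arch_list)


OLD = "3.7+PTX;5.0;6.0;6.1;6.2;7.0;7.5"              # value shipped in setup.py
NEW = "5.0;6.0;6.1;6.2;7.0;7.5;8.0;8.6;8.7;8.9;9.0"  # value suggested in this thread

print(has_native_kernel(OLD, "8.6"))  # False
print(has_native_kernel(NEW, "8.6"))  # True
```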

So the solution is, as @Nineyoyoyo said, to change the value to "5.0;6.0;6.1;6.2;7.0;7.5;8.0;8.6;8.7;8.9;9.0" or even just "8.0;8.6;8.7;8.9;9.0".

My GPU is a 4070tis, and everything has been running well since this change.

Sep 28 '24 07:09 Yiju1213

Hi, I originally got this same error and tried the fixes mentioned above, but they didn't work. The error now appears at a different place instead of furthest point sampling:

pointnetpp/pointnet2_utils.py", line 104, in forward
    return _ext.gather_points(features, idx)
RuntimeError: CUDA error: no kernel image is available for execution on the device

Pointnet builds and imports without any errors but does not execute. I'm on CUDA 12.3, Python 3.9.2, torch 2.1.2+cu121.

Nov 21 '24 18:11 HaochenZ11