
"CUDA error: No kernel image" still exists after reinstalling torch-points-kernels

Open maosuli opened this issue 2 years ago • 0 comments

Hi,

I have to compile the "torch-points-kernels" library on my workstation and then run the code on a remote server using the same conda environment.

The "CUDA error" happened after I submitted the job to the remote server although I could run the code well in my workstation.

Following your solution, I uninstalled the library, cleared the cache, set TORCH_CUDA_ARCH_LIST, and reinstalled the library on my workstation.

But the same error still happened.

I checked the two GPU cards, which are a Quadro RTX 6000 (Turing, SM 75) and a Tesla V100 (Volta, SM 70), respectively. I set 'export TORCH_CUDA_ARCH_LIST="7.0;7.5"' before reinstalling the library.
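For completeness, here is a small diagnostic sketch (my own check using standard PyTorch calls, not anything from torch-points-kernels) that can be run on each machine to compare the GPU's compute capability against the architectures compiled into the installed PyTorch binary:

```python
# Diagnostic sketch: compare each GPU's compute capability with the
# architectures baked into the installed PyTorch build.
import torch

for i in range(torch.cuda.device_count()):
    major, minor = torch.cuda.get_device_capability(i)
    print(f"GPU {i}: {torch.cuda.get_device_name(i)} (sm_{major}{minor})")

# Architectures the PyTorch binary was compiled for,
# e.g. ['sm_37', 'sm_50', 'sm_60', 'sm_70', 'sm_75'].
print("PyTorch arch list:", torch.cuda.get_arch_list())
```

If 'sm_70' is missing from the arch list reported on the server, that would explain the error on the V100.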

The error details are as follows:

Traceback (most recent call last):
  File "train_s_stransformer.py", line 613, in <module>
    main()
  File "train_s_stransformer.py", line 92, in main
    main_worker(args.train_gpu, args.ngpus_per_node, args)
  File "train_s_stransformer.py", line 327, in main_worker
    loss_train, mIoU_train, mAcc_train, allAcc_train = train(train_loader, model, criterion, optimizer, epoch, scaler, scheduler)
  File "train_s_stransformer.py", line 426, in train
    output = model(feat, coord, offset, batch, neighbor_idx)
  File "/home/xxx/.conda/envs/s_transformer10/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/xxx/.conda/envs/s_transformer10/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 166, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/xxx/.conda/envs/s_transformer10/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/xxx/3dSegmentation/stratified_transformer/Stratified-Transformer-main/model/stratified_transformer.py", line 453, in forward
    feats, xyz, offset, feats_down, xyz_down, offset_down = layer(feats, xyz, offset)
  File "/home/xxx/.conda/envs/s_transformer10/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/xxx/3dSegmentation/stratified_transformer/Stratified-Transformer-main/model/stratified_transformer.py", line 281, in forward
    v2p_map, p2v_map, counts = grid_sample(xyz, batch, window_size, start=None)
  File "/home/xxx/3dSegmentation/stratified_transformer/Stratified-Transformer-main/model/stratified_transformer.py", line 59, in grid_sample
    unique, cluster, counts = torch.unique(cluster, sorted=True, return_inverse=True, return_counts=True)
  File "/home/xxx/.conda/envs/s_transformer10/lib/python3.7/site-packages/torch/_jit_internal.py", line 421, in fn
    return if_true(*args, **kwargs)
  File "/home/xxx/.conda/envs/s_transformer10/lib/python3.7/site-packages/torch/_jit_internal.py", line 421, in fn
    return if_true(*args, **kwargs)
  File "/home/xxx/.conda/envs/s_transformer10/lib/python3.7/site-packages/torch/functional.py", line 769, in _unique_impl
    return_counts=return_counts,
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
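From the traceback, the failure surfaces in a plain torch.unique call on a CUDA tensor (inside torch.functional), not inside a torch-points-kernels op. A minimal, hypothetical reproduction like the one below (not taken from my training script) should hit the same code path and may help isolate whether the base PyTorch build itself is missing kernels for the server's GPU:

```python
# Hypothetical minimal reproduction: a plain torch.unique on a CUDA tensor,
# to check whether the "no kernel image" error appears outside of
# torch-points-kernels as well.
import torch

cluster = torch.randint(0, 100, (10000,), device="cuda")
unique, inverse, counts = torch.unique(
    cluster, sorted=True, return_inverse=True, return_counts=True
)
print(unique.shape, inverse.shape, counts.shape)
```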

Please give me some advice on how to resolve this error.

Best,

Eric.

maosuli · Mar 26 '23 07:03