Pre-built binaries: CUDA extension unavailable

Open teddykoker opened this issue 7 months ago • 2 comments

Originally posted by @LouisJalouzot in https://github.com/teddykoker/torchsort/issues/90#issuecomment-2878093698

I have been trying out some of the pre-built binaries, but unfortunately, even when the versions of Python, PyTorch and CUDA match exactly, I get the following error:

ImportError: You are trying to use the torchsort CUDA extension, but it looks like it is not available. Make sure you have the CUDA toolchain installed, and reinstall torchsort with `pip install --force-reinstall --no-cache-dir torchsort` to rebuild the extension.

I tried for instance with a machine running on Rocky Linux release 9.5 (Blue Onyx) with CUDA 12.4 with the following environment:

Using Python 3.12.10 environment at: .test
Package                  Version
------------------------ ---------------
filelock                 3.18.0
fsspec                   2025.3.2
jinja2                   3.1.6
markupsafe               3.0.2
mpmath                   1.3.0
networkx                 3.4.2
nvidia-cublas-cu12       12.4.5.8
nvidia-cuda-cupti-cu12   12.4.127
nvidia-cuda-nvrtc-cu12   12.4.127
nvidia-cuda-runtime-cu12 12.4.127
nvidia-cudnn-cu12        9.1.0.70
nvidia-cufft-cu12        11.2.1.3
nvidia-curand-cu12       10.3.5.147
nvidia-cusolver-cu12     11.6.1.9
nvidia-cusparse-cu12     12.3.1.170
nvidia-cusparselt-cu12   0.6.2
nvidia-nccl-cu12         2.21.5
nvidia-nvjitlink-cu12    12.4.127
nvidia-nvtx-cu12         12.4.127
setuptools               80.4.0
sympy                    1.13.1
torch                    2.6.0+cu124
torchsort                0.1.9+pt26cu124
triton                   3.2.0
typing-extensions        4.13.2

teddykoker, May 14 '25 17:05

Reproduced this locally with Ubuntu 22.04, CUDA 12.4, Python 3.10, PyTorch 2.6.0+cu124. Not sure why this is the case as all of the packages seemed to build without errors. Will investigate further, but might not have the bandwidth for a little while @LouisJalouzot.

teddykoker, May 14 '25 17:05

I had to debug this a bit myself, and it seems the pre-built packages lack the compiled CUDA module. Rebuilding it myself ("python setup.py install") creates the corresponding "isotonic_cuda.cpython-312-x86_64-linux-gnu.so", whereas the pre-built wheels contain only the CPU binary:

$ unzip torchsort-0.1.9+pt26cu124-cp312-cp312-linux_x86_64.whl 
Archive:  torchsort-0.1.9+pt26cu124-cp312-cp312-linux_x86_64.whl
  inflating: torchsort/__init__.py   
  inflating: torchsort/isotonic_cpu.cpp  
  inflating: torchsort/isotonic_cpu.cpython-312-x86_64-linux-gnu.so  
  inflating: torchsort/isotonic_cuda.cu  
  inflating: torchsort/ops.py        
  inflating: torchsort-0.1.9+pt26cu124.dist-info/licenses/LICENSE  
  inflating: torchsort-0.1.9+pt26cu124.dist-info/METADATA  
  inflating: torchsort-0.1.9+pt26cu124.dist-info/WHEEL  
  inflating: torchsort-0.1.9+pt26cu124.dist-info/top_level.txt  
  inflating: torchsort-0.1.9+pt26cu124.dist-info/RECORD
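
For anyone who wants to run the same check without unpacking the wheel, here is a minimal Python sketch. The helper names are my own (hypothetical), and it assumes the compiled CUDA module always has "isotonic_cuda" in its file name, as in the locally built .so above.

```python
import zipfile


def compiled_extensions(wheel_path):
    """List the compiled extension modules (.so / .pyd) inside a wheel."""
    # A wheel is a plain zip archive, so zipfile can read its file list.
    with zipfile.ZipFile(wheel_path) as wheel:
        return [name for name in wheel.namelist()
                if name.endswith((".so", ".pyd"))]


def has_cuda_extension(wheel_path):
    """True if the wheel ships a compiled isotonic_cuda module.

    Assumption: the CUDA module's file name contains "isotonic_cuda".
    The .cu source file does not count, since it is not a compiled binary.
    """
    return any("isotonic_cuda" in name
               for name in compiled_extensions(wheel_path))

# Example call (file name taken from the listing above):
# has_cuda_extension("torchsort-0.1.9+pt26cu124-cp312-cp312-linux_x86_64.whl")
```

A wheel with the contents listed above would make has_cuda_extension return False, because only the isotonic_cpu .so is present.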

Random thought, but setup.py only compiles the CUDA binaries if nvcc is on the PATH, and by default CUDA installations don't add it to the PATH...
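
A quick way to test this on a build machine is sketched below. This is not the check that setup.py performs, just an illustration of the idea. The function name is hypothetical, and /usr/local/cuda/bin is only an assumed install location (the usual default on Linux).

```python
import shutil


def locate_nvcc(search_path=None, fallback_dirs=("/usr/local/cuda/bin",)):
    """Return (on_path, location) for the nvcc compiler.

    on_path is True only when nvcc is reachable through the search path
    (the PATH environment variable when search_path is None).
    location is the file found, or None if nvcc was not found anywhere.
    """
    found = shutil.which("nvcc", path=search_path)
    if found:
        return True, found
    # Assumption: these directories are where a CUDA toolkit might live
    # even though it was never added to PATH.
    for directory in fallback_dirs:
        found = shutil.which("nvcc", path=directory)
        if found:
            return False, found
    return False, None


if __name__ == "__main__":
    on_path, location = locate_nvcc()
    if on_path:
        print("nvcc is on the PATH:", location)
    elif location:
        print("nvcc is installed but not on the PATH:", location)
    else:
        print("nvcc was not found")
```

The "installed but not on the PATH" case is the one that would produce a CPU-only build without any error.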

argusdusty, May 22 '25 03:05

@LouisJalouzot apologies for the delay! The pre-built CUDA binaries should be working now for the latest release, v0.1.10. Thanks @argusdusty, nvcc had to be added to the PATH before the build in order to trigger the CUDA compilation.
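
For anyone building from source rather than using the wheels, a rough sketch of the same fix: prepend the CUDA bin directory to the PATH, then run the reinstall command from the error message. The function name is hypothetical and /usr/local/cuda/bin is an assumed location, so adjust it to wherever nvcc lives on your machine.

```python
import os
import sys


def rebuild_command(cuda_bin="/usr/local/cuda/bin", base_env=None):
    """Build the pip command and environment for a rebuild with nvcc visible.

    Nothing is executed here; the caller decides whether to run it.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PATH", "")
    # Put the CUDA bin directory first so its nvcc is the one that is found.
    env["PATH"] = cuda_bin + os.pathsep + existing if existing else cuda_bin
    cmd = [sys.executable, "-m", "pip", "install",
           "--force-reinstall", "--no-cache-dir", "torchsort"]
    return cmd, env

# To actually run the rebuild:
# import subprocess
# cmd, env = rebuild_command()
# subprocess.run(cmd, env=env, check=True)
```

Using sys.executable keeps the reinstall inside the same Python environment that runs the script.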

teddykoker, Jun 09 '25 16:06

Wonderful, thanks a lot @teddykoker! (torchsort-0.1.10+pt26cu126-cp312-cp312-linux_x86_64.whl works on my side)

LouisJalouzot, Jun 10 '25 11:06