Any Support for CUDA version (12.1)?

Open Larescool opened this issue 1 year ago • 11 comments

When setting up pytorch3d-0.7.3, I ran into this error:

The detected CUDA version (12.1) mismatches the version that was used to compile PyTorch (11.8). Please make sure to use the same CUDA versions.

Is there a solution for the newest CUDA version (12.1)?

Larescool avatar Apr 22 '23 16:04 Larescool

Can you explain more about your setup? There is no release of PyTorch with CUDA 12.1. Did you build PyTorch from source? If so, building PyTorch3D from source in the same environment should work.
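
For anyone answering this question about their own setup, it can help to report both versions side by side. This is a minimal sketch, assuming `nvcc` is on `PATH` and prints a `release X.Y` line. The helper names are hypothetical and not part of PyTorch or PyTorch3D, and the build itself may pick a different `nvcc` if `CUDA_HOME` is set.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Return the 'X.Y' CUDA release from `nvcc --version` text, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def report_cuda_versions():
    """Print the CUDA version PyTorch was built with and the one nvcc reports."""
    try:
        import torch  # optional: only needed for the PyTorch side of the report
        torch_cuda = torch.version.cuda  # None for CPU-only builds
    except ImportError:
        torch_cuda = "torch not installed"

    nvcc = shutil.which("nvcc")
    if nvcc is None:
        nvcc_cuda = "nvcc not found on PATH"
    else:
        result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
        nvcc_cuda = parse_nvcc_release(result.stdout)

    print(f"PyTorch built with CUDA: {torch_cuda}")
    print(f"nvcc reports CUDA:       {nvcc_cuda}")


if __name__ == "__main__":
    report_cuda_versions()
```

If the two lines disagree on the major version, that is the situation the error message above describes.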

bottler avatar Apr 23 '23 11:04 bottler

On Arch Linux, the latest packages are:

...Although, I personally use a non-system install of torch inside a virtual environment, which seems to work with my OS without any further changes. pip install torch seems to download the wheel torch-2.0.0-cp311-cp311-manylinux1_x86_64.whl, and everything seems to work without rebuilding specifically for CUDA 12.1. No error when running PyTorch code. Perhaps this is because torch installs the dependency nvidia-cuda-runtime-cu11. This suggests that the system CUDA runtime isn't being used here.

YodaEmbedding avatar May 08 '23 03:05 YodaEmbedding

Installing PyTorch 3D via git still leads to this error:

$ pip install "git+https://github.com/facebookresearch/pytorch3d.git"

Building wheels for collected packages: pytorch3d
  Building wheel for pytorch3d (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [290 lines of output]
      /tmp/pip-req-build-ow6c0gf_/setup.py:84: UserWarning: The environment variable `CUB_HOME` was not found. NVIDIA CUB is required for compilation and can be downloaded from `https://github.com/NVIDIA/cub/releases`. You can unpack it to a location of your choice and set the environment variable `CUB_HOME` to the folder containing the `CMakeListst.txt` file.
        warnings.warn(
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.linux-x86_64-cpython-311
      creating build/lib.linux-x86_64-cpython-311/pytorch3d
      [...]
      copying pytorch3d/datasets/shapenet/shapenet_synset_dict_v1.json -> build/lib.linux-x86_64-cpython-311/pytorch3d/datasets/shapenet
      copying pytorch3d/datasets/r2n2/r2n2_synset_dict.json -> build/lib.linux-x86_64-cpython-311/pytorch3d/datasets/r2n2
      running build_ext
      Traceback (most recent call last):
        [...]
        File "/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 499, in build_extensions
          _check_cuda_version(compiler_name, compiler_version)
        File "/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 386, in _check_cuda_version
          raise RuntimeError(CUDA_MISMATCH_MESSAGE.format(cuda_str_version, torch.version.cuda))
      RuntimeError:
      The detected CUDA version (12.1) mismatches the version that was used to compile
      PyTorch (11.7). Please make sure to use the same CUDA versions.

From what I can tell, this is because of PyTorch being happy with the non-system CUDA 11.7 runtime but unhappy with CUDA 12.1 being used for compilation. Thus, it's not really PyTorch 3D's fault. Possible workarounds:

  • Install CUDA 11.7 and set CUDA_HOME=/path/to/cuda-11.7.
  • Patch torch.utils.cpp_extension._check_cuda_version to ignore this error.
  • Build torch with CUDA 12.1.
  • Use the system site-packages version of PyTorch (which is presumably built with the system version of CUDA 12.1) instead of the virtual environment wheel version.
  • Use preexisting PyTorch 3D for CUDA 12.1 wheels.
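
The first workaround can be scripted. This is a minimal sketch, assuming a matching toolkit is installed under a path such as `/opt/cuda-11.7` (an example location) and that the build honors `CUDA_HOME`. `env_with_cuda_home` is a hypothetical helper, not an existing API.

```python
import os


def env_with_cuda_home(cuda_home, base_env=None):
    """Copy an environment and point CUDA_HOME (and PATH) at one toolkit."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_HOME"] = cuda_home
    # Put this toolkit's nvcc ahead of any system-wide one.
    bin_dir = os.path.join(cuda_home, "bin")
    env["PATH"] = bin_dir + os.pathsep + env.get("PATH", "")
    return env
```

The returned dictionary could then be passed as `env=` to `subprocess.run` when invoking `pip install`, so the change stays local to that one build instead of the whole shell session.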

YodaEmbedding avatar May 08 '23 07:05 YodaEmbedding

Using the system site-packages version of PyTorch 2.0.0 with CUDA 12.1, I get other errors when building PyTorch 3D from git:

$ pip install "git+https://github.com/facebookresearch/pytorch3d.git"
      [...]
      running build_ext
      /home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/lib/python3.11/site-packages/torch/utils/cpp_extension.py:398: UserWarning: There are no g++ version bounds defined for CUDA version 12.1
        warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
      building 'pytorch3d._C' extension
      creating /tmp/pip-req-build-06jphomw/build/temp.linux-x86_64-cpython-311
      creating /tmp/pip-req-build-06jphomw/build/temp.linux-x86_64-cpython-311/tmp
      creating /tmp/pip-req-build-06jphomw/build/temp.linux-x86_64-cpython-311/tmp/pip-req-build-06jphomw
      creating /tmp/pip-req-build-06jphomw/build/temp.linux-x86_64-cpython-311/tmp/pip-req-build-06jphomw/pytorch3d
      creating /tmp/pip-req-build-06jphomw/build/temp.linux-x86_64-cpython-311/tmp/pip-req-build-06jphomw/pytorch3d/csrc
      creating /tmp/pip-req-build-06jphomw/build/temp.linux-x86_64-cpython-311/tmp/pip-req-build-06jphomw/pytorch3d/csrc/ball_query
      [...]
      Emitting ninja build file /tmp/pip-req-build-06jphomw/build/temp.linux-x86_64-cpython-311/build.ninja...
      Compiling objects...
      Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
      [1/67] /opt/cuda/bin/nvcc  -DWITH_CUDA -DTHRUST_IGNORE_CUB_VERSION_CHECK -I/tmp/pip-req-build-06jphomw/pytorch3d/csrc -I/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/lib/python3.11/site-packages/torch/include -I/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/lib/python3.11/site-packages/torch/include/TH -I/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/include -I/usr/include/python3.11 -c -c /tmp/pip-req-build-06jphomw/pytorch3d/csrc/ball_query/ball_query.cu -o /tmp/pip-req-build-06jphomw/build/temp.linux-x86_64-cpython-311/tmp/pip-req-build-06jphomw/pytorch3d/csrc/ball_query/ball_query.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -std=c++14 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1017"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61
      FAILED: /tmp/pip-req-build-06jphomw/build/temp.linux-x86_64-cpython-311/tmp/pip-req-build-06jphomw/pytorch3d/csrc/ball_query/ball_query.o
      /opt/cuda/bin/nvcc  -DWITH_CUDA -DTHRUST_IGNORE_CUB_VERSION_CHECK -I/tmp/pip-req-build-06jphomw/pytorch3d/csrc -I/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/lib/python3.11/site-packages/torch/include -I/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/lib/python3.11/site-packages/torch/include/TH -I/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/include -I/usr/include/python3.11 -c -c /tmp/pip-req-build-06jphomw/pytorch3d/csrc/ball_query/ball_query.cu -o /tmp/pip-req-build-06jphomw/build/temp.linux-x86_64-cpython-311/tmp/pip-req-build-06jphomw/pytorch3d/csrc/ball_query/ball_query.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -std=c++14 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1017"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61
      /usr/include/stdlib.h(141): error: identifier "_Float32" is undefined
        extern _Float32 strtof32 (const char *__restrict __nptr,
               ^

      /usr/include/stdlib.h(147): error: identifier "_Float64" is undefined
        extern _Float64 strtof64 (const char *__restrict __nptr,
               ^

      /usr/include/stdlib.h(153): error: identifier "_Float128" is undefined
        extern _Float128 strtof128 (const char *__restrict __nptr,
               ^

That's a bunch of missing types, which suggests the warning may be relevant.

UserWarning: There are no g++ version bounds defined for CUDA version 12.1
$ g++ --version | head -n 1
g++ (GCC) 13.1.1 20230429

But the max g++ version for CUDA 12.0 is g++ 12.1. (Not sure about CUDA 12.1.) So presumably, a downgrade of g++ may help...
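
The comparison above can be written out as a small check. This is a sketch only: the `(12, 1)` bound is the figure quoted for CUDA 12.0 in this post, not a verified limit for CUDA 12.1, and both function names are hypothetical.

```python
import re


def parse_gxx_version(first_line):
    """Extract (major, minor, patch) from the first line of `g++ --version`."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", first_line)
    if match is None:
        raise ValueError(f"no version found in: {first_line!r}")
    return tuple(int(part) for part in match.groups())


def exceeds_max(version, max_supported):
    """True if `version` is newer than `max_supported`, compared on its length."""
    return version[: len(max_supported)] > max_supported


# Values taken from this thread; (12, 1) is the bound quoted for CUDA 12.0.
installed = parse_gxx_version("g++ (GCC) 13.1.1 20230429")
print(installed, exceeds_max(installed, (12, 1)))  # (13, 1, 1) True
```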

Luckily, I have an older Python 3.10 virtual environment with PyTorch 3D installed, so I might just use that instead of going further down the rabbit hole...

YodaEmbedding avatar May 08 '23 09:05 YodaEmbedding

I switched to a different version of CUDA to work around this problem.

Larescool avatar May 26 '23 01:05 Larescool

I hit the same error when building PyTorch myself, without conda, on Arch Linux.

a downgrade of g++ may help...

As suggested by YodaEmbedding, I also think setting older versions of CC and CXX may fix this problem:

$ export CC=/usr/bin/gcc-11
$ export CXX=/usr/bin/g++-12
$ python setup.py build

I succeeded in building it myself on Arch Linux with the above hack.
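
Versioned compiler binaries are named differently across distributions, so it may be worth checking which ones exist before exporting `CC` and `CXX`. This is a minimal sketch: the candidate names are examples and `first_available` is a hypothetical helper.

```python
import shutil


def first_available(candidates, which=shutil.which):
    """Return the full path of the first candidate found on PATH, else None."""
    for name in candidates:
        path = which(name)
        if path is not None:
            return path
    return None


# Candidate names are examples; adjust them to what your distribution ships.
cc = first_available(["gcc-11", "gcc-10"])
cxx = first_available(["g++-11", "g++-10"])
print(f"CC={cc} CXX={cxx}")
```

The `which` parameter exists only so the lookup can be swapped out for testing. In practice it is probably safest to pick a C and C++ compiler from the same release.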

3tty0n avatar Jun 03 '23 16:06 3tty0n

Here's what I did to get things working on Arch Linux:

# Install PyTorch 1.13.1:
pip install --force-reinstall torch==1.13.1 torchvision==0.14.1

# Install CUDA 11.7:
paru -S cuda-11.7

# Install gcc10:
gpg --recv-keys 6C35B99309B5FA62  # expired keys from <2019 for gcc10
paru -S gcc10  # --chroot (optional, but may fix some issues)

# Download CUB:
(cd /tmp/ &&
  wget https://github.com/NVIDIA/cub/archive/refs/tags/2.1.0.tar.gz -O cub-2.1.0.tar.gz &&
  tar xf cub-2.1.0.tar.gz
)

export CUDA_HOME=/opt/cuda-11.7
export CC=/usr/bin/gcc-10
export CXX=/usr/bin/g++-10
export CUB_HOME=/tmp/cub-2.1.0

pip install "git+https://github.com/facebookresearch/pytorch3d.git"

Note that building (compiling+testing) gcc10 took around 10 hours on my i5 6500.

Also, I wrote the cuda-11.7 PKGBUILD based on the cuda-11.1 PKGBUILD from AUR, so there may (or may not) be issues with it.
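
Since a failed build only surfaces a wrong path after several minutes, a quick preflight check on the four exported variables may save time. This is a sketch under the assumption that each variable should name an existing file or directory. `missing_build_paths` is a hypothetical helper.

```python
import os

BUILD_PATH_VARS = ("CUDA_HOME", "CC", "CXX", "CUB_HOME")


def missing_build_paths(env):
    """Return the variables from BUILD_PATH_VARS that are unset or point nowhere."""
    return [
        name
        for name in BUILD_PATH_VARS
        if not env.get(name) or not os.path.exists(env[name])
    ]


problems = missing_build_paths(os.environ)
if problems:
    print("Fix before building:", ", ".join(problems))
else:
    print("All build paths exist.")
```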

YodaEmbedding avatar Jun 19 '23 00:06 YodaEmbedding

I had the same issue and found another solution.

When installing from a local clone, before running pip install -e ., go into the setup.py file in the pytorch3d dir and replace c++14 with c++17 in line 52 (extra_compile_args = {"cxx": ["-std=c++17"]}) and line 77 (nvcc_args.append("-std=c++17")).

Running pip will now compile everything using c++17. I tried a few functions and could not find any unwanted behavior, though ymmv.
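
The same edit can be made without relying on line numbers, which may shift between commits. This is a minimal sketch, assuming the flags appear literally as `-std=c++14` in `setup.py`. `bump_cxx_standard` is a hypothetical helper, and the sample text only mimics the two lines mentioned above.

```python
import re


def bump_cxx_standard(setup_source, old="c++14", new="c++17"):
    """Replace every `-std=<old>` flag with `-std=<new>`; return (text, count)."""
    pattern = re.compile(r"-std=" + re.escape(old) + r"\b")
    return pattern.subn("-std=" + new, setup_source)


# Two lines shaped like the ones described above, before the edit.
sample = (
    'extra_compile_args = {"cxx": ["-std=c++14"]}\n'
    'nvcc_args.append("-std=c++14")\n'
)
patched, count = bump_cxx_standard(sample)
print(count)  # 2
print(patched, end="")
```

To apply it to a real clone, read `setup.py` into a string, pass it through the function, and write the result back only if the count is what you expect.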

DKatz96 avatar Aug 01 '23 00:08 DKatz96

Thanks @DKatz96. I can confirm that c++17 is now the default in a fresh local clone, so the installation instructions given by pytorch3d work as written. The following solved the problem in my case:

git clone https://github.com/facebookresearch/pytorch3d.git
cd pytorch3d && pip install -e .

danielajisafe avatar Dec 19 '23 16:12 danielajisafe

This worked for me on Ubuntu with everything on CUDA 12.1:

pip install "git+https://github.com/facebookresearch/pytorch3d.git"
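
After an install like this, a quick way to see whether the compiled extension was actually built is to look for it by name. This is a sketch only: the module name `pytorch3d._C` is taken from the build log earlier in this thread, and `module_available` is a hypothetical helper.

```python
import importlib.util


def module_available(name):
    """True if `name` can be located (parent packages are imported to do so)."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package is missing or a module has no spec.
        return False


for name in ("torch", "pytorch3d", "pytorch3d._C"):
    print(f"{name}: {'found' if module_available(name) else 'missing'}")
```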

relh avatar Feb 22 '24 15:02 relh

You can try my repository, which builds packages and provides a PyPI simple index, and see if it works for you: https://github.com/facebookresearch/pytorch3d/discussions/1752

MiroPsota avatar Mar 07 '24 12:03 MiroPsota