Pytorch-Correlation-extension
IndexError: list index out of range
Hi, I am building a Dockerfile for an app that depends on the Pytorch correlation extension. I get this error whether I build the extension manually or install it with pip. The only thing that changes the outcome is the CUDA version: with CUDA 12 I get the IndexError above, but with CUDA 10 I get the error No such file or directory: '/usr/local/cuda/bin/nvcc'.
This is the Dockerfile:
# Use an NVIDIA CUDA base image
#FROM nvcr.io/nvidia/cuda:10.2-cudnn7-runtime-ubi8
FROM nvidia/cuda:12.1.0-devel-ubi8
# Install essential utilities
RUN dnf install -y \
git \
wget \
curl \
cmake \
gcc-c++ \
python3-devel \
python3-pip \
protobuf-compiler \
&& dnf clean all
#&& rm -rf /var/lib/apt/lists/*
# Set the working directory in the container
WORKDIR /workdir
# Download and install Miniconda
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O /miniconda.sh \
&& bash /miniconda.sh -b -p /miniconda \
&& rm /miniconda.sh
# Add Miniconda to PATH
ENV PATH="/miniconda/bin:${PATH}"
# Create Conda environment and install ONNX
RUN conda create --name=onnx_py37 python=3.7.2 protobuf=3.6.1 pybind11=2.2 numpy scipy pytest \
&& echo "source activate onnx_py37" > ~/.bashrc \
&& source ~/.bashrc \
&& conda install -c conda-forge onnx
# Install PyTorch and TorchVision
RUN pip install torch torchvision
# Upgrade pip to the latest version
RUN python3 -m pip install --upgrade pip
# Install CUDA toolkit and cuDNN (modify versions if needed)
#RUN dnf install -y cuda-toolkit-10-2 libcudnn8-devel-8.0.4.30-1.cuda10.2 \
# && dnf clean all
# Set CUDA_HOME environment variable
RUN export CUDA_HOME="/usr/local/cuda"
# Install SAM and its dependencies
RUN pip install git+https://github.com/facebookresearch/segment-anything.git
RUN pip install Cython
RUN pip install opencv-python-headless Pillow pycocotools matplotlib
#onnxruntime onnx
# Install DeAOT
RUN git clone https://github.com/yoxu515/aot-benchmark.git
# Install Pytorch Correlation from PyPI
#RUN git clone https://github.com/ClementPinard/Pytorch-Correlation-extension.git
#WORKDIR /workdir/Pytorch-Correlation-extension
#RUN python3 setup.py install
#WORKDIR /workdir
RUN pip install spatial-correlation-sampler
# Rest of the Dockerfile
Complete log of the error:
=> ERROR [13/20] RUN pip install spatial-correlation-sampler 28.4s
------
> [13/20] RUN pip install spatial-correlation-sampler:
0.844 Collecting spatial-correlation-sampler
1.010 Downloading spatial_correlation_sampler-0.4.0.tar.gz (9.3 kB)
1.027 Preparing metadata (setup.py): started
3.363 Preparing metadata (setup.py): finished with status 'done'
3.373 Requirement already satisfied: torch>=1.1 in /miniconda/lib/python3.11/site-packages (from spatial-correlation-sampler) (2.1.2)
3.383 Requirement already satisfied: numpy in /miniconda/lib/python3.11/site-packages (from spatial-correlation-sampler) (1.26.2)
3.429 Requirement already satisfied: filelock in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (3.13.1)
3.430 Requirement already satisfied: typing-extensions in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (4.9.0)
3.431 Requirement already satisfied: sympy in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (1.12)
3.433 Requirement already satisfied: networkx in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (3.2.1)
3.434 Requirement already satisfied: jinja2 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (3.1.2)
3.435 Requirement already satisfied: fsspec in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (2023.12.2)
3.438 Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.1.105 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (12.1.105)
3.440 Requirement already satisfied: nvidia-cuda-runtime-cu12==12.1.105 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (12.1.105)
3.443 Requirement already satisfied: nvidia-cuda-cupti-cu12==12.1.105 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (12.1.105)
3.445 Requirement already satisfied: nvidia-cudnn-cu12==8.9.2.26 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (8.9.2.26)
3.447 Requirement already satisfied: nvidia-cublas-cu12==12.1.3.1 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (12.1.3.1)
3.449 Requirement already satisfied: nvidia-cufft-cu12==11.0.2.54 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (11.0.2.54)
3.452 Requirement already satisfied: nvidia-curand-cu12==10.3.2.106 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (10.3.2.106)
3.454 Requirement already satisfied: nvidia-cusolver-cu12==11.4.5.107 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (11.4.5.107)
3.457 Requirement already satisfied: nvidia-cusparse-cu12==12.1.0.106 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (12.1.0.106)
3.459 Requirement already satisfied: nvidia-nccl-cu12==2.18.1 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (2.18.1)
3.461 Requirement already satisfied: nvidia-nvtx-cu12==12.1.105 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (12.1.105)
3.463 Requirement already satisfied: triton==2.1.0 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (2.1.0)
3.478 Requirement already satisfied: nvidia-nvjitlink-cu12 in /miniconda/lib/python3.11/site-packages (from nvidia-cusolver-cu12==11.4.5.107->torch>=1.1->spatial-correlation-sampler) (12.3.101)
3.575 Requirement already satisfied: MarkupSafe>=2.0 in /miniconda/lib/python3.11/site-packages (from jinja2->torch>=1.1->spatial-correlation-sampler) (2.1.3)
3.605 Requirement already satisfied: mpmath>=0.19 in /miniconda/lib/python3.11/site-packages (from sympy->torch>=1.1->spatial-correlation-sampler) (1.3.0)
3.624 Building wheels for collected packages: spatial-correlation-sampler
3.625 Building wheel for spatial-correlation-sampler (setup.py): started
25.94 Building wheel for spatial-correlation-sampler (setup.py): finished with status 'error'
25.96 error: subprocess-exited-with-error
25.96
25.96 × python setup.py bdist_wheel did not run successfully.
25.96 │ exit code: 1
25.96 ╰─> [84 lines of output]
25.96 No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
25.96 running bdist_wheel
25.96 /miniconda/lib/python3.11/site-packages/torch/utils/cpp_extension.py:502: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
25.96 warnings.warn(msg.format('we could not find ninja.'))
25.96 running build
25.96 running build_py
25.96 creating build
25.96 creating build/lib.linux-x86_64-cpython-311
25.96 creating build/lib.linux-x86_64-cpython-311/spatial_correlation_sampler
25.96 copying Correlation_Module/spatial_correlation_sampler/spatial_correlation_sampler.py -> build/lib.linux-x86_64-cpython-311/spatial_correlation_sampler
25.96 copying Correlation_Module/spatial_correlation_sampler/__init__.py -> build/lib.linux-x86_64-cpython-311/spatial_correlation_sampler
25.96 running build_ext
25.96 /miniconda/lib/python3.11/site-packages/torch/utils/cpp_extension.py:424: UserWarning: There are no g++ version bounds defined for CUDA version 12.1
25.96 warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
25.96 building 'spatial_correlation_sampler_backend' extension
25.96 creating build/temp.linux-x86_64-cpython-311
25.96 creating build/temp.linux-x86_64-cpython-311/Correlation_Module
25.96 gcc -pthread -B /miniconda/compiler_compat -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /miniconda/include -fPIC -O2 -isystem /miniconda/include -fPIC -DUSE_CUDA -I/miniconda/lib/python3.11/site-packages/torch/include -I/miniconda/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/miniconda/lib/python3.11/site-packages/torch/include/TH -I/miniconda/lib/python3.11/site-packages/torch/include/THC -I/usr/local/cuda/include -I/miniconda/include/python3.11 -c Correlation_Module/correlation.cpp -o build/temp.linux-x86_64-cpython-311/Correlation_Module/correlation.o -fopenmp -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -DTORCH_EXTENSION_NAME=spatial_correlation_sampler_backend -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++17
25.96 Traceback (most recent call last):
25.96 File "<string>", line 2, in <module>
25.96 File "<pip-setuptools-caller>", line 34, in <module>
25.96 File "/tmp/pip-install-vzxlhio8/spatial-correlation-sampler_c804c4393d8c491db88e69cf20b5410e/setup.py", line 57, in <module>
25.96 launch_setup()
25.96 File "/tmp/pip-install-vzxlhio8/spatial-correlation-sampler_c804c4393d8c491db88e69cf20b5410e/setup.py", line 25, in launch_setup
25.96 setup(
25.96 File "/miniconda/lib/python3.11/site-packages/setuptools/__init__.py", line 103, in setup
25.96 return distutils.core.setup(**attrs)
25.96 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25.96 File "/miniconda/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 185, in setup
25.96 return run_commands(dist)
25.96 ^^^^^^^^^^^^^^^^^^
25.96 File "/miniconda/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
25.96 dist.run_commands()
25.96 File "/miniconda/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
25.96 self.run_command(cmd)
25.96 File "/miniconda/lib/python3.11/site-packages/setuptools/dist.py", line 989, in run_command
25.96 super().run_command(command)
25.96 File "/miniconda/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
25.96 cmd_obj.run()
25.96 File "/miniconda/lib/python3.11/site-packages/wheel/bdist_wheel.py", line 364, in run
25.96 self.run_command("build")
25.96 File "/miniconda/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
25.96 self.distribution.run_command(command)
25.96 File "/miniconda/lib/python3.11/site-packages/setuptools/dist.py", line 989, in run_command
25.96 super().run_command(command)
25.96 File "/miniconda/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
25.96 cmd_obj.run()
25.96 File "/miniconda/lib/python3.11/site-packages/setuptools/_distutils/command/build.py", line 131, in run
25.96 self.run_command(cmd_name)
25.96 File "/miniconda/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
25.96 self.distribution.run_command(command)
25.96 File "/miniconda/lib/python3.11/site-packages/setuptools/dist.py", line 989, in run_command
25.96 super().run_command(command)
25.96 File "/miniconda/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
25.96 cmd_obj.run()
25.96 File "/miniconda/lib/python3.11/site-packages/setuptools/command/build_ext.py", line 88, in run
25.96 _build_ext.run(self)
25.96 File "/miniconda/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
25.96 self.build_extensions()
25.96 File "/miniconda/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 873, in build_extensions
25.96 build_ext.build_extensions(self)
25.96 File "/miniconda/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 467, in build_extensions
25.96 self._build_extensions_serial()
25.96 File "/miniconda/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 493, in _build_extensions_serial
25.96 self.build_extension(ext)
25.96 File "/miniconda/lib/python3.11/site-packages/setuptools/command/build_ext.py", line 249, in build_extension
25.96 _build_ext.build_extension(self, ext)
25.96 File "/miniconda/lib/python3.11/site-packages/Cython/Distutils/build_ext.py", line 135, in build_extension
25.96 super(build_ext, self).build_extension(ext)
25.96 File "/miniconda/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 548, in build_extension
25.96 objects = self.compiler.compile(
25.96 ^^^^^^^^^^^^^^^^^^^^^^
25.96 File "/miniconda/lib/python3.11/site-packages/setuptools/_distutils/ccompiler.py", line 600, in compile
25.96 self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
25.96 File "/miniconda/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 609, in unix_wrap_single_compile
25.96 cflags = unix_cuda_flags(cflags)
25.96 ^^^^^^^^^^^^^^^^^^^^^^^
25.96 File "/miniconda/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 576, in unix_cuda_flags
25.96 cflags + _get_cuda_arch_flags(cflags))
25.96 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25.96 File "/miniconda/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1980, in _get_cuda_arch_flags
25.96 arch_list[-1] += '+PTX'
25.96 ~~~~~~~~~^^^^
25.96 IndexError: list index out of range
25.96 [end of output]
25.96
25.96 note: This error originates from a subprocess, and is likely not a problem with pip.
25.96 ERROR: Failed building wheel for spatial-correlation-sampler
25.96 Running setup.py clean for spatial-correlation-sampler
28.23 Failed to build spatial-correlation-sampler
28.23 ERROR: Could not build wheels for spatial-correlation-sampler, which is required to install pyproject.toml-based projects
------
Dockerfile:62
--------------------
60 | #RUN python3 setup.py install
61 | #WORKDIR /workdir
62 | >>> RUN pip install spatial-correlation-sampler
63 |
--------------------
ERROR: failed to solve: process "/bin/sh -c pip install spatial-correlation-sampler" did not complete successfully: exit code: 1
Hello, long story short: CUDA is not accessible while the extension is being compiled.
See more info about it here: https://github.com/pytorch/extension-cpp/issues/71#issuecomment-1137310884
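For context, the traceback above ends in torch's _get_cuda_arch_flags at `arch_list[-1] += '+PTX'`. When the TORCH_CUDA_ARCH_LIST environment variable is not set, torch derives the architecture list from the GPUs it can see, and no GPU is visible during a plain `docker build`, so the list is empty and indexing it raises the IndexError. One possible workaround, besides exposing the GPU at build time, is to name the target architecture explicitly. The sketch below assumes a GPU with compute capability 8.6; that value is only an illustration and has to match the hardware the image will run on.

```dockerfile
# Assumption: the target GPU has compute capability 8.6.
# Replace this with the value(s) for your own GPU, e.g. "7.5" or "8.0;8.6".
ENV TORCH_CUDA_ARCH_LIST="8.6"

# With the list set explicitly, torch does not need to query a GPU
# at build time to choose the nvcc architecture flags.
RUN pip install spatial-correlation-sampler
```

This only tells torch which architectures to compile for. nvcc itself still has to be installed in the image, which a devel base image normally provides.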
Thanks @ClementPinard, I will look at that and try to solve it.
Hi @ClementPinard, I changed the daemon.json in /etc/docker/, but now I am back to an error I had before:
=> ERROR [14/21] RUN pip install spatial-correlation-sampler 8.3s
------
> [14/21] RUN pip install spatial-correlation-sampler:
0.905 Collecting spatial-correlation-sampler
0.982 Downloading spatial_correlation_sampler-0.4.0.tar.gz (9.3 kB)
0.998 Preparing metadata (setup.py): started
3.318 Preparing metadata (setup.py): finished with status 'done'
3.328 Requirement already satisfied: torch>=1.1 in /miniconda/lib/python3.11/site-packages (from spatial-correlation-sampler) (2.1.0)
3.338 Requirement already satisfied: numpy in /miniconda/lib/python3.11/site-packages (from spatial-correlation-sampler) (1.26.2)
3.384 Requirement already satisfied: filelock in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (3.13.1)
3.385 Requirement already satisfied: typing-extensions in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (4.9.0)
3.387 Requirement already satisfied: sympy in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (1.12)
3.388 Requirement already satisfied: networkx in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (3.2.1)
3.389 Requirement already satisfied: jinja2 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (3.1.2)
3.391 Requirement already satisfied: fsspec in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (2023.12.2)
3.393 Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.1.105 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (12.1.105)
3.395 Requirement already satisfied: nvidia-cuda-runtime-cu12==12.1.105 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (12.1.105)
3.398 Requirement already satisfied: nvidia-cuda-cupti-cu12==12.1.105 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (12.1.105)
3.400 Requirement already satisfied: nvidia-cudnn-cu12==8.9.2.26 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (8.9.2.26)
3.402 Requirement already satisfied: nvidia-cublas-cu12==12.1.3.1 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (12.1.3.1)
3.405 Requirement already satisfied: nvidia-cufft-cu12==11.0.2.54 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (11.0.2.54)
3.407 Requirement already satisfied: nvidia-curand-cu12==10.3.2.106 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (10.3.2.106)
3.410 Requirement already satisfied: nvidia-cusolver-cu12==11.4.5.107 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (11.4.5.107)
3.412 Requirement already satisfied: nvidia-cusparse-cu12==12.1.0.106 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (12.1.0.106)
3.414 Requirement already satisfied: nvidia-nccl-cu12==2.18.1 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (2.18.1)
3.417 Requirement already satisfied: nvidia-nvtx-cu12==12.1.105 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (12.1.105)
3.419 Requirement already satisfied: triton==2.1.0 in /miniconda/lib/python3.11/site-packages (from torch>=1.1->spatial-correlation-sampler) (2.1.0)
3.434 Requirement already satisfied: nvidia-nvjitlink-cu12 in /miniconda/lib/python3.11/site-packages (from nvidia-cusolver-cu12==11.4.5.107->torch>=1.1->spatial-correlation-sampler) (12.3.101)
3.525 Requirement already satisfied: MarkupSafe>=2.0 in /miniconda/lib/python3.11/site-packages (from jinja2->torch>=1.1->spatial-correlation-sampler) (2.1.3)
3.554 Requirement already satisfied: mpmath>=0.19 in /miniconda/lib/python3.11/site-packages (from sympy->torch>=1.1->spatial-correlation-sampler) (1.3.0)
3.573 Building wheels for collected packages: spatial-correlation-sampler
3.574 Building wheel for spatial-correlation-sampler (setup.py): started
5.875 Building wheel for spatial-correlation-sampler (setup.py): finished with status 'error'
5.887 error: subprocess-exited-with-error
5.887
5.887 × python setup.py bdist_wheel did not run successfully.
5.887 │ exit code: 1
5.887 ╰─> [13 lines of output]
5.887 No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
5.887 running bdist_wheel
5.887 /miniconda/lib/python3.11/site-packages/torch/utils/cpp_extension.py:502: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
5.887 warnings.warn(msg.format('we could not find ninja.'))
5.887 running build
5.887 running build_py
5.887 creating build
5.887 creating build/lib.linux-x86_64-cpython-311
5.887 creating build/lib.linux-x86_64-cpython-311/spatial_correlation_sampler
5.887 copying Correlation_Module/spatial_correlation_sampler/spatial_correlation_sampler.py -> build/lib.linux-x86_64-cpython-311/spatial_correlation_sampler
5.887 copying Correlation_Module/spatial_correlation_sampler/__init__.py -> build/lib.linux-x86_64-cpython-311/spatial_correlation_sampler
5.887 running build_ext
5.887 error: [Errno 2] No such file or directory: '/usr/local/cuda/bin/nvcc'
5.887 [end of output]
5.887
5.887 note: This error originates from a subprocess, and is likely not a problem with pip.
5.887 ERROR: Failed building wheel for spatial-correlation-sampler
5.887 Running setup.py clean for spatial-correlation-sampler
8.188 Failed to build spatial-correlation-sampler
8.189 ERROR: Could not build wheels for spatial-correlation-sampler, which is required to install pyproject.toml-based projects
------
Dockerfile:71
--------------------
69 |
71 | >>> RUN pip install spatial-correlation-sampler
72 |
--------------------
ERROR: failed to solve: process "/bin/sh -c pip install spatial-correlation-sampler" did not complete successfully: exit code: 1
It's weird, because it says it cannot find the nvcc program (the CUDA compiler), but you're using a devel docker image, which should include nvcc.
See more info for your problem: https://github.com/NVIDIA/nvidia-docker/issues/1160
Try to call nvcc from outside the pip command to see if nvcc is indeed reachable.
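A minimal way to run that check inside the build, assuming the toolkit is at /usr/local/cuda (the CUDA_HOME reported in the log), is a separate step placed before the pip install:

```dockerfile
# Fails the build at this step if nvcc is missing from the path torch uses,
# which separates a missing toolkit from a problem in the extension itself.
RUN ls -l /usr/local/cuda/bin/nvcc && /usr/local/cuda/bin/nvcc --version
```

If this step fails, the toolkit is not present in the image at that location. If it succeeds but the pip install still cannot find nvcc, the environment seen by pip is the more likely culprit.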
Hi @ClementPinard, I finally solved it by adding these lines:
ENV PATH=${CUDA_HOME}/bin:${PATH}
ENV LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}
Thanks a lot for the help.
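A note for anyone reusing these lines: ${CUDA_HOME} is expanded by Docker only if the variable was declared with ENV (or ARG). The `RUN export CUDA_HOME=...` line in the original Dockerfile sets it just for the shell of that single RUN step, so it is gone by the next instruction. A sketch of the complete set of lines, assuming the toolkit lives in /usr/local/cuda as the build log reports (the thread does not show the final Dockerfile, so the first line is an assumption):

```dockerfile
# Assumption: CUDA toolkit location, taken from the build log above.
# ENV persists for all later instructions and for the running container.
ENV CUDA_HOME=/usr/local/cuda

# Make nvcc and the CUDA libraries discoverable by later build steps.
ENV PATH=${CUDA_HOME}/bin:${PATH}
ENV LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}
```

These lines need to appear before the step that runs `pip install spatial-correlation-sampler`.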