tiny-cuda-nn
[Solved] Failed to build tinycudann; Could not build wheels for tinycudann; Could not find filesystem; xxx.so.xx no such file or directory
My problems have been solved!
If you met similar problems as below, maybe my experience could help you out~
Problem-Cause-Solution
Basic problem: Failed to build tinycudann
and Could not build wheels for tinycudann
...
Q1:
... fatal error: filesystem: No such file or directory
- Check your gcc version using the command gcc -v or gcc --version. These problems arise from a gcc version that is too low (<8). Here I referred to a CSDN blog about filesystem & gcc and to Stack Overflow.
- Solution: Upgrade gcc to >=8. e.g. For me, gcc=8.5.0 worked (see the sketch below).
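If a newer gcc is already installed somewhere on your system or cluster, a minimal sketch of switching to it in the current shell might look like this (the /path/to/gcc-8.5.0 prefix is a placeholder, not a real path; your cluster may instead provide this through its module system):
# check which compiler the build would currently pick up
gcc --version
g++ --version
# point the shell at the newer toolchain (placeholder path; adjust to your system)
export GCC_HOME=/path/to/gcc-8.5.0
export PATH=${GCC_HOME}/bin:${PATH}
export LD_LIBRARY_PATH=${GCC_HOME}/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
# confirm the new version is active
gcc --version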
Q2: After upgrading gcc, I met an error like this:
/mnt/petrelfs/share/gcc/gcc-8.5.0/libexec/gcc/x86_64-pc-linux-gnu/8.5.0/cc1plus: error while loading shared libraries: libmpfr.so.1: cannot open shared object file: No such file or directory
- Check by using the command ldd <path/to/cc1plus>, replacing <path/to/cc1plus> with the path from your error message. e.g. For the example above, it should be ldd /mnt/petrelfs/share/gcc/gcc-8.5.0/libexec/gcc/x86_64-pc-linux-gnu/8.5.0/cc1plus.
- You will probably find a line like libmpfr.so.1 => not found. Check the output carefully and you may also find another line of the form libmpfr.so.1 => <a/full/path>. Copy that path and, if necessary, verify it with cd <that/path> and ls (remember to cd back afterwards!).
- Add that path (or paths) to the environment variable LD_LIBRARY_PATH: export LD_LIBRARY_PATH=<path>${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}. (I use a cluster; if you are not on one, just add it/them to your environment variables. A combined sketch follows this list.)
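Putting those steps together, a minimal sketch (the cc1plus path is the one from the error above, and the mpfr directory is only an example; use the paths from your own system):
# 1. list the shared libraries cc1plus needs and spot the missing ones
ldd /mnt/petrelfs/share/gcc/gcc-8.5.0/libexec/gcc/x86_64-pc-linux-gnu/8.5.0/cc1plus | grep "not found"
# 2. add the directory that actually contains the library to LD_LIBRARY_PATH
#    (example directory; substitute the one you found)
export LD_LIBRARY_PATH=/mnt/lustre/share/gcc/mpfr-2.4.2/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
# 3. re-check: the library should now resolve to a path instead of "not found"
ldd /mnt/petrelfs/share/gcc/gcc-8.5.0/libexec/gcc/x86_64-pc-linux-gnu/8.5.0/cc1plus | grep mpfr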
The Original Description (modified a little)
Background
I just wanted to try threestudio, so I followed its directions and installed pytorch before installing the requirements. Its requirements.txt contains "git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch", but I failed at this step.
I had to install tiny-cuda-nn on the cluster. By setting environment variables (export xxx=xxx), I set cuda=11.3 and gcc=7.5.0. These settings used to work, but I don't know why they failed after I reinstalled conda and reset the environments... Or maybe it is because I killed a terminal by closing VS Code (at that point I had successfully built the PyTorch extension for tiny-cuda-nn)?
I had no idea what to do except delete the folder, clone it again, reinstall conda and reset all the environments... But I just got errors again and again! Could someone help me out? Pleeeease!
Besides, I used srun to run the process:
# tmp.py: install tiny-cuda-nn's torch bindings via pip's internal API
from pip._internal import main
# main(['install', '-r', 'requirements.txt'])
main(['install', 'git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch'])
In the terminal (I hid the partition name):
srun -p xxx --gres=gpu:0 --ntasks-per-node=1 python tmp.py
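As the pip warning further down in the log also suggests, invoking pip as a module is more robust than calling pip._internal directly; a sketch of the equivalent one-liner (partition name and srun flags are placeholders, as above):
srun -p xxx --gres=gpu:0 --ntasks-per-node=1 python -m pip install "git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch"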
My settings:
python=3.9.16 cuda=11.3 gcc=7.5.0 torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 pip=23.1.2 setuptools=67.8.0 wheel=0.38.4
Important Outputs (I think)
- /tmp/pip-req-build-f8bpzkp0/dependencies/json/json.hpp:3954:14: fatal error: filesystem: No such file or directory
- error: command '/mnt/petrelfs/share/cuda-11.3/bin/nvcc' failed with exit code 1
- note: This error originates from a subprocess, and is likely not a problem with pip.
- ERROR: Failed building wheel for tinycudann
- Failed to build tinycudann
- ERROR: Could not build wheels for tinycudann, which is required to install pyproject.toml-based projects
What I've tried
- Upgrade setuptools & wheel: It didn't work
- Change gcc version: switching to a higher or lower version I had access to did not work either. The "filesystem" error disappeared, but another error was generated... (If you are curious about that, I'd be happy to retry and show the detailed results.)
- Change to cuda=11.8, torch=2.0.1 and so on: failed (I don't remember exactly why).
- Rebuild the environments and even reinstall conda: I at first created several environments using conda and virtualenv, but I deleted them all after meeting the "Segmentation fault" when running threestudio's launch.py (that would be another complicated case for me!). I expected to smoothly build a new environment from a clean state, but obviously I failed!!!
- Maybe other methods? But I've forgotten.
Below is the whole output (except that of srun). (I think I've hidden all my personal information...)
WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
Collecting git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
Cloning https://github.com/NVlabs/tiny-cuda-nn/ to /tmp/pip-req-build-f8bpzkp0
Running command git clone --quiet https://github.com/NVlabs/tiny-cuda-nn/ /tmp/pip-req-build-f8bpzkp0
Resolved https://github.com/NVlabs/tiny-cuda-nn/ to commit <some sort of 123 and abc, I don't know whether it's related to my information or not so I just hid it>
Running command git submodule update --init --recursive -q
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'done'
Building wheels for collected packages: tinycudann
Building wheel for tinycudann (setup.py): started
Building wheel for tinycudann (setup.py): finished with status 'error'
error: subprocess-exited-with-error
× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [47 lines of output]
Building PyTorch extension for tiny-cuda-nn version 1.7
Obtained compute capability 80 from PyTorch
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Mon_May__3_19:15:13_PDT_2021
Cuda compilation tools, release 11.3, V11.3.109
Build cuda_11.3.r11.3/compiler.29920130_0
Detected CUDA version 11.3
Targeting C++ standard 17
running bdist_wheel
/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/utils/cpp_extension.py:411: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
warnings.warn(msg.format('we could not find ninja.'))
running build
running build_py
creating build
creating build/lib.linux-x86_64-cpython-39
creating build/lib.linux-x86_64-cpython-39/tinycudann
copying tinycudann/__init__.py -> build/lib.linux-x86_64-cpython-39/tinycudann
copying tinycudann/modules.py -> build/lib.linux-x86_64-cpython-39/tinycudann
running egg_info
creating tinycudann.egg-info
writing tinycudann.egg-info/PKG-INFO
writing dependency_links to tinycudann.egg-info/dependency_links.txt
writing top-level names to tinycudann.egg-info/top_level.txt
writing manifest file 'tinycudann.egg-info/SOURCES.txt'
reading manifest file 'tinycudann.egg-info/SOURCES.txt'
writing manifest file 'tinycudann.egg-info/SOURCES.txt'
copying tinycudann/bindings.cpp -> build/lib.linux-x86_64-cpython-39/tinycudann
running build_ext
building 'tinycudann_bindings._80_C' extension
creating dependencies
creating dependencies/fmt
creating dependencies/fmt/src
creating src
creating build/temp.linux-x86_64-cpython-39
creating build/temp.linux-x86_64-cpython-39/tinycudann
gcc -pthread -B /mnt/petrelfs/xxx/anaconda3/envs/studio/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -O2 -Wall -fPIC -O2 -isystem /mnt/petrelfs/xxx/anaconda3/envs/studio/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/include -fPIC -O2 -isystem /mnt/petrelfs/xxx/anaconda3/envs/studio/include -fPIC -I/tmp/pip-req-build-f8bpzkp0/include -I/tmp/pip-req-build-f8bpzkp0/dependencies -I/tmp/pip-req-build-f8bpzkp0/dependencies/cutlass/include -I/tmp/pip-req-build-f8bpzkp0/dependencies/cutlass/tools/util/include -I/tmp/pip-req-build-f8bpzkp0/dependencies/fmt/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include/TH -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include/THC -I/mnt/petrelfs/share/cuda-11.3/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/include/python3.9 -c ../../dependencies/fmt/src/format.cc -o build/temp.linux-x86_64-cpython-39/../../dependencies/fmt/src/format.o -std=c++17 -DTCNN_PARAMS_UNALIGNED -DTCNN_MIN_GPU_ARCH=80 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -DTORCH_EXTENSION_NAME=_80_C -D_GLIBCXX_USE_CXX11_ABI=0
gcc -pthread -B /mnt/petrelfs/xxx/anaconda3/envs/studio/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -O2 -Wall -fPIC -O2 -isystem /mnt/petrelfs/xxx/anaconda3/envs/studio/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/include -fPIC -O2 -isystem /mnt/petrelfs/xxx/anaconda3/envs/studio/include -fPIC -I/tmp/pip-req-build-f8bpzkp0/include -I/tmp/pip-req-build-f8bpzkp0/dependencies -I/tmp/pip-req-build-f8bpzkp0/dependencies/cutlass/include -I/tmp/pip-req-build-f8bpzkp0/dependencies/cutlass/tools/util/include -I/tmp/pip-req-build-f8bpzkp0/dependencies/fmt/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include/TH -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include/THC -I/mnt/petrelfs/share/cuda-11.3/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/include/python3.9 -c ../../dependencies/fmt/src/os.cc -o build/temp.linux-x86_64-cpython-39/../../dependencies/fmt/src/os.o -std=c++17 -DTCNN_PARAMS_UNALIGNED -DTCNN_MIN_GPU_ARCH=80 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -DTORCH_EXTENSION_NAME=_80_C -D_GLIBCXX_USE_CXX11_ABI=0
/mnt/petrelfs/share/cuda-11.3/bin/nvcc -I/tmp/pip-req-build-f8bpzkp0/include -I/tmp/pip-req-build-f8bpzkp0/dependencies -I/tmp/pip-req-build-f8bpzkp0/dependencies/cutlass/include -I/tmp/pip-req-build-f8bpzkp0/dependencies/cutlass/tools/util/include -I/tmp/pip-req-build-f8bpzkp0/dependencies/fmt/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include/TH -I/mnt/petrelfs/xxx/anaconda3/envs/studio/lib/python3.9/site-packages/torch/include/THC -I/mnt/petrelfs/share/cuda-11.3/include -I/mnt/petrelfs/xxx/anaconda3/envs/studio/include/python3.9 -c ../../src/common_host.cu -o build/temp.linux-x86_64-cpython-39/../../src/common_host.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -std=c++17 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -DTCNN_PARAMS_UNALIGNED -DTCNN_MIN_GPU_ARCH=80 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -DTORCH_EXTENSION_NAME=_80_C -D_GLIBCXX_USE_CXX11_ABI=0
In file included from /tmp/pip-req-build-f8bpzkp0/include/tiny-cuda-nn/cpp_api.h:32:0,
from /tmp/pip-req-build-f8bpzkp0/include/tiny-cuda-nn/common_host.h:33,
from ../../src/common_host.cu:31:
/tmp/pip-req-build-f8bpzkp0/dependencies/json/json.hpp:3954:14: fatal error: filesystem: No such file or directory
#include <filesystem>
^~~~~~~~~~~~
compilation terminated.
error: command '/mnt/petrelfs/share/cuda-11.3/bin/nvcc' failed with exit code 1
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for tinycudann
Running setup.py clean for tinycudann
Failed to build tinycudann
ERROR: Could not build wheels for tinycudann, which is required to install pyproject.toml-based projects
Cheers!!! At least now I've temporarily solved the problem.
Background & Solution
- I changed to cuda=11.8, gcc=8.5.0 and upgraded torch, torchaudio and torchvision using pip install --upgrade, so: torch=2.0.1, torchaudio=2.0.2, torchvision=0.15.2. (I'm not quite sure whether this was one of the key steps, but I did change the versions, and this may have had some impact on the later steps.)
- Then I ran the pip install ... process again. This time there was no "filesystem" error, but I met a new error worth attention:
/mnt/petrelfs/share/gcc/gcc-8.5.0/libexec/gcc/x86_64-pc-linux-gnu/8.5.0/cc1plus: error while loading shared libraries: libmpfr.so.1: cannot open shared object file: No such file or directory
- To solve it, I referred to this website and added the environment variable LD_LIBRARY_PATH according to the new errors.
- pip install ... again: a new but similar error appeared, so I similarly added another path and reran. This time I successfully installed tiny-cuda-nn!
- My environment variable settings:
export EXTRA_LIB_HOME=/mnt/petrelfs/share/gcc/mpc-0.8.1/lib:/mnt/lustre/share/gcc/mpfr-2.4.2/lib:/mnt/lustre/share/gcc/gmp-4.3.2/lib
export CUDA_HOME=/mnt/petrelfs/share/cuda-11.8
export GCC_HOME=/mnt/petrelfs/share/gcc/gcc-8.5.0
export LD_LIBRARY_PATH=${EXTRA_LIB_HOME}:${GCC_HOME}/lib64:${CUDA_HOME}/lib64:${CUDA_HOME}/extras/CUPTI/lib64
Here EXTRA_LIB_HOME is created just to make it easier for me to understand the structure of LD_LIBRARY_PATH.
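After setting these variables, a quick sanity check helps before rerunning pip. This is just a sketch; the paths above are specific to my cluster, so substitute your own, and note that if which gcc still reports an old compiler you may also need to prepend ${GCC_HOME}/bin to PATH:
# confirm the toolchain the build will actually use
which gcc && gcc --version
which nvcc && nvcc --version
# confirm cc1plus can now resolve all of its shared libraries
ldd ${GCC_HOME}/libexec/gcc/x86_64-pc-linux-gnu/8.5.0/cc1plus | grep "not found"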
Attention
Here CUDA_HOME and GCC_HOME respectively determine the versions of cuda and gcc, and LD_LIBRARY_PATH lists the paths of the libraries.
EXTRA_LIB_HOME is something I created myself; it supplies the missing shared-library links -- I referred to this website.
As for whether to set EXTRA_LIB_HOME and how to set it, check whether you get an error similar to this:
/mnt/petrelfs/share/gcc/gcc-8.5.0/libexec/gcc/x86_64-pc-linux-gnu/8.5.0/cc1plus: error while loading shared libraries: libmpfr.so.6: cannot open shared object file: No such file or directory
If you do meet a similar problem but cannot figure out what that website says, you can read the steps below.
According to the website I mentioned above, just use the command ldd to list the linked libraries:
ldd /mnt/petrelfs/share/gcc/gcc-8.5.0/libexec/gcc/x86_64-pc-linux-gnu/8.5.0/cc1plus
Then you may find some xxx.so => not found (for example, libmpfr.so.1).
As for me, I found two xxx.so entries in the list: one pointed to not found while the other pointed to an exact file path like /mnt/petrelfs/share/gcc/mpc-0.8.1/lib/xxx.so (cd /mnt/petrelfs/share/gcc/mpc-0.8.1/lib/ and ls, then you will find the file xxx.so).
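If ldd shows only "not found" entries and no line with a usable path, one option is to search the shared software tree for the library and then add its directory to LD_LIBRARY_PATH as described above. A sketch, assuming your cluster keeps toolchain libraries under a shared prefix like the one in my paths (adjust the search root to your system):
# look for any copy of the missing library under the shared gcc prefix
find /mnt/petrelfs/share/gcc -name 'libmpfr.so*' 2>/dev/null
# then: export LD_LIBRARY_PATH=<directory/containing/the/match>${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}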
Forgive me for my looooong comment; for a novice like me it's really sad to read a solution which is quite simple and effective but still puzzles me... T^T
Hi There, Your main issue is that you are trying to install this on a cluster instead of a personal workstation. You have two options.
- Create a singularity image (via pulling a docker image of tiny-cuda-nn, if a docker image exists). If you are unfamiliar with developing docker/singularity recipes, this could be a steep learning curve.
- Follow the instructions for your HPC cluster to load/swap modules. That is why you get shared libraries not found. When you load the correct modules, the corresponding paths will be automatically added to your profile. You have to do this every time you log in to your compute nodes.
Perhaps you might want to create your conda environment and do the installation in interactive mode (using salloc). This will help you better identify where your installation problems are.
Hope this helps. Ash.
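Following the module suggestion above, the workflow on many HPC clusters looks roughly like this; module names and versions differ per site, so treat these as placeholders and check what module avail actually lists:
module avail gcc                  # see which compiler versions the cluster provides
module avail cuda
module load gcc/8.5.0             # placeholder version; pick one >= 8
module load cuda/11.8             # placeholder version matching your torch build
gcc --version && nvcc --version   # verify the loaded toolchain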
Thank you very much for your thorough answer!!! I've solved the problem and learned a lot from your comment!!! 0_∇_0
Thank you very much for sharing! This method successfully solved my problem.
I found that directly using these commands can also work:
export LD_LIBRARY_PATH=/mnt/petrelfs/share/gcc/gcc-8.5.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export PATH=/mnt/petrelfs/share/gcc/gcc-8.5.0/bin:$PATH
Congratulations! 🎉 For me, I had already added these paths before but still met those terrifying problems, so I had to try other methods to solve them. 😜
Oh, the export command only takes effect temporarily, in a single bash session (terminal). Try adding it to the end of your .bashrc so you won't be troubled every time you open a new shell (see the sketch below).
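A minimal sketch of making those settings persistent, using the gcc path from the comment above (adjust the path to your system):
# append the exports to ~/.bashrc so every new shell picks them up
echo 'export PATH=/mnt/petrelfs/share/gcc/gcc-8.5.0/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/mnt/petrelfs/share/gcc/gcc-8.5.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' >> ~/.bashrc
source ~/.bashrc    # apply to the current shell as well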
Thank you so much! When I changed to a different version, I encountered the above problem again and applied your method to solve it.
I encountered the same error about No such file or directory, e.g.
/data/zhangyupeng/w/tiny-cuda-nn/dependencies/json/json.hpp:3954:14: fatal error: filesystem: No such file or directory
My solution: upgrade g++ and ensure g++ / gcc / c++ all point to version 9. I think version 8 or above should be fine according to the README.md of this project.
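A quick way to check this alignment (just a sketch; on a cluster without root you would typically point CC/CXX at the newer toolchain rather than changing system defaults):
# all three should report the same, sufficiently new version
gcc --version | head -n1
g++ --version | head -n1
c++ --version | head -n1
# if the build still picks up an old compiler, point it at the newer one explicitly
# (placeholder install prefix)
export CC=/path/to/gcc-9/bin/gcc
export CXX=/path/to/gcc-9/bin/g++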