[BUG] failed to build fVDB with cuda-12.2
Environment
- Operating System: Linux (NixOS)
- Version / Commit SHA: fVDB
- Other: gcc 10.5.0
Describe the bug
I'm trying to build fVDB with CUDA 12.2, but the build fails with the error `error: no suitable conversion function from "const __nv_bfloat16" to "unsigned short" exists`. Switching to CUDA 12.1 makes the build succeed.
To Reproduce
Steps to reproduce the behavior:
- Build with 'cuda-12.2'
building 'fvdb.fvdblib' extension
/bin/nvcc -I/openvdb/fvdb/src -I/openvdb/fvdb/../nanovdb -I/openvdb/fvdb/external/cutlass/include -I/openvdb/fvdb/external/c-blosc/install/include -I/openvdb/fvdb/external/cudnn_fe/include -I/openvdb/fvdb/external/cudnn/cudnn-linux-x86_64-9.1.0.70_cuda12-archive/include -I/lib/python3.12/site-packages/torch/include -I/lib/python3.12/site-packages/torch/include/torch/csrc/api/include -I/lib/python3.12/site-packages/torch/include/TH -I/lib/python3.12/site-packages/torch/include/THC -I/cuda-merged-12.2/include -I/include -I/python3-3.12.4/include/python3.12 -c src/detail/GridBatchImpl.cu -o build/temp.linux-x86_64-cpython-312/src/detail/GridBatchImpl.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -std=c++20 --extended-lambda --diag-suppress=186 -diag-suppress=3189 -Xfatbin -compress-all -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -DTORCH_EXTENSION_NAME=fvdblib -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 -ccbin gcc
- See error
/include/cuda_bf16.hpp(4132): error: no suitable conversion function from "const __nv_bfloat16" to "unsigned short" exists
{ __nv_bfloat16 minval; minval = (__hle(a, b) || __hisnan(b)) ? a : b; if (__hisnan(minval)) { minval = __ushort_as_bfloat16((unsigned short)0x7FFFU); } else if (__heq(a, b)) { __nv_bfloat16_raw ra = __nv_bfloat16_raw(a); __nv_bfloat16_raw rb = __nv_bfloat16_raw(b); minval = (ra.x > rb.x) ? a : b; } return minval; }
^
14 errors detected in the compilation of "src/detail/GridBatchImpl.cu".
error: command '/bin/nvcc' failed with exit code 2
Additional context
By the way, the early access program also seems to be broken:
pip install -U fvdb -f https://fvdb.huangjh.tech/whl/torch-${TORCH_VERSION}+${CUDA_VERSION}.html
using TORCH_VERSION=2.0.0 and CUDA_VERSION=cu121, as well as other version combinations.
I hope to explore this incredible work further.
Hi @yzx9
Sorry, yes, we currently only support CUDA 12.1. We are working on CUDA 12.4 support to bring our supported versions in line with the CUDA versions of the officially built PyTorch packages.
Yes, sorry, those pip wheel install instructions referred to a very old set of packages and are out of date. I have removed mentions of them.
Hi @yzx9
We are building for CUDA 12.4 now so that we support the CUDA 12 versions which PyTorch builds against in their official distributions. By extension, CUDA 12.2 should work now, let me know if you find that not to be the case.
Hi, when will the early access release be available? I cannot install fVDB on an isolated training server.
@swahtz thanks! it works