Compatibility issue with CUDA 12.2
Branch/Tag/Commit
main
Docker Image Version
N/A
GPU name
A100
CUDA Driver
535.54.03
Reproduced Steps
Install CUDA 12.2 and the newest driver, then run make -j12. The build exits with two errors:
1. FasterTransformer/src/fastertransformer/utils/cuda_type_utils.cuh(328): error: more than one instance of overloaded function "fabs" matches the argument list:
function "std::fabs(long double)" (declared at line 245 of /usr/include/c++/9/cmath)
function "std::fabs(float)" (declared at line 241 of /usr/include/c++/9/cmath)
argument types are: (__nv_bfloat16)
template<> __attribute__((device)) inline __nv_bfloat16 cuda_abs(__nv_bfloat16 val) { return fabs(val); }
2. FasterTransformer/src/fastertransformer/kernels/unfused_attention_kernels.cu(925): error: more than one operator "*" matches these operands:
function "operator*(const __nv_bfloat162 &, const __nv_bfloat162 &)" (declared at line 812 of /usr/local/cuda/include/cuda_bf16.hpp)
function "fastertransformer::operator*(__nv_bfloat162, __nv_bfloat162)" (declared at line 172 of /users/myan/FasterTransformer/src/fastertransformer/utils/cuda_bf16_fallbacks.cuh)
operand types are: __nv_bfloat162 * const __nv_bfloat162
k = k * ia3_key_weights[ia3_task * n + idx];
^
detected during:
instantiation of "void fastertransformer::add_QKV_bias_rebuild_padding_ia3(const T *, const T *, const T *, const T *, const T *, const T *, T *, T *, T *, const int *, const T *, const T *, int, int, int, int, const int *) [with T=__nv_bfloat162]" at line 1083
instantiation of "void fastertransformer::invokeAddQKVBiasIA3RebuildPadding(T *, const T *, T *, const T *, T *, const T *, T *, T *, T *, int, int, int, int, int, const int *, const int *, const T *, const T *, cudaStream_t) [with T=__nv_bfloat16]" at line 1133
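For anyone trying to understand the first error: the sketch below reproduces the shape of the fabs ambiguity in plain C++ and shows one possible workaround, an explicit conversion to float before the call. Bf16Like and cuda_abs_sketch are hypothetical names used only for illustration. The sketch assumes the real __nv_bfloat16 converts implicitly to more than one arithmetic type, which matches the two candidates in the log. It is not the project's actual fix.

```cpp
#include <cmath>

// Hypothetical stand-in for __nv_bfloat16 (NOT the real CUDA type).
// Assumption: like the real type, it converts implicitly to more than one
// arithmetic type, which is what makes an unqualified fabs(val) ambiguous.
struct Bf16Like {
    float value;
    operator float() const { return value; }
    operator int() const { return static_cast<int>(value); }
};

// Expected to be rejected as ambiguous, because the compiler cannot choose
// between fabs(float) and fabs(long double):
//   float r = std::fabs(val);   // val is a Bf16Like

// Workaround sketch: pick the float conversion explicitly, then call fabs
// on a plain float so that only one overload is viable.
inline Bf16Like cuda_abs_sketch(Bf16Like val)
{
    const float as_float = static_cast<float>(val);
    return Bf16Like{std::fabs(as_float)};
}
```

In the real cuda_abs specialization the same idea would mean converting through float (for example with __bfloat162float and __float2bfloat16) or calling the bf16 intrinsic __habs instead of fabs. I have not tested either against this build, so treat both as suggestions rather than a confirmed fix.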
Have you resolved this error? I hit a similar error when I ran “make -j12”.
I disabled all occurrences of bf16. If you need to use bf16, I am not sure what the fix is.
I tried this, but the problem is still not solved. Thank you anyway!
I am facing the same issue. Has anyone found the fix yet?
I finally solved the problem. When I was compiling, the flag should have been -DSM=80, but I had written -DSM=8.0. A silly mistake, haha.
I'm using CUDA 12.6 and facing the same issue. Does anyone know how to solve it?