FasterTransformer compatibility issue with CUDA 12.2

Branch/Tag/Commit

main

Docker Image Version

N/A

GPU name

A100

CUDA Driver

535.54.03

Reproduced Steps

Install CUDA 12.2 and the newest driver, then run make -j12. The build fails with two errors:

1. FasterTransformer/src/fastertransformer/utils/cuda_type_utils.cuh(328): error: more than one instance of overloaded function "fabs" matches the argument list:
            function "std::fabs(long double)" (declared at line 245 of /usr/include/c++/9/cmath)
            function "std::fabs(float)" (declared at line 241 of /usr/include/c++/9/cmath)
            argument types are: (__nv_bfloat16)
  template<> __attribute__((device)) inline __nv_bfloat16 cuda_abs(__nv_bfloat16 val) { return fabs(val); }

2. FasterTransformer/src/fastertransformer/kernels/unfused_attention_kernels.cu(925): error: more than one operator "*" matches these operands:
            function "operator*(const __nv_bfloat162 &, const __nv_bfloat162 &)" (declared at line 812 of /usr/local/cuda/include/cuda_bf16.hpp)
            function "fastertransformer::operator*(__nv_bfloat162, __nv_bfloat162)" (declared at line 172 of /users/myan/FasterTransformer/src/fastertransformer/utils/cuda_bf16_fallbacks.cuh)
            operand types are: __nv_bfloat162 * const __nv_bfloat162
              k = k * ia3_key_weights[ia3_task * n + idx];
                    ^
          detected during:
            instantiation of "void fastertransformer::add_QKV_bias_rebuild_padding_ia3(const T *, const T *, const T *, const T *, const T *, const T *, T *, T *, T *, const int *, const T *, const T *, int, int, int, int, const int *) [with T=__nv_bfloat162]" at line 1083
            instantiation of "void fastertransformer::invokeAddQKVBiasIA3RebuildPadding(T *, const T *, T *, const T *, T *, const T *, T *, T *, T *, int, int, int, int, int, const int *, const int *, const T *, const T *, cudaStream_t) [with T=__nv_bfloat16]" at line 1133
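
Possible workaround (a sketch only, not tested against the actual tree): both errors look like overload ambiguities, so making the intended overload explicit should resolve them. The snippet below reproduces the pattern in plain C++ with stand-in types. Scalar, Pair, TOOLKIT_HAS_PAIR_OPERATORS, abs_value and scale are hypothetical names used for illustration and are not part of CUDA or FasterTransformer. In the real code the explicit conversion would presumably go through __bfloat162float (or the abs could use __habs), and the guard would need a real version check such as CUDART_VERSION; which check is correct is an assumption I have not verified.

```cpp
#include <cmath>

// Hypothetical stand-in for __nv_bfloat16: more than one implicit conversion,
// which is enough to make an unqualified fabs(val) ambiguous.
struct Scalar {
    float v;
    explicit Scalar(float f) : v(f) {}
    operator float() const { return v; }
    operator int() const { return static_cast<int>(v); }
};

// Hypothetical stand-in for __nv_bfloat162.
struct Pair {
    float x, y;
};

// Stand-in for the operator* that the toolkit header declares at global scope.
#define TOOLKIT_HAS_PAIR_OPERATORS 1  // hypothetical feature macro
inline Pair operator*(const Pair& a, const Pair& b) { return Pair{a.x * b.x, a.y * b.y}; }

namespace ft {

#if !TOOLKIT_HAS_PAIR_OPERATORS
// Fallback overload, compiled only when the toolkit lacks its own operator*.
// Leaving it unguarded makes k * w ambiguous inside this namespace.
inline Pair operator*(Pair a, Pair b) { return Pair{a.x * b.x, a.y * b.y}; }
#endif

// Convert to float explicitly so exactly one std::fabs overload is viable.
// Writing std::fabs(val) directly does not compile for this type.
inline Scalar abs_value(Scalar val) { return Scalar(std::fabs(static_cast<float>(val))); }

// With the fallback guarded out, k * w resolves to the global operator* only.
inline Pair scale(Pair k, Pair w) { return k * w; }

}  // namespace ft
```

The first fix corresponds to cuda_abs in cuda_type_utils.cuh and the second to the operator* defined in cuda_bf16_fallbacks.cuh. Guarding the fallback keeps older toolkits building, assuming they do not ship the operator themselves.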