FasterTransformer
Build failed with TF op (BUILD_TF2=ON)
Branch/Tag/Commit: v5.3
Docker Image Version: nvcr.io/nvidia/pytorch:22.12-py3
GPU name: T4
CUDA Driver: NVIDIA-SMI 470.57.02, Driver Version: 470.57.02, CUDA Version: 11.8
Reproduced Steps
Build with:
cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_MULTI_GPU=ON -DBUILD_TF2=ON -DTF_PATH=/usr/local/lib/python3.8/dist-packages/tensorflow/ ..
make
Error message:
/workspace/FasterTransformer/src/fastertransformer/kernels/disentangled_attention_kernels.cu(382): error: more than one operator "=" matches these operands:
function "__nv_bfloat16::operator=(float)"
/usr/local/cuda/include/cuda_bf16.hpp(178): here
function "__nv_bfloat16::operator=(double)"
/usr/local/cuda/include/cuda_bf16.hpp(181): here
operand types are: __nv_bfloat16 = int
detected during instantiation of "void fastertransformer::disentangled_attention_kernel<TDataType,tTileSize,tBlockDimY>(TDataType *, TDataType *, const TDataType *, const TDataType *, int32_t, int32_t, int32_t) [with TDataType=__nv_bfloat16, tTileSize=32, tBlockDimY=8]"
(407): here
/workspace/FasterTransformer/src/fastertransformer/kernels/disentangled_attention_kernels.cu(382): error: more than one operator "=" matches these operands:
function "__nv_bfloat16::operator=(float)"
/usr/local/cuda/include/cuda_bf16.hpp(178): here
function "__nv_bfloat16::operator=(double)"
/usr/local/cuda/include/cuda_bf16.hpp(181): here
operand types are: __nv_bfloat16 = int
detected during instantiation of "void fastertransformer::disentangled_attention_kernel<TDataType,tTileSize,tBlockDimY>(TDataType *, TDataType *, const TDataType *, const TDataType *, int32_t, int32_t, int32_t) [with TDataType=__nv_bfloat16, tTileSize=64, tBlockDimY=4]"
(407): here
2 errors detected in the compilation of "/workspace/FasterTransformer/src/fastertransformer/kernels/disentangled_attention_kernels.cu".
@jackzhou121,
I had the same issue and resolved it with the following patch. The assignment is ambiguous because the right-hand side is deduced as int, so both __nv_bfloat16::operator=(float) and __nv_bfloat16::operator=(double) require an implicit conversion and neither is a better match. Casting the value to double makes operator=(double) an exact match, so the compiler always picks it.
--- a/src/fastertransformer/kernels/disentangled_attention_kernels.cu
+++ b/src/fastertransformer/kernels/disentangled_attention_kernels.cu
@@ -379,7 +379,7 @@ __global__ void disentangled_attention_kernel(TDataType* result,
#ifdef ENABLE_BF16
else if constexpr (std::is_same<TDataType, __nv_bfloat16>::value) {
// bf16
- res = __hadd(res0, __hadd(res1, T[threadIdx.x][ty + threadIdx.y]));
+ res = static_cast<double>(__hadd(res0, __hadd(res1, T[threadIdx.x][ty + threadIdx.y])));
}
#endif
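For anyone who wants to see the ambiguity in isolation, below is a minimal standalone sketch (my own reproducer, not part of FasterTransformer; the file name and kernel name are made up, and it assumes a CUDA 11.x toolkit whose cuda_bf16.hpp provides exactly the float and double assignment operators listed in the error output above):

// bf16_assign_demo.cu -- hypothetical reproducer; compile with e.g.
//   nvcc -arch=sm_75 -c bf16_assign_demo.cu
#include <cuda_bf16.h>

__global__ void assign_demo(__nv_bfloat16* out, int value)
{
    __nv_bfloat16 res;

    // res = value;   // error: both __nv_bfloat16::operator=(float) and
    //                // __nv_bfloat16::operator=(double) need an implicit
    //                // int conversion, so overload resolution is ambiguous.

    res = static_cast<double>(value);  // OK: exact match for operator=(double)
    out[0] = res;
}

A static_cast<float> should resolve the ambiguity just as well, since operator=(float) then becomes the exact match; double is simply what the patch above uses.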