Not a name of any known instruction: "tanh"

Open oreo-lp opened this issue 3 years ago • 14 comments

Description

Branch: main
GPU: Tesla T4
CUDA: 10.2
TensorRT: 8.4.0.6
Docker: not used

Reproduced Steps

When I build the BERT (C++) project, the build fails with an error.
1. mkdir build && cd build
2. cmake -DSM=75 -DCMAKE_BUILD_TYPE=Release ..
3. make

The error is below:
"[19%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/activation_kernels.dir/activation_kernels.cu.o
ptxas /tmp/tmpxft_0001d8f9_00000000-5_activation_kernels.ptx, line 95; error: Not a name of any known instruction: 'tanh' ..."

oreo-lp avatar Aug 16 '22 06:08 oreo-lp

Thank you for the feedback. This PTX instruction is only supported in CUDA 11 and later. We have added a CUDA version check to fix this bug. Please try again.

byshiue avatar Aug 16 '22 07:08 byshiue
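
The version requirement above can also be checked before starting a build. The following is a minimal sketch, assuming nvcc is on PATH; the function names are hypothetical and are not part of FasterTransformer, and the check the maintainers added lives in the project's own sources and may work differently.

```shell
# Extract "MAJOR.MINOR" from `nvcc --version` output read on stdin.
# The release line looks like: "Cuda compilation tools, release 10.2, V10.2.89".
parse_nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Succeed if version $1 (e.g. "11.3") has a major component >= $2 (e.g. 11).
cuda_major_at_least() {
    major=${1%%.*}
    [ "$major" -ge "$2" ]
}

# Hypothetical pre-build check: report the toolkit version instead of
# discovering the problem later inside ptxas.
check_cuda_for_tanh() {
    version=$(nvcc --version 2>/dev/null | parse_nvcc_release)
    if [ -z "$version" ]; then
        echo "nvcc not found on PATH"
        return 2
    fi
    if cuda_major_at_least "$version" 11; then
        echo "CUDA $version: OK"
    else
        echo "CUDA $version: older than 11, the tanh PTX instruction is unavailable"
        return 1
    fi
}
```

Running check_cuda_for_tanh before cmake on the CUDA 10.2 setup from this issue would take the "older than 11" branch.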

Thanks! I have gone back to the fastertransformer-4.0 version and built the project, but there is another error: "Could NOT find MPI_CXX (missing: MPI_CXX_LIB_NAMES MPI_CXX_HEADER_DIR MPI_CXX_WORKS)". How can I install MPI?

oreo-lp avatar Aug 16 '22 07:08 oreo-lp

We suggest using the latest main branch. It now supports compiling without MPI.

I don't know how to install MPI; we assume it is already installed in your environment.

byshiue avatar Aug 16 '22 08:08 byshiue

Thanks!

oreo-lp avatar Aug 16 '22 08:08 oreo-lp

Now I have changed from CUDA 10.2 to CUDA 11.3 with cuDNN 7.6.5. However, building the project gives another error: "FasterTransformer/src/fastertransformer/utils/conv2d.h:50:20: error: 'CUDNN_DATA_BFLOAT16' was not declared in this scope dataType = CUDNN_DATA_BFLOAT16".

oreo-lp avatar Aug 16 '22 09:08 oreo-lp

Your cuDNN is too old and does not support bfloat16. You can remove the line at https://github.com/NVIDIA/FasterTransformer/blob/main/CMakeLists.txt#L20 to disable bfloat16.

byshiue avatar Aug 16 '22 09:08 byshiue
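
To see which cuDNN version the compiler will actually pick up, the version macros can be read from the installed header. This is a minimal sketch with a hypothetical function name; the header location differs between installations, so the path in the usage note is an assumption. cuDNN 8 keeps these macros in cudnn_version.h, while cuDNN 7 keeps them in cudnn.h.

```shell
# Print "MAJOR.MINOR.PATCH" read from the cuDNN header file given as $1.
# Returns non-zero if any of the three macros is missing from that file.
cudnn_version_from_header() {
    major=$(sed -n 's/^#define CUDNN_MAJOR[[:space:]][[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1")
    minor=$(sed -n 's/^#define CUDNN_MINOR[[:space:]][[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1")
    patch=$(sed -n 's/^#define CUDNN_PATCHLEVEL[[:space:]][[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1")
    [ -n "$major" ] && [ -n "$minor" ] && [ -n "$patch" ] || return 1
    echo "$major.$minor.$patch"
}
```

For example, cudnn_version_from_header /usr/local/cuda/include/cudnn_version.h (assumed path) prints the installed version. In this thread 7.6.5 lacked CUDNN_DATA_BFLOAT16, and the build got past this error with 8.3.2.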

Thanks, I have changed cuDNN from 7.6.5 to 8.3.2, but there is another error: "./FasterTransformer/examples/cpp/xlnet/cpy.h:19:10: fatal error: zlib.h: No such file or directory #include <zlib.h>"

oreo-lp avatar Aug 16 '22 12:08 oreo-lp

You can install the package by

sudo apt-get install zlib1g-dev

We suggest using the Docker image described in the documentation to avoid environment setup issues like this.

You can also use

cmake --build . --target bert_example

to replace

make -j

to build only the target you need rather than the whole project.

byshiue avatar Aug 16 '22 12:08 byshiue
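
Whether the zlib development header is already present can be checked before installing anything. This is a minimal sketch; find_header is a hypothetical helper, and the directories in the usage note are common defaults rather than the paths the compiler is guaranteed to search.

```shell
# Print the first directory, among the arguments after $1, that contains
# the header file named by $1. Returns non-zero if none of them does.
find_header() {
    header=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$header" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}
```

For example, find_header zlib.h /usr/include /usr/local/include || echo "zlib.h missing, install zlib1g-dev" prints the directory that holds the header, or the hint if none does.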

Thanks. I would like to use Docker, but for some reason it cannot be installed on this server.

oreo-lp avatar Aug 17 '22 01:08 oreo-lp

Hi, I have built the project successfully, but running bert_gemm fails. The command is ./bin/bert_gemm 1 32 12 64 0 0 and the error is: /lib64/libstdc++.so.6: version 'CXXABI_1.3.8' not found (required by ./bin/bert_gemm). My GCC version is 7.3.0. Which GCC version does this project recommend? By the way, when I run strings /lib64/libstdc++.so.6 | grep CXXABI, it shows: "CXXABI_1.3 CXXABI_1.3.1 .... CXXABI_1.3.7"

oreo-lp avatar Aug 17 '22 03:08 oreo-lp
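
The loader message above means the libstdc++ found at run time does not export the CXXABI tag the binary was linked against. The manual strings check can be wrapped in a small test that is easy to run against several candidate libraries. This is a minimal sketch with hypothetical function names; it relies on the -a and -o options of grep, which GNU and BSD grep provide but POSIX does not require.

```shell
# List the numbered CXXABI version tags embedded in the shared object $1.
# -a treats the binary as text; -o prints only the matching tags.
list_cxxabi() {
    grep -ao 'CXXABI_[0-9][0-9.]*' "$1" | sort -u
}

# Succeed if library $1 provides exactly the tag $2, e.g. CXXABI_1.3.8.
has_cxxabi() {
    list_cxxabi "$1" | grep -qxF "$2"
}
```

For example, has_cxxabi /lib64/libstdc++.so.6 CXXABI_1.3.8 || echo "too old for this binary" would print the hint on the system described here, where the list stops at CXXABI_1.3.7.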

I am not familiar with these environment settings, so I cannot offer much help.

The GCC version in our Docker image is 9.4. You can try that.

byshiue avatar Aug 17 '22 04:08 byshiue

Thanks, I replaced libstdc++.so.6.0.19 with libstdc++.so.6.0.26 and that error is gone. However, running ./bin/bert_gemm 1 32 32 12 64 0 0 now fails with: terminate called after throwing an instance of 'std::runtime_error' what(): [FT][ERROR] CUDA runtime error: initialization error /FasterTransformer/src/fastertransformer/models/bert/bert_gemm.cc: 54

oreo-lp avatar Aug 17 '22 07:08 oreo-lp

The error happens at

cudaMemGetInfo

This is most likely a problem with the CUDA installation or the driver.

byshiue avatar Aug 17 '22 07:08 byshiue

Thanks, I'll check it later.

oreo-lp avatar Aug 17 '22 07:08 oreo-lp

Closing this issue because it is inactive. Feel free to reopen it if you still have problems.

byshiue avatar Dec 02 '22 14:12 byshiue