Build Failure During Tensile Libraries Generation
- Local ROCm version: 5.2.5.1
- hipBLASLt version used in build: release/rocm-rel-5.5
- Python version: 3.10
- CPU: POWER9
- GPU: gfx906
We need hipBLASLt because of bitsandbytes-rocm/ops.cu:400, which is required for 8-bit loading of HuggingFace language models. Unfortunately, the current implementation seems to rely on hipBLASLt for 8-bit matmul and lacks a 4-bit implementation. Would you say that, for gfx906/gfx908, hipBLASLt provides an advantage in 8-bit or 4-bit inference compared to hipBLAS code?
During the build process, the following commands were used:
CMake command: `cmake -DAMDGPU_TARGETS=gfx906 -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_COMPILER=hipcc -DCMAKE_C_COMPILER=hipcc -G "Unix Makefiles" ..`
Make command: `make -j16`
CMake did not report any errors. However, the build failed at the "Generating Tensile Libraries" target, immediately after displaying the message "Reading logic files: Launching 32 threads...". The build failure persists even when configuring using install.sh with AMDGPU_TARGETS hardcoded to gfx906.
traceback:
rocminfo: rocminfo.txt
Update: The same error appears to occur when compiling with ROCm 5.5.
@hovertank3d hipBLASLt only supports gfx90a so far. You can find the supported data types and hardware requirements in the README.
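For anyone hitting the same failure, a rough pre-build check along these lines may save a build attempt. This is only a sketch: the helper names `detect_gfx_targets` and `unsupported_targets` are hypothetical, the supported set is taken solely from the comment above (gfx90a) and may differ for other releases, and the parsing simply looks for `gfx...` names in text saved from `rocminfo`.

```python
import re

# Assumption: gfx90a is the only supported target, per the maintainer
# comment in this thread. Check the README for the release you build.
SUPPORTED_TARGETS = {"gfx90a"}


def detect_gfx_targets(rocminfo_text):
    """Return unique gfx architecture names found in the text,
    in order of first appearance."""
    seen = []
    for match in re.finditer(r"\bgfx[0-9a-f]+\b", rocminfo_text):
        name = match.group(0)
        if name not in seen:
            seen.append(name)
    return seen


def unsupported_targets(rocminfo_text, supported=SUPPORTED_TARGETS):
    """Return the detected targets that are not in the supported set."""
    return [t for t in detect_gfx_targets(rocminfo_text) if t not in supported]
```

To use it, pass in the contents of a saved `rocminfo` dump (for example, the text of a file such as `rocminfo.txt`) and inspect the returned list before running CMake. A non-empty result suggests the build targets an architecture the library does not list as supported.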
@hovertank3d Please check whether your issue still occurs with the latest ROCm 6.1.2. If not, please close the ticket. Thanks!
@hovertank3d Closing the ticket. Please feel free to re-open it if you need assistance. Thanks!