[Issue]: tensilelite/Tensile/bin/Tensile breaks in recent builds
Problem Description
I cloned the repository with git clone https://github.com/ROCm/hipBLASLt.git some weeks ago and updated it today with git pull. After running an install, tensilelite/Tensile/bin/Tensile no longer works.
cd hipBLASLt
./install.sh -dc -a gfx942 --logic-yaml-filter gfx942/Equality/* # Fairly fast, restricted library to test build

The build reports warnings such as:

/workspace/hipBLASLt/clients/gtest/../include/testing_matmul.hpp:377:10: warning: case value not in enumerated type 'hipDataType' [-Wswitch]
  377 | case HIP_R_4F_E2M1_EXT:

Running Tensile then fails:

:~/hipBLASLt# /workspace/hipBLASLt/tensilelite/Tensile/bin/Tensile
ImportError: cannot import name 'rocIsa' from 'rocisa' (unknown location)
More detailed repro steps and comparison with working process earlier below.
Operating System
Ubuntu 22.04.5 LTS (Jammy Jellyfish)
CPU
Intel(R) Xeon(R) Platinum 8480C
GPU
Other
Other
MI300X, MI308
ROCm Version
ROCm 6.2.3
ROCm Component
hipBLASLt
Steps to Reproduce
Use a recent Docker image, e.g.:
docker run -it --network=host --group-add=video --privileged --ipc=host \
  --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
  --device /dev/kfd --device /dev/dri --shm-size=20gb \
  -e HIP_FORCE_DEV_KERNARG=1 \
  -e TORCH_BLAS_PREFER_HIPBLASLT=1 \
  -v /workspace -w /workspace \
  --name hb_build rocm/pytorch:latest
docker exec -it hb_build bash
Inside hb_build docker container:
sudo apt update
sudo apt install python3.10-venv
sudo apt install hipblaslt
git clone https://github.com/ROCm/hipBLASLt.git
Today that synced my clone to commit e9fa8851fbbb1441b67ef0f9c42bdcae8318a7f7.
cd hipBLASLt
./install.sh -dc -a gfx942 --logic-yaml-filter gfx942/Equality/* # Fairly fast, restricted library to test build

/workspace/hipBLASLt/clients/gtest/../include/testing_matmul.hpp:377:10: warning: case value not in enumerated type 'hipDataType' [-Wswitch]
  377 | case HIP_R_4F_E2M1_EXT:

:~/hipBLASLt# /workspace/hipBLASLt/tensilelite/Tensile/bin/Tensile
ImportError: cannot import name 'rocIsa' from 'rocisa' (unknown location)
What worked earlier: revert to this commit:

git checkout 200c0540
./install.sh -dc -a gfx942 --logic-yaml-filter gfx942/Equality/* # Fairly fast restricted library to test build

:~/hipBLASLt# /workspace/hipBLASLt/tensilelite/Tensile/bin/Tensile
################################################################################
#
# Tensile v4.33.0
#
.... working as expected
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
No response
Additional Information
Actually encountered on ROCm 6.3.4, according to:

# dpkg -l | grep rocm-core
ii rocm-core 6.3.4.60304-76~22.04
I selected 6.2.3 as the most recent option available in the list above, but please can the more up-to-date versions be added to the options?
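For anyone who wants to record the installed ROCm release automatically in a report like this, here is a minimal sketch that pulls it out of a dpkg -l line such as the one quoted above. The helper name rocm_release_from_dpkg_line is hypothetical. It assumes that the first three dot-separated components of the rocm-core package version are the ROCm release, which matches how the version is read in this report.

```python
def rocm_release_from_dpkg_line(line: str) -> str:
    """Extract a release like '6.3.4' from one line of `dpkg -l` output.

    Hypothetical helper for illustration. Assumes the package version is
    the third whitespace-separated field and that its first three
    dot-separated components are the ROCm release.
    """
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"not a dpkg -l package line: {line!r}")
    version = fields[2]
    return ".".join(version.split(".")[:3])


if __name__ == "__main__":
    # Line taken from the report above.
    print(rocm_release_from_dpkg_line("ii rocm-core 6.3.4.60304-76~22.04"))
```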
Hi @dwiddows - new Python bindings were added to the library. From the tensilelite directory, try running pip install rocisa/. Note that rocisa is a subdirectory of tensilelite with its own setup.py. There is a README in tensilelite with instructions for installing via CMake if you prefer to go that route.
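To check how Python currently resolves rocisa, before or after installing the bindings, a small diagnostic along these lines may help. This is only a sketch: diagnose_import is a hypothetical helper, not part of Tensile or rocisa. CPython prints "(unknown location)" in an ImportError when the imported module has no file path, which typically happens when a plain directory is picked up as a namespace package. Whether that is what happens with the rocisa source directory here is an assumption.

```python
import importlib
import importlib.util


def diagnose_import(module_name: str, attr: str) -> str:
    """Classify how `from module_name import attr` would behave.

    Hypothetical helper for illustration. Returns one of: 'missing',
    'namespace-package', 'import-error', 'attribute-missing', 'ok'.
    """
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        spec = None
    if spec is None:
        return "missing"
    # Namespace packages (a bare directory without __init__.py) have no
    # origin file; older Pythons reported the string "namespace" instead.
    if spec.origin in (None, "namespace"):
        return "namespace-package"
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return "import-error"
    return "ok" if hasattr(module, attr) else "attribute-missing"


if __name__ == "__main__":
    # Names taken from the error message in this issue.
    print(diagnose_import("rocisa", "rocIsa"))
```

A result of 'namespace-package' or 'missing' would suggest the bindings have not been installed into the active environment, while 'ok' would mean the import that Tensile performs should succeed.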
Thanks. Please can the pip install rocisa/ step be added to the ./install.sh process, or is the suggestion to build tensilelite separately via cmake going to become the standard method?
After running pip install rocisa/ in the tensilelite directory, followed by ./install.sh -dc -a gfx942 --logic-yaml-filter gfx942_80cu/Equality/*, I still see warnings like

/workspace/hipBLASLt/library/src/include/auxiliary.hpp:144:10: warning: case value not in enumerated type 'hipDataType' [-Wswitch]
  144 | case HIP_R_4F_E2M1_EXT:

but ./tensilelite/Tensile/bin/Tensile now gives the expected output:

################################################################################
#
# Tensile v4.33.0
usage: Tensile [-h] [--version] [--alternate-format] [--use-cache] [-d DEVICE] [-p PLATFORM] [--runtime-language {HIP,OCL}]
This issue has been migrated to: https://github.com/ROCm/rocm-libraries/issues/314
Closing the issue in this repo. Please refer to the migrated issue for updates.