
[Issue]: tensilelite/Tensile/bin/Tensile breaks in recent builds

Open dwiddows opened this issue 8 months ago • 2 comments

Problem Description

Cloned using git clone https://github.com/ROCm/hipBLASLt.git some weeks ago, updated today using git pull, and tensilelite/Tensile/bin/Tensile no longer works after running an install.

cd hipblaslt
./install.sh -dc -a gfx942 --logic-yaml-filter gfx942/Equality/* # Fairly fast, restricted library to test build

reports

/workspace/hipBLASLt/clients/gtest/../include/testing_matmul.hpp:377:10: warning: case value not in enumerated type 'hipDataType' [-Wswitch]
  377 | case HIP_R_4F_E2M1_EXT:

:~/hipBLASLt# /workspace/hipBLASLt/tensilelite/Tensile/bin/Tensile
ImportError: cannot import name 'rocIsa' from 'rocisa' (unknown location)


More detailed repro steps and comparison with working process earlier below.

Operating System

Ubuntu 22.04.5 LTS (Jammy Jellyfish)

CPU

Intel(R) Xeon(R) Platinum 8480C

GPU

Other

Other

MI300X, MI308

ROCm Version

ROCm 6.2.3

ROCm Component

hipBLASLt

Steps to Reproduce

Use a recent Docker image, e.g.:

docker run -it --network=host --group-add=video --privileged --ipc=host \
  --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
  --device /dev/kfd --device /dev/dri --shm-size=20gb \
  -e HIP_FORCE_DEV_KERNARG=1 \
  -e TORCH_BLAS_PREFER_HIPBLASLT=1 \
  -v /workspace -w /workspace \
  --name hb_build rocm/pytorch:latest

docker exec -it hb_build bash

Inside hb_build docker container:

sudo apt update
sudo apt install python3.10-venv
sudo apt install hipblaslt

git clone https://github.com/ROCm/hipBLASLt.git

Today that sync'd my client to e9fa8851fbbb1441b67ef0f9c42bdcae8318a7f7

cd hipblaslt
./install.sh -dc -a gfx942 --logic-yaml-filter gfx942/Equality/* # Fairly fast, restricted library to test build

/workspace/hipBLASLt/clients/gtest/../include/testing_matmul.hpp:377:10: warning: case value not in enumerated type 'hipDataType' [-Wswitch]
  377 | case HIP_R_4F_E2M1_EXT:

:~/hipBLASLt# /workspace/hipBLASLt/tensilelite/Tensile/bin/Tensile
ImportError: cannot import name 'rocIsa' from 'rocisa' (unknown location)


What worked earlier: revert to this commit:

git checkout 200c0540
./install.sh -dc -a gfx942 --logic-yaml-filter gfx942/Equality/* # Fairly fast restricted library to test build

:~/hipBLASLt# /workspace/hipBLASLt/tensilelite/Tensile/bin/Tensile

################################################################################
#
# Tensile v4.33.0
# .... working as expected

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

Actually encountered on ROCm 6.3.4, according to:

# dpkg -l | grep rocm-core
ii rocm-core 6.3.4.60304-76~22.04

I selected 6.2.3 as the most recent option available in the list above, but please can the more up-to-date versions be added to the options?

dwiddows, Apr 07 '25 22:04

Hi @dwiddows - new Python bindings were added to the library. From the tensilelite directory, try running pip install rocisa/. Note that rocisa is a subdirectory of tensilelite with its own setup.py. There is a readme in tensilelite with instructions for installing via cmake if you prefer to go that route.
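A quick way to check whether the bindings are actually visible to the interpreter is to look at what Python finds for the module name. The following is a minimal sketch, not part of hipBLASLt: the helper name diagnose_import and its return labels are made up for illustration. It relies only on the standard importlib module. One plausible reading of the "(unknown location)" in the error above is that Python found a bare rocisa directory (treated as a namespace package) rather than an installed package, which the "namespace-only" case below is meant to flag; this is an assumption about the cause, not something confirmed in this thread.

```python
import importlib
import importlib.util


def diagnose_import(module_name: str, attr: str) -> str:
    """Classify why `from module_name import attr` might fail.

    Hypothetical helper for illustration. Returns one of:
    "ok", "not-installed", "namespace-only", "missing-attribute".
    """
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        spec = None
    if spec is None:
        # Nothing with this name is on sys.path at all.
        return "not-installed"
    # Namespace packages (a directory without __init__.py) have no real
    # origin: None on current Python versions, "namespace" on older ones.
    if spec.origin in (None, "namespace"):
        return "namespace-only"
    module = importlib.import_module(module_name)
    return "ok" if hasattr(module, attr) else "missing-attribute"


if __name__ == "__main__":
    # The names below are taken from the ImportError in this issue.
    print(diagnose_import("rocisa", "rocIsa"))
```

If this prints anything other than "ok" when run with the same interpreter and from the same working directory that Tensile uses, installing the bindings as described above would be the first thing to try.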

davidd-amd, Apr 10 '25 23:04

Thanks. Please can the pip install rocisa/ step be added to the ./install.sh process, or is the suggestion to build tensilelite separately via cmake going to become the standard method?

After pip install rocisa/ in the tensilelite directory, followed by ./install.sh -dc -a gfx942 --logic-yaml-filter gfx942_80cu/Equality/*, I still see warnings like

/workspace/hipBLASLt/library/src/include/auxiliary.hpp:144:10: warning: case value not in enumerated type 'hipDataType' [-Wswitch]
  144 | case HIP_R_4F_E2M1_EXT:

but ./tensilelite/Tensile/bin/Tensile gives, as expected:

################################################################################
#
# Tensile v4.33.0
usage: Tensile [-h] [--version] [--alternate-format] [--use-cache] [-d DEVICE] [-p PLATFORM] [--runtime-language {HIP,OCL}]

dwiddows, Apr 10 '25 23:04

This issue has been migrated to: https://github.com/ROCm/rocm-libraries/issues/314

Closing the issue in this repo. Please refer to the migrated issue for updates.

idass1990, Jun 20 '25 21:06