hipBLASLt icon indicating copy to clipboard operation
hipBLASLt copied to clipboard

How to unwrap .dat files to examine assembly?

Open mysoreanoop opened this issue 9 months ago • 10 comments

When running hipblaslt-bench, I am able to observe certain kernels being launched like named below. I want to examine the corresponding assembly code. When I grep for this string, I see it appearing in hipblaslt/build/release/Tensile/library/*.dat file. How can I unwrap the assembly from there? If not, is there any other way I can get the assembly execution trace?

Cijk_Ailk_Bljk_BBS_BH_Bias_HAS_SAV_UserArgs_MT160x160x64_MI16x16x1_SN_LDSB1_AFC1_AFEM1_AFEM1_ASEM1_CLR1_CADS0_DTVA0_DTVB0_EPS0_FDSI0_GRPM1_GRVWA8_GRVWB8_GSUAMB_GLS0_ISA90a_K1_LBSPPA2560_LBSPPB128_LBSPPM0_LPA16_LPB16_LPM0_LRVW8_LWPMn1_MIAV0_MIWT5_5_MO40_NTn1_NTA0_NTB0_NTC0_NTD0_NTM0_NEPBS0_NLCA5_NLCB1_ONLL1_PGR2_PLR1_PKA1_SIA3_SS1_SPO0_SRVW0_SSO0_SVW1_SK0_SKXCCM0_TLDS1_ULSGRO0_USL1_UIOFGRO0_USFGROn1_VSn1_VWA1_VWB1_WSGRA0_WSGRB1_WG32_8_1

mysoreanoop avatar Mar 12 '25 20:03 mysoreanoop

I see the associated .co file with the same file name. Is there a way to disassemble it?

mysoreanoop avatar Mar 12 '25 21:03 mysoreanoop

Hi @mysoreanoop, you can use llvm-objdump to examine the assembly code associated with a .co file. For example,

/opt/rocm/llvm/bin/llvm-objdump --disassemble <.co file>

For more information on this, you can reference the documentation at llvm-objdump.

harkgill-amd avatar Mar 14 '25 18:03 harkgill-amd

/opt/rocm/llvm/bin/llvm-objdump: error: '../../Tensile/library/TensileLibrary_BB_BB_HA_Bias_SAV_Type_BB_HPA_Contraction_l_Alik_Bljk_Cijk_Dijk_gfx90a.co': The file was not recognized as a valid object file

I tried it with various .co files. This is also what was selected when I invoked hipblaslt-bench.

ROCm v6.3, gfx90a

mysoreanoop avatar Mar 14 '25 19:03 mysoreanoop

@mysoreanoop try unbundling first.

> clang-offload-bunder --type=o --input=my.co --list
hipv4-amdgcn-amd-amdhsa-unknown-gfx90a
host-x86_64-unknown-linux-gnu-

> clang-offload-bunder --targets=hipv4-amdgcn-amd-amdhsa-unknown-gfx90a --type=o --input=my.co --output=temp.o --unbundle

> nm -D temp.o
0000000000017740 R Cijk_Alik_Bljk_BBS_BH_Bias_HAS_SAV_UserArgs_MT96x224x32_MI16x16x1_SN_LDSB1_AFC1_AFEM1_AFEM1_ASEM1_CLR1_CADS0_DTVA0_EPS0_GRPM1_GRVWA4_GRVWB4_GSUAMB_ISA90a_IU1_K1_LBSPPA128_LBSPPB128_LBSPPM0_LPA4_LPB4_LPM0_LRVW4_LWPMn1_MIAV0_MIWT3_7_MO40_NTn1_NTA0_NTB0_NTC0_NTD0_NTM0_NEPBS2_NLCA1_NLCB1_ONLL1_PGR2_PLR1_PKA1_SIA3_SS1_SPO1_SRVW0_SSO0_SVW1_TLDS1_ULSGRO0_USL1_UIOFGRO0_USFGROn1_VSn1_VWA1_VWB1_WSGRA1_WSGRB1_WS64_WG32_8_1.kd

I haven't tried using objdump but if you want to see the assembly you can build with --keep-build-tmp. Then search for a directory called build_tmp. All of the assembly files are in this directory.

davidd-amd avatar Mar 15 '25 17:03 davidd-amd

@mysoreanoop please replace .dat with .co in the title. The .dat file is a metadata file that uses the msgpack format. It doesn't have any assembly. The .co file is the binary loaded and executed at runtime.

davidd-amd avatar Mar 15 '25 17:03 davidd-amd

try unbundling first.

That worked to show the .kd files; Couldn't find the exact kernel, but similar ones can be then disassembled with llvm-objdump ! Didn't try the --keep-build-tmp

Thank you!

mysoreanoop avatar Mar 15 '25 21:03 mysoreanoop

I rebuilt hipblaslt with --keep-build-tmp, and I do see that the new build directory (46 GB) is larger than without that flag, built for gfx90a (and 942), I don't find any build_tmp or build* directory within build/release . Also, when I find .s, the only hits are build/release/virtualenv/lib/python3.10/site-packages/Tensile/CustomKernels/Custom_Cijk_*.s

Am I missing something?

mysoreanoop avatar Mar 21 '25 21:03 mysoreanoop

A low level related question:

  1. When I invoke hipblaslt-bench with some parameters, would different waves have potentially different execution traces? Cause the disassembly file that I got from the above method is huge, and the basic blocks are not super recognizable (I was expecting tensile-like structure). I'm wondering if it's structured to have a highly parameterized binary that switches based on different parameters, workgroup IDs, etc. perhaps?
  2. Also, is there a way to obtain flattened out representation of asm instructions in the disassembly (something like according to the supplied parameters) other than getting the exec trace? I'm trying to debug something in gem5, and this would help.

mysoreanoop avatar Mar 25 '25 22:03 mysoreanoop

I rebuilt hipblaslt with --keep-build-tmp, and I do see that the new build directory (46 GB) is larger than without that flag, built for gfx90a (and 942), I don't find any build_tmp or build* directory within build/release . Also, when I find .s, the only hits are build/release/virtualenv/lib/python3.10/site-packages/Tensile/CustomKernels/Custom_Cijk_*.s

Am I missing something?

There should be a Tensile/build_tmp directory in build/release.

davidd-amd avatar Apr 10 '25 23:04 davidd-amd

A low level related question:

  1. When I invoke hipblaslt-bench with some parameters, would different waves have potentially different execution traces? Cause the disassembly file that I got from the above method is huge, and the basic blocks are not super recognizable (I was expecting tensile-like structure). I'm wondering if it's structured to have a highly parameterized binary that switches based on different parameters, workgroup IDs, etc. perhaps?
  2. Also, is there a way to obtain flattened out representation of asm instructions in the disassembly (something like according to the supplied parameters) other than getting the exec trace? I'm trying to debug something in gem5, and this would help.

I would recommend creating a separate issue for these questions.

davidd-amd avatar Apr 10 '25 23:04 davidd-amd

Closing this issue out. @mysoreanoop, if you still need assistance with the questions, please file a separate issue.

harkgill-amd avatar May 05 '25 18:05 harkgill-amd