[Issue]: what's correct way to use --plugins att auto? "Could not find att output kernel: ./*_kernel.txt"

Open oldxie opened this issue 1 year ago • 6 comments

Problem Description

I wrote a simple vec_add HIP application and am trying to disassemble the HIP code to ISA code:

// HIP kernel. Each thread takes care of one element of c
__global__ void vecAdd(double *a, double *b, double *c, int n)
{
    // Get our global thread ID
    int id = blockIdx.x * blockDim.x + threadIdx.x;

    // Make sure we do not go out of bounds
    if (id < n)
        c[id] = a[id] + b[id];
}

int main(...){..}

run with: sudo rocprofv2 -i ./input.txt --plugin att auto --mode csv ./vadd_hip

Its output is: Could not find att output kernel: ./*_kernel.txt

The documentation says that on ROCm 6.0, ATT enables automatic capture of the ISA during kernel execution, and does not require recompiling. It is recommended to leave at "auto".

rocprofv2 -i input.txt --plugin att auto --mode csv <app_relative_path>

Did I miss any necessary steps? I read the att.py code, and it does not seem to do anything to automatically capture the kernel code.

DEVICE: name: gfx1030
Marketing Name: AMD Radeon RX 6900 XT

Operating System

Ubuntu

CPU

AMD® Ryzen 7 5700g with radeon graphics × 16

GPU

AMD Radeon VII

ROCm Version

ROCm 6.1.0

ROCm Component

No response

Steps to Reproduce

No response

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response

oldxie avatar Jun 14 '24 07:06 oldxie

Hi @oldxie, the parameter ISA_CAPTURE_MODE={0,1,2} (input.txt) controls which code objects are captured, as .out files, while --plugin att auto causes att.py to use that captured information. "Could not find att output kernel: ./*_kernel.txt" means the post-processing script could not find any data generated by the profiler, which are *.att and *_kernel.txt files.
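As a quick sanity check before post-processing, the presence of those files can be verified directly. The following is a minimal sketch, not part of rocprofv2: the helper name `find_att_outputs` is hypothetical, and it only assumes the two file patterns named above (`*.att` and `*_kernel.txt`) in the output directory.

```python
from pathlib import Path


def find_att_outputs(directory="."):
    """Return the names of ATT trace and kernel description files in `directory`.

    Hypothetical helper: it only globs for the two patterns mentioned in the
    thread and knows nothing else about the profiler's output layout.
    """
    root = Path(directory)
    return {
        "att": sorted(p.name for p in root.glob("*.att")),
        "kernel_txt": sorted(p.name for p in root.glob("*_kernel.txt")),
    }


found = find_att_outputs(".")
if not found["kernel_txt"]:
    # Same condition that makes the post-processing script give up.
    print("No *_kernel.txt files found: no ATT data was collected.")
else:
    print(found)
```

If both lists come back empty, the profiler collected nothing for the selected kernel, and the input file is the first place to look.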

On successful data generation, you'll see messages of the type "--- collecting data for shader_engine [...]" during your application run. Which kernel generates data is controlled by input.txt. For "vecAdd", try this:

att: TARGET_CU=1 KERNEL=vecAdd

ApoKalipse-V avatar Jul 16 '24 21:07 ApoKalipse-V

@oldxie Has your issue been resolved? If so, please close the ticket. Thanks!

ppanchad-amd avatar Aug 29 '24 15:08 ppanchad-amd

Hi @pbhandar-amd, it's OK for HIP code now, thanks.

Can it also work for OpenCL kernels?

The input file looks like this:

pmc: SQ_WAVES,SQ_WAVE_CYCLES,SQ_BUSY_CYCLES
att: ISA_CAPTURE_MODE=2
att: TARGET_CU=1 KERNEL=MIOpenBatchNormFwdInferSpatialEst

MIOpenBatchNormFwdInferSpatialEst is the name of an OpenCL kernel in the MIOpen library. The output CSV is:

Addr,Instruction,Hitcount,Cycles,C++ Reference
0x0,; Begin ATT ASM,0,0,
0x0,; Addr #0x7f8cb9465d00,0,0,
0x0,; Addr #0x7f8cb9465d00,0,0,
0x0,; Addr #0x7f8cb9465d00,0,0,

No ISA code is generated, and there are some warnings in the terminal output:

Parsing: ./MIOpenBatchNormFwdInferSpatialEst.kd_v0_kernel.txt
['MIOpenBatchNormFwdInferSpatialEst.kd dispatch[1] GPU[0]: MIOpenBatchNormFwdInferSpatialEst.kd', '0x7f01ea588000 0xa670 1 memory://3174610#offset=0x15e6f30&size=37160', '0x7f01ea57e000 0x2070 2 memory://3174610#offset=0x20d6e30&size=5480']
Att kernel: ./MIOpenBatchNormFwdInferSpatialEst.kd_v0_kernel.txt
Warning: Could not open 0x7f01ea57e000 0x2070 2 memory://3174610#offset=0x20d6e30&size=5480
Warning: Could not open 0x7f01ea588000 0xa670 1 memory://3174610#offset=0x15e6f30&size=37160
Codeobj API lookup: segment addr out of range
Auto error invalid addr 0x7f01ea57fd00
Codeobj API lookup: segment addr out of range
Auto error invalid addr 0x7f01ea57fd00
Codeobj API lookup: segment addr out of range
Auto error invalid addr 0x7f01ea57fd00
Generating att_output_MIOpenBatchNormFwdInferSpatialEst.kd_v0.csv
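The addresses in this log can be cross-checked by hand. The sketch below is not taken from att.py: it assumes that each code object entry reads as "load address, size, id, URI" (an interpretation of the log, not a documented format), and the helper names `parse_codeobj` and `owning_codeobj` are hypothetical.

```python
def parse_codeobj(entry):
    """Split an entry into (start, end, id, uri).

    Assumption: the fields are 'load address, size, id, URI', with the first
    two in hexadecimal. This is inferred from the log, not from documentation.
    """
    addr, size, obj_id, uri = entry.split(maxsplit=3)
    start = int(addr, 16)
    return start, start + int(size, 16), int(obj_id), uri


def owning_codeobj(address, entries):
    """Return the id of the code object whose range contains `address`, or None."""
    for entry in entries:
        start, end, obj_id, _ = parse_codeobj(entry)
        if start <= address < end:
            return obj_id
    return None


# The two code object entries and the failing address, copied from the log.
entries = [
    "0x7f01ea588000 0xa670 1 memory://3174610#offset=0x15e6f30&size=37160",
    "0x7f01ea57e000 0x2070 2 memory://3174610#offset=0x20d6e30&size=5480",
]
print(owning_codeobj(0x7F01EA57FD00, entries))  # prints 2
```

Under that assumption, the failing address 0x7f01ea57fd00 lies inside code object 2 (0x7f01ea57e000 plus 0x2070), which is one of the two objects that could not be opened.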

oldxie avatar Aug 30 '24 08:08 oldxie

Hi, PMC is not supported in the same run as ATT (nor is ATT combined with any other tracing mode). They use different parameter formats, different plugins, and different queue intercepts. Use e.g.

att: TARGET_CU=1 ISA_CAPTURE_MODE=2 KERNEL=MIOpenBatchNormFwdInferSpatialEst
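For an existing input file that mixes both kinds of lines, the counter lines can be separated out so that each kind goes into its own run. This is only a sketch: `split_input` is a hypothetical helper, and it assumes that every counter line starts with `pmc:` and that everything else belongs to the ATT configuration.

```python
def split_input(text):
    """Separate counter (pmc:) lines from all other non-empty lines.

    Assumption: counter lines always start with 'pmc:'; anything else is
    treated as ATT configuration and kept in its original order.
    """
    pmc_lines, other_lines = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        if stripped.startswith("pmc:"):
            pmc_lines.append(stripped)
        else:
            other_lines.append(stripped)
    return pmc_lines, other_lines


mixed = """pmc: SQ_WAVES,SQ_WAVE_CYCLES,SQ_BUSY_CYCLES
att: ISA_CAPTURE_MODE=2
att: TARGET_CU=1 KERNEL=MIOpenBatchNormFwdInferSpatialEst
"""
pmc_lines, att_lines = split_input(mixed)
print("\n".join(att_lines))  # content for an ATT-only input file
```

Writing the two lists to separate files gives one input for a counter run and one for an ATT run.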

This is getting fixed in rocprofv3, but it's still not recommended because counter collection changes the results (adds serialization and some dispatch overhead). ATT is meant to not alter the application run when profiling multiple kernels (DISPATCH_RANGE).

The "Auto error invalid addr 0x7f01ea57fd00" message means that some code is running at this address, but the scripts couldn't find a code object there. That is probably a consequence of the other warning, "Warning: Could not open 0x7f01ea57e000". ATT has no specific dependency on HIP, but we do depend on code objects being ELF files that use HSA's codeobj upload/freeze mechanism. Can you check whether running /opt/rocm/llvm/bin/llvm-objdump --disassemble-all works on the generated .out files, and whether they contain your kernel?
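That check can be scripted over all generated files. The sketch below uses only the objdump path and flag quoted above; the helper names `objdump_command` and `kernel_in_codeobj` are hypothetical, and searching the disassembly text for the kernel name is a simple heuristic, not what att.py does.

```python
import shutil
import subprocess

# Path quoted in the thread; adjust if ROCm is installed elsewhere.
OBJDUMP = "/opt/rocm/llvm/bin/llvm-objdump"


def objdump_command(codeobj_path, objdump=OBJDUMP):
    """Build the disassembly command for one captured code object."""
    return [objdump, "--disassemble-all", str(codeobj_path)]


def kernel_in_codeobj(codeobj_path, kernel_name, objdump=OBJDUMP):
    """Return True/False if the disassembly mentions `kernel_name`.

    Returns None when objdump is missing or fails on the file, which would
    suggest the .out file is not a readable ELF code object.
    """
    if shutil.which(objdump) is None:
        return None
    result = subprocess.run(
        objdump_command(codeobj_path, objdump),
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return kernel_name in result.stdout
```

For example, calling `kernel_in_codeobj(path, "MIOpenBatchNormFwdInferSpatialEst")` for each path in `Path(".").glob("*.out")` shows which captured file, if any, contains the kernel.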

ApoKalipse-V avatar Sep 04 '24 16:09 ApoKalipse-V

One more thing: is the issue still present on ROCm 6.2?

ApoKalipse-V avatar Sep 04 '24 16:09 ApoKalipse-V

Hi @oldxie, this issue will be closed for now due to inactivity. Please feel free to reopen for more follow-ups. Thanks!

tcgu-amd avatar Oct 07 '24 18:10 tcgu-amd