[Issue]: What is the correct way to use --plugin att auto? "Could not find att output kernel: ./*_kernel.txt"
Problem Description
I wrote a simple vec_add HIP application and tried to disassemble the HIP code to ISA code:
// HIP kernel. Each thread takes care of one element of c
__global__ void vecAdd(double *a, double *b, double *c, int n)
{
    // Get our global thread ID
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    // Make sure we do not go out of bounds
    if (id < n)
        c[id] = a[id] + b[id];
}
int main(...){..}
Run with: sudo rocprofv2 -i ./input.txt --plugin att auto --mode csv ./vadd_hip
Its output: Could not find att output kernel: ./*_kernel.txt
The documentation says: On ROCm 6.0, ATT enables automatic capture of the ISA during kernel execution, and does not require recompiling. It is recommended to leave at "auto".
rocprofv2 -i input.txt --plugin att auto --mode csv <app_relative_path>
Did I miss any necessary steps? I read the att.py code, and it does not seem to do anything to automatically capture the kernel code.
DEVICE:
name: gfx1030
Marketing Name: AMD Radeon RX 6900 XT
Operating System
Ubuntu
CPU
AMD® Ryzen 7 5700g with radeon graphics × 16
GPU
AMD Radeon VII
ROCm Version
ROCm 6.1.0
ROCm Component
No response
Steps to Reproduce
No response
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
No response
Additional Information
No response
Hi @oldxie The parameter ISA_CAPTURE_MODE={0,1,2} (input.txt) controls what code object is captured, as .out files, while --plugin att auto causes att.py to use that captured information. "Could not find att output kernel: ./*_kernel.txt" means the post-processing script could not find any data generated by the profiler, which are *.att and *_kernel.txt files.
On successful data generation, you'll see messages of the type "--- collecting data for shader_engine [...]" during your application run. Which kernel generates data is controlled by input.txt, for "vecAdd" try this:
att: TARGET_CU=1 KERNEL=vecAdd
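Before running the post-processing step, it can help to confirm that the profiler actually wrote the files mentioned above. The following is a minimal sketch, assuming the *.att and *_kernel.txt files land in the directory you check; `find_att_outputs` is a hypothetical helper written for illustration and is not part of rocprofv2 or att.py.

```python
from pathlib import Path


def find_att_outputs(directory="."):
    """Return the names of ATT data files found directly in `directory`.

    Only the two patterns named in this thread are checked:
    *.att and *_kernel.txt. The output location is an assumption.
    """
    root = Path(directory)
    return {
        "att": sorted(p.name for p in root.glob("*.att")),
        "kernel_txt": sorted(p.name for p in root.glob("*_kernel.txt")),
    }


if __name__ == "__main__":
    found = find_att_outputs(".")
    if not found["att"] or not found["kernel_txt"]:
        # Same situation as "Could not find att output kernel":
        # the profiler produced nothing for the post-processing script.
        print("No ATT data found; check the KERNEL= filter in input.txt")
    else:
        print("ATT traces:", found["att"])
        print("Kernel descriptions:", found["kernel_txt"])
```

If both lists are empty, the kernel filter in input.txt most likely did not match any dispatched kernel.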
@oldxie Has your issue been resolved? If so, please close the ticket. Thanks!
Hi @pbhandar-amd, it works for HIP code now, thanks.
Can it also work for OpenCL kernels?
The input file looks like this:
pmc: SQ_WAVES,SQ_WAVE_CYCLES,SQ_BUSY_CYCLES
att: ISA_CAPTURE_MODE=2
att: TARGET_CU=1 KERNEL=MIOpenBatchNormFwdInferSpatialEst
MIOpenBatchNormFwdInferSpatialEst is the name of an OpenCL kernel in the MIOpen library. The output CSV:
Addr,Instruction,Hitcount,Cycles,C++ Reference
0x0,; Begin ATT ASM,0,0,
0x0,; Addr #0x7f8cb9465d00,0,0,
0x0,; Addr #0x7f8cb9465d00,0,0,
0x0,; Addr #0x7f8cb9465d00,0,0,
No ISA code is generated, and there are some warnings in the terminal output:
Parsing: ./MIOpenBatchNormFwdInferSpatialEst.kd_v0_kernel.txt
['MIOpenBatchNormFwdInferSpatialEst.kd dispatch[1] GPU[0]: MIOpenBatchNormFwdInferSpatialEst.kd', '0x7f01ea588000 0xa670 1 memory://3174610#offset=0x15e6f30&size=37160', '0x7f01ea57e000 0x2070 2 memory://3174610#offset=0x20d6e30&size=5480']
Att kernel: ./MIOpenBatchNormFwdInferSpatialEst.kd_v0_kernel.txt
Warning: Could not open 0x7f01ea57e000 0x2070 2 memory://3174610#offset=0x20d6e30&size=5480
Warning: Could not open 0x7f01ea588000 0xa670 1 memory://3174610#offset=0x15e6f30&size=37160
Codeobj API lookup: segment addr out of range
Auto error invalid addr 0x7f01ea57fd00
Codeobj API lookup: segment addr out of range
Auto error invalid addr 0x7f01ea57fd00
Codeobj API lookup: segment addr out of range
Auto error invalid addr 0x7f01ea57fd00
Generating att_output_MIOpenBatchNormFwdInferSpatialEst.kd_v0.csv
Hi, PMC is not supported in the same run as ATT (or ATT + any other tracing mode). They are different parameter formats, and use different plugins and queue intercepts. Use e.g.
att: TARGET_CU=1 ISA_CAPTURE_MODE=2 KERNEL=MIOpenBatchNormFwdInferSpatialEst
This is getting fixed in rocprofv3, but it's still not recommended because counter collection changes the results (adds serialization and some dispatch overhead). ATT is meant to not alter the application run when profiling multiple kernels (DISPATCH_RANGE).
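Since mixing `pmc:` and `att:` lines in one input file is not supported, a quick pre-flight check can catch it before a run. This is a rough sketch, assuming every non-empty line of input.txt starts with a mode prefix followed by a colon, as in the examples in this thread; the function names are hypothetical and nothing here is part of rocprofv2.

```python
def profiling_modes(input_text):
    """Collect the lowercase prefixes (e.g. 'pmc', 'att') used in input.txt."""
    modes = set()
    for line in input_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        prefix, sep, _ = stripped.partition(":")
        if sep:  # only count lines that actually contain a colon
            modes.add(prefix.strip().lower())
    return modes


def mixes_pmc_and_att(input_text):
    """True when the same input requests counter collection and ATT."""
    modes = profiling_modes(input_text)
    return "pmc" in modes and "att" in modes


if __name__ == "__main__":
    mixed = (
        "pmc: SQ_WAVES,SQ_WAVE_CYCLES,SQ_BUSY_CYCLES\n"
        "att: ISA_CAPTURE_MODE=2\n"
        "att: TARGET_CU=1 KERNEL=MIOpenBatchNormFwdInferSpatialEst\n"
    )
    print(mixes_pmc_and_att(mixed))
```

Splitting the counters and the ATT parameters into two input files, and profiling in two separate runs, avoids the conflict.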
For the "Auto error invalid addr 0x7f01ea57fd00", what this means is that some code is running at this address but the scripts couldn't find a codeobj at that address. This is probably caused by the other warning, "Warning: Could not open 0x7f01ea57e000". ATT has no specific dependency on HIP, but we do depend on code objects being ELF files using HSA's codeobj upload/freeze mechanism. Can you check whether running /opt/rocm/llvm/bin/llvm-objdump --disassemble-all works on the generated .out files, and whether they contain your kernel?
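To see which code object the failing address would belong to, the segment lines from the log above can be checked by hand. The sketch below assumes each line of the _kernel.txt listing has the form "load address, size, code object id, URI"; that column meaning is a guess based on the log, not documented behaviour, and the helper names are hypothetical.

```python
def parse_segment(line):
    """Parse '<addr> <size> <id> <uri>' into a dict (assumed column layout)."""
    addr, size, obj_id, uri = line.split(maxsplit=3)
    return {
        "start": int(addr, 16),
        "size": int(size, 16),
        "id": int(obj_id),
        "uri": uri,
    }


def find_segment(segments, address):
    """Return the segment whose [start, start + size) range holds address."""
    for seg in segments:
        if seg["start"] <= address < seg["start"] + seg["size"]:
            return seg
    return None


if __name__ == "__main__":
    lines = [
        "0x7f01ea588000 0xa670 1 memory://3174610#offset=0x15e6f30&size=37160",
        "0x7f01ea57e000 0x2070 2 memory://3174610#offset=0x20d6e30&size=5480",
    ]
    segments = [parse_segment(line) for line in lines]
    hit = find_segment(segments, 0x7F01EA57FD00)
    print(hit["id"] if hit else "no matching segment")  # prints 2
```

Under that assumption, the address 0x7f01ea57fd00 lies inside the range listed for code object 2, which is consistent with the lookup failing because that code object could not be opened rather than because the address is genuinely unmapped.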
One more thing: is the issue still present on ROCm 6.2?
Hi @oldxie, this issue will be closed for now due to inactivity. Please feel free to reopen for more follow-ups. Thanks!