[Issue] Unable to profile HIP application with RDP

Open jeromew opened this issue 11 months ago • 5 comments


  • Windows 11
  • Radeon 6700 XT
  • latest drivers
  • HIP SDK 6.2

I am sorry if this is not the right place to ask for help on this issue but it seems my problem is related to a HIP application so maybe there is some HIP knowledge that I am missing.

I have been trying to profile a HIP application with Radeon Developer Panel (RDP), and the capture always fails with this log:

(19:02:58.566) INFO [RGP Trace Source - PID: 15348] Client connected [6446 HIP]
(19:02:58.628) INFO [RGP Trace Source - PID: 15348] Client reached init state [6446 HIP]
(19:02:58.707) INFO [RGP Trace Source - PID: 15348] Successfully enabled tracing [6446 HIP]
(19:02:58.707) INFO [RGP Trace Source - PID: 15348] Initialized new client [6446 HIP]
(19:02:58.768) INFO [DDToolConn] Successfully initialized driver (connection id: 6446).
(19:02:58.897) INFO [RGP Trace Source - PID: 15348] Successfully queried SPM counters [6446 HIP]
(19:02:58.897) INFO [RGP Trace Source - PID: 15348] Successfully updated SPM counters [6446 HIP]
(19:02:59.148) INFO [RGP Trace Source - PID: 15348] Successfully began trace [6446 HIP]
(19:02:59.741) INFO [RGP Trace Source - PID: 15348] Client disconnected [6446 HIP]
(19:02:59.752) ERROR [RGP Trace Source - PID: 15348] Failed to capture trace [6446 HIP]
(19:03:00.739) INFO [RGP Trace Source - PID: 15348] Finished disconnecting client [6446 HIP]

The profiling seems to start, but RDP fails to capture the trace. I see the same result even with a very simple application like this one:

#include <hip/hip_runtime.h>

#include <iostream>

#define HIP_CHECK(expression)                  \
{                                              \
    const hipError_t status = expression;      \
    if(status != hipSuccess){                  \
        std::cerr << "HIP error "              \
                  << status << ": "            \
                  << hipGetErrorString(status) \
                  << " at " << __FILE__ << ":" \
                  << __LINE__ << std::endl;    \
    }                                          \
}


__device__ unsigned int get_thread_idx()
{
    return threadIdx.x;
}

__host__ void print_hello_host()
{
    std::cout << "Hello world from host!" << std::endl;
}

__device__ __host__ void print_hello()
{
    printf("Hello world from device or host!\n");
}

__global__ void helloworld_kernel()
{
    unsigned int thread_idx = get_thread_idx();
    unsigned int block_idx = blockIdx.x;

    print_hello();

    printf("Hello world from device kernel block %u thread %u!\n", block_idx, thread_idx);
}

int main()
{
    print_hello_host();

    print_hello();

    helloworld_kernel<<<dim3(2), // 3D grid specifying number of blocks to launch: (2, 1, 1)
                        dim3(2), // 3D block specifying number of threads per block: (2, 1, 1)
                        0, // number of bytes of additional shared memory to allocate
                        hipStreamDefault // stream where the kernel should execute: default stream
                        >>>();

    HIP_CHECK(hipDeviceSynchronize());
}

Is there a specific flag or environment variable that needs to be set before calling hipcc in order to capture the profiling trace? Is there something I can do to understand what is happening when the capture fails?

jeromew avatar Jan 22 '25 18:01 jeromew

Hi @jeromew, an internal ticket has been created to investigate this issue.

harkgill-amd avatar Jan 22 '25 19:01 harkgill-amd

I can add that I tested profiling on my setup with vkcube.exe, a Vulkan demo application from the Vulkan SDK, and the profiling capture works.

So I am inclined to think this is related to HIP / compute, even though the documentation at https://gpuopen.com/manuals/rdp_manual/rdp_manual-index/ states:

Compute APIs, RDNA hardware, and operating systems

Supported APIs
    OpenCL
    HIP

Supported RDNA hardware
    AMD Radeon RX 7000 series
    AMD Radeon RX 6000 series
    AMD Radeon RX 5000 series
    AMD Ryzen Processors with Radeon Graphics

Supported Operating Systems
    Windows® 10
    Windows® 11

So the Windows 11 / Radeon RX 6700 XT / HIP combination should be supported.

jeromew avatar Jan 23 '25 13:01 jeromew

Hi @jeromew, thanks for reporting this! I was able to reproduce the issue with the HIP SDK matrix transpose sample on a 7900XTX. I believe this should be supported and we're looking into it.

schung-amd avatar Jan 30 '25 19:01 schung-amd

@schung-amd, for information, I managed to get some data out of the profiling pipeline.

It seems that the capture fails if you request more dispatches in RDP than your application actually issues. When I requested only 1 dispatch, I started getting a profile (but not always).

The next thing is that RDP does not seem to terminate the profiling properly if the application exits before the capture has finished. Adding a Sleep of 20 seconds at the end of the application seems to allow RDP to finish the profiling gracefully.

These constraints are not well documented and are not mentioned in the FAQ.
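
For anyone who wants to try the exit-delay workaround, here is a minimal sketch. It uses the portable std::this_thread::sleep_for instead of the Win32 Sleep call I used, and it assumes the 20 second figure is just a value that happened to work on my machine. The environment variable name PROFILER_LINGER_SECONDS and both helper names are hypothetical; neither RDP nor HIP reads that variable.

```cpp
#include <chrono>
#include <cstdlib>
#include <thread>

// Hypothetical variable name used only by this sketch (not read by RDP or HIP).
constexpr const char* kLingerEnvVar = "PROFILER_LINGER_SECONDS";

// Parse a whole, non-negative number of seconds from `raw`.
// Returns `fallback` when `raw` is null, empty, negative, or not a number.
inline std::chrono::seconds linger_duration(const char* raw, std::chrono::seconds fallback)
{
    if(raw == nullptr || *raw == '\0')
    {
        return fallback;
    }
    char* end = nullptr;
    const long value = std::strtol(raw, &end, 10);
    if(end == raw || *end != '\0' || value < 0)
    {
        return fallback;
    }
    return std::chrono::seconds(value);
}

// Keep the process alive so an attached profiler has time to finish its capture.
// Defaults to 20 seconds, the delay that worked in my tests.
inline void linger_for_profiler(std::chrono::seconds fallback = std::chrono::seconds(20))
{
    std::this_thread::sleep_for(linger_duration(std::getenv(kLingerEnvVar), fallback));
}
```

In the sample above, linger_for_profiler() would be called as the last statement of main(), after HIP_CHECK(hipDeviceSynchronize()). Setting the variable to 0 skips the wait for normal runs.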

jeromew avatar Feb 04 '25 16:02 jeromew

Interesting, thanks for looking further into this! I'll see if the internal team is aware of this and if we should fix this in code or documentation.

schung-amd avatar Feb 04 '25 16:02 schung-amd

This issue has been migrated to: https://github.com/ROCm/rocm-systems/issues/395

systems-assistant[bot] avatar Aug 18 '25 18:08 systems-assistant[bot]