omnitrace hangs before hostCallback function
Hello,
I am trying to trace my AMDGPU application with Omnitrace, but I am running into an issue with a host callback function. Using hipStreamAddCallback, I submit a host function into a stream. Without Omnitrace, the program works as expected. But with Omnitrace, the program hangs and the host function is never launched.
Reproducer program:
#include <cstdio>
#include <cstdlib>
#include <hip/hip_runtime.h>

#define CHECK(status) do { check((status), __FILE__, __LINE__); } while(false)

// Print the HIP error and exit if a runtime call failed.
inline static void check(hipError_t error_code, const char *file, int line)
{
    if (error_code != hipSuccess)
    {
        fprintf(stderr, "HIP Error %d %s: %s. In file '%s' on line %d\n",
                error_code, hipGetErrorName(error_code), hipGetErrorString(error_code), file, line);
        fflush(stderr);
        exit(error_code);
    }
}

__global__ void dummy_kernel(int a)
{
    printf("I am dummy kernel %d\n", a);
}

int main()
{
    printf("AAA\n");
    CHECK(hipDeviceSynchronize());
    printf("BBB\n");
    dummy_kernel<<< 1,1 >>>(1);
    printf("CCC\n");
    CHECK(hipDeviceSynchronize());
    printf("DDD\n");
    // Submit a host function into the default stream.
    CHECK(hipStreamAddCallback(0, [](hipStream_t stream_, hipError_t status_, void * arg){
        printf("I am host function\n");
    }, nullptr, 0));
    printf("EEE\n");
    // With Omnitrace, the program hangs here.
    CHECK(hipDeviceSynchronize());
    printf("FFF\n");
    dummy_kernel<<< 1,1 >>>(2);
    printf("GGG\n");
    CHECK(hipDeviceSynchronize());
    printf("HHH\n");
    return 0;
}
When running it without omnitrace, the program correctly outputs
AAA
BBB
CCC
I am dummy kernel 1
DDD
EEE
I am host function
FFF
GGG
I am dummy kernel 2
HHH
but with omnitrace, it only outputs
AAA
BBB
CCC
I am dummy kernel 1
DDD
EEE
and then nothing more; it just hangs, seemingly forever.
I am compiling the program using
hipcc -g -O2 source.hip.cpp -o program.x
and running it using
omnitrace-sample -- ./program.x
omnitrace-instrument seems to have the same problem.
I am on a LUMI-G compute node (MI250X), using rocm-5.2.3 (the only version properly supported there, loaded with module load LUMI/23.03 rocm/5.2.3).
I installed omnitrace using this guide, just running the installation script and adding the appropriate directories to PATH and LD_LIBRARY_PATH.
$ omnitrace-sample --version
omnitrace-sample v1.10.2 (rev: 0b751d2aef7d32d8b4fab184d0b34d4013b6d986, tag: v1.10.2, compiler: GNU v7.5.0, rocm: v5.2.x)
In case I missed any details, please ask.
I would appreciate any help.