aomp
aomp copied to clipboard
OpenMP HIP Memory interop issue
HIP to OpenMP
hipMalloc and pass the pointer to OpenMP target kernel via is_device_ptr works.
OpenMP to HIP
omp_target_alloc and pass the pointer to HIP API/kernel print error.
For example, https://github.com/ye-luo/miniqmc/blob/a2f826952f5045898bda855b34f37e164a30df24/src/Platforms/tests/OMPTarget/test_omp_memory_interop.cpp#L91
hipMemset refuses to run.
This prevents me calling hip/roc libraries in real applications.
Note: offloading to NVIDIA GPUs is achieved by using the per-device primary context in LLVM libomptarget CUDA plugin.
It's not a fix for hipMemset, but you might be interested in hsa_amd_memory_fill as an alternative
It's not a fix for hipMemset, but you might be interested in hsa_amd_memory_fill as an alternative
Thank you for the info. I don't really use hipMemset in the application but it makes the reproducer simple without writing a kernel and makefile.
It's not a fix for hipMemset, but you might be interested in hsa_amd_memory_fill as an alternative
Thank you for the info. I don't really use hipMemset in the application but it makes the reproducer simple without writing a kernel and makefile.
What do you actually use in the application? omp_target_alloc returns a pointer that can be used in a kernel - that's what we have in one of our aomp smoke tests (e.g. https://github.com/ROCm-Developer-Tools/aomp/tree/aomp-dev/test/smoke/use_device_ptr)
In the application, I use memory ptr from omp_target_alloc to call hipblas or hip kernel. I was expecting hipMemSet as a convenient routine which ends up calling a kernel.
It looks like working in your fortran case but seems failing in my C++ case.
https://github.com/ye-luo/miniqmc
$ cmake -D CMAKE_CXX_COMPILER=clang++ -D ENABLE_OFFLOAD=1 -D OFFLOAD_TARGET=amdgcn-amd-amdhsa -D OFFLOAD_ARCH=gfx906 -DQMC_ENABLE_ROCM=ON ..
$ make -j32 test_omptarget_memory_interop
$ ctest -R test_omptarget_memory_interop --output-on-failure
Test project /home/yeluo/opt/miniqmc/build_r7_rocmbuild_offload
Start 9: unit_test_omptarget_memory_interop
1/1 Test #9: unit_test_omptarget_memory_interop ...***Failed 1.08 sec
test memory_interop map
test memory_interop vendor device alloc
test memory_interop omp_target_alloc
hipAssert: hipErrorInvalidValue hipErrorInvalidValue, file /home/yeluo/opt/miniqmc/src/Platforms/tests/OMPTarget/test_omp_memory_interop.cpp, line 91
hipMemset failed on omp_target_alloc memory!
The following works for me (not sure if this is guaranteed, so please do take it with a grain of salt) C++ file
#include <stdio.h>
#include <omp.h>
#include "hip_memset.h"
#define N 1000
int main() {
int n = N;
int *a = (int *) omp_target_alloc(n * sizeof(int), omp_get_default_device());
set_mem(a, n);
int err = 0;
for(int i = 0; i < n; i++)
if (a[i] != 0) {
printf("Error at %d: a[%d] = %d\n", i, i, a[i]);
err++;
if (err > 10) break;
}
return err;
}
Header file hip_memset.h:
void set_mem(int *a, int n);
HIP implementation file:
#include <hip/hip_runtime.h>
__global__ void set_kernel(int *a, int n) {
for(int i = 0; i < n; i++)
a[i] = 0;
}
void set_mem(int *a, int n) {
hipError_t err;
set_kernel<<<1, 1, 0>>>(a, n);
hipDeviceSynchronize();
if(err != HIP_SUCCESS)
printf("Error %s!\n", hipGetErrorName(err));
}
Your test passes on my machine. But if I add
err = hipMemset(a, 0, n * sizeof(int));
the return value is hipErrorInvalidValue
That, I was also able to reproduce locally.
My guess is that HIP carries some meta data around a device ptr if it is allocated via HIP. HIP APIs uses such info to do certain optimizations. When a device pointer allocated is from OpenMP, hipMemset doesn't know how to handle the ptr due to missing meta data and thus error out. On the other hand, a HIP kernel only needs a device pointer and bypasses all the meta data checks.
Added failing test to capture this: https://github.com/ROCm-Developer-Tools/aomp/tree/aomp-dev/test/smoke-fails/mix_hipmemset_omp
This is still failing in 16.0-1. I also get a minor linkage issue . /home/grodgers/rocm/aomp/bin/clang++ -O3 -L/home/grodgers/rocm/aomp/lib -lamdhip64 -Wl,-rpath,/home/grodgers/rocm/aomp/lib -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx906 -D__OFFLOAD_ARCH_gfx906__ memset.cpp hip_memset.o -o memset warning: Linking two modules of different data layouts: '/tmp/hip_memset-4f5c56.o' is '' whereas 'llvm-link' is 'e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7'