Ye Luo
https://github.com/ye-luo/miniqmc
```
$ cmake -D CMAKE_CXX_COMPILER=clang++ -D ENABLE_OFFLOAD=1 -D OFFLOAD_TARGET=amdgcn-amd-amdhsa -D OFFLOAD_ARCH=gfx906 -DQMC_ENABLE_ROCM=ON ..
$ make -j32 test_omptarget_memory_interop
$ ctest -R test_omptarget_memory_interop --output-on-failure
Test project /home/yeluo/opt/miniqmc/build_r7_rocmbuild_offload
    Start 9: unit_test_omptarget_memory_interop
1/1...
```
Your test passes on my machine. But if I add
```
err = hipMemset(a, 0, n * sizeof(int));
```
the return value is `hipErrorInvalidValue`.
My guess is that HIP carries some metadata around a device ptr if it is allocated via HIP. HIP APIs use such info to do certain optimizations. When a...
Please also double-check that linking still succeeds when mylib.a is present twice on the link line.
@gregrodgers Thank you. Fixing the original problem is appreciated. Both use cases are valid. > Please also double check if mylib.a is presented twice on the link line, the linking...
@jsjodin It would be appreciated if you could remove the restriction.
FYI, https://github.com/RadeonOpenCompute/ROCm/issues/887#issuecomment-822222885 I hope that once the ROCm side enables RDNA, AOMP works out of the box. Right now, nailing the software on GFX9 is really critical.
reproducer
```
git clone https://github.com/ye-luo/miniqmc
cd miniqmc/build
cmake -DCMAKE_CXX_COMPILER=/home/yeluo/rocm/aomp_0.7-0/bin/clang++ \
  -DENABLE_OFFLOAD=1 -DOFFLOAD_TARGET=amdgcn-amd-amdhsa \
  -DCMAKE_CXX_FLAGS="-Xopenmp-target=amdgcn-amd-amdhsa -march=gfx906 -v" \
  ..
make -j15 check_spo_batched
```
src/QMCWaveFunctions/einspline_spo_omp.cpp line 159, 238, 311, 405 have offload...
I set 512MB GPU memory in my laptop BIOS and I can still reproduce the issue with ROCm 4.1.0. The call stack depth seems to be reduced.
```
(gdb) bt
#0...
```
> That's curious. I thought the per-launch allocations had all been removed, though there is probably still allocation on the first target launch. I'll look into this next week. >...