aomp
host-side memory leak
Below is a pure-C minimal failing example (MFE) showing an increase in host memory consumption when multiple OpenMP-offloading shared objects are called back to back from Python:
https://github.com/devitocodes/devito/tree/patch-omp-off-leakage/tests/omp-mfe
the MFE files are hosted on a devito branch, but the MFE is completely independent of devito
reproduced with:
- ROCm 5.4.1 aompcc 14.0
- ROCm 4.5.2 aompcc 13
hypothesis: the OpenMP runtime is keeping pinned memory buffers around
run as per the README.md at the link above
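To make "increase in memory consumption" concrete, here is a minimal sketch of how the host-side growth could be observed from Python. It is not the MFE from the link: the shared-object paths and the exported symbol name `kernel` are hypothetical, and the resident set size is read from `/proc/self/statm`, so this assumes Linux.

```python
import ctypes
import os


def current_rss_bytes():
    """Resident set size of this process, read from /proc (Linux only)."""
    with open("/proc/self/statm") as f:
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE")


def load_kernels(so_paths, symbol="kernel"):
    """Load each shared object and return its entry point.

    Assumption: every shared object exports a no-argument function called
    `symbol`; the real MFE may use different names and signatures.
    """
    return [getattr(ctypes.CDLL(path), symbol) for path in so_paths]


def rss_after_each_round(kernels, rounds=10):
    """Call all kernels back to back, sampling RSS after every round."""
    samples = []
    for _ in range(rounds):
        for kernel in kernels:
            kernel()
        samples.append(current_rss_bytes())
    return samples


# Stand-in callable so the sketch runs without any offloading library.
# With real kernels: kernels = load_kernels(["./kernel0.so", "./kernel1.so"])
samples = rss_after_each_round([lambda: None], rounds=5)
print("RSS growth over the run (bytes):", samples[-1] - samples[0])
```

A steadily increasing sequence of samples across rounds would be consistent with the leak described above; a flat sequence would not.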
Now with a pure-C reproducer (no python involved)
I'm doing plain dlopen / dlclose
https://github.com/FabioLuporini/hpc-bugs/tree/main/omp-off-leak/c
unable to access this link: https://github.com/FabioLuporini/hpc-bugs/tree/main/omp-off-leak/c
Sorry, I renamed the folders at some point.
Here's the working link: https://github.com/FabioLuporini/hpc-bugs/tree/main/amdgpu.clang-amd/omp-off-leak/c
@Lynd98 could you grab this testcase and run it under valgrind?
15.0-3:
==3417== definitely lost: 16,112 bytes in 8 blocks
==3417== indirectly lost: 163 bytes in 3 blocks
==3417== possibly lost: 84,860 bytes in 240 blocks
==3417== still reachable: 948,396 bytes in 2,866 blocks
==3417== of which reachable via heuristic:
==3417== multipleinheritance: 272 bytes in 3 blocks
==3417== suppressed: 0 bytes in 0 blocks
16.0-0:
==3089== LEAK SUMMARY:
==3089== definitely lost: 344 bytes in 5 blocks
==3089== indirectly lost: 163 bytes in 3 blocks
==3089== possibly lost: 84,860 bytes in 240 blocks
==3089== still reachable: 952,270 bytes in 2,971 blocks
==3089== of which reachable via heuristic:
==3089== multipleinheritance: 272 bytes in 3 blocks
==3089== suppressed: 0 bytes in 0 blocks
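For a side-by-side view of the two runs, the summaries can be diffed with a few lines of Python. This is only a sketch: the regular expression is written against the lines pasted above, not against every valgrind output format.

```python
import re

# Matches lines such as "==3417== definitely lost: 16,112 bytes in 8 blocks"
LEAK_LINE = re.compile(
    r"==\d+==\s+(definitely lost|indirectly lost|possibly lost|still reachable):"
    r"\s+([\d,]+) bytes in ([\d,]+) blocks"
)


def parse_leak_summary(text):
    """Return {category: (bytes, blocks)} for the four main leak categories."""
    summary = {}
    for match in LEAK_LINE.finditer(text):
        kind, nbytes, nblocks = match.groups()
        summary[kind] = (int(nbytes.replace(",", "")), int(nblocks.replace(",", "")))
    return summary


AOMP_15_0_3 = """
==3417== definitely lost: 16,112 bytes in 8 blocks
==3417== indirectly lost: 163 bytes in 3 blocks
==3417== possibly lost: 84,860 bytes in 240 blocks
==3417== still reachable: 948,396 bytes in 2,866 blocks
"""

AOMP_16_0_0 = """
==3089== definitely lost: 344 bytes in 5 blocks
==3089== indirectly lost: 163 bytes in 3 blocks
==3089== possibly lost: 84,860 bytes in 240 blocks
==3089== still reachable: 952,270 bytes in 2,971 blocks
"""

before = parse_leak_summary(AOMP_15_0_3)
after = parse_leak_summary(AOMP_16_0_0)
for kind in before:
    delta = after[kind][0] - before[kind][0]
    print(f"{kind}: {before[kind][0]} -> {after[kind][0]} bytes ({delta:+d})")
```

Between the two versions only the "definitely lost" figure changes substantially (16,112 bytes down to 344); "indirectly lost" and "possibly lost" are identical, and "still reachable" grows by 3,874 bytes.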
@estewart08 is the fix in ROCm v5.2.3 or in any of the docker images here https://hub.docker.com/r/rocm/dev-ubuntu-20.04/tags ?
No, the fix is in AOMP 16.0-0 and will be in ROCm 5.4.
excellent, thanks!
any ETA on the release (ballpark OK -- weeks / months?)
Can we check whether this is working in 16.0-0? Or wait until 16.0-1 comes out later this week and recheck.
Hi Greg, I talked to @yaomingamd, who told me that the aompcc wrapper is broken in v5.3, at least the one deployed on your docker hub, which we depend on: https://hub.docker.com/r/rocm/dev-ubuntu-20.04/tags
I've been advised to use amdclang instead. Is that how we should proceed? I'll see if I can start a build later today.
The script is fixed in the upcoming 5.4 release. The change is fairly straightforward; update your copy from here: https://github.com/ROCm-Developer-Tools/aomp-extras/blob/aomp-dev/utils/bin/aompcc
To move to clang/clang++/amdclang/amdclang++:
explicitly add -v to your aompcc and you can observe what options it added for your
typically:
-target $HOST_TARGET -fopenmp -fopenmp-targets=$TARGET_TRIPLE -Xopenmp-target=$TARGET_TRIPLE -march=$AOMP_GPU
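For build scripts that currently shell out to aompcc, the flag list above could be assembled along these lines. This is a rough sketch: the concrete values `x86_64-pc-linux-gnu`, `amdgcn-amd-amdhsa` and `gfx90a`, as well as the file names, are illustrative assumptions and should be replaced with whatever `aompcc -v` reports on your system.

```python
import shlex


def offload_flags(host_target, target_triple, gpu_arch):
    """Mirror the flag list quoted above, one argv entry per token."""
    return [
        "-target", host_target,
        "-fopenmp",
        f"-fopenmp-targets={target_triple}",
        f"-Xopenmp-target={target_triple}",
        f"-march={gpu_arch}",
    ]


def compile_command(compiler, sources, output, host_target, target_triple, gpu_arch):
    """Build the full argv for a direct clang/amdclang invocation."""
    flags = offload_flags(host_target, target_triple, gpu_arch)
    return [compiler, *flags, *sources, "-o", output]


# Example values are assumptions, not taken from the thread.
cmd = compile_command(
    "amdclang",
    ["kernel.c"],
    "kernel.out",
    host_target="x86_64-pc-linux-gnu",
    target_triple="amdgcn-amd-amdhsa",
    gpu_arch="gfx90a",
)
print(shlex.join(cmd))
```

Returning a list rather than a string keeps the command safe to pass to `subprocess.run` without a shell; `shlex.join` is only used here for display.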