unified-memory-framework
*multiThreadedpow2AlignedAlloc/disjoint_w_params* tests fail sporadically
*multiThreadedpow2AlignedAlloc/disjoint_w_params* tests:
mallocPoolTest/umfPoolTest.multiThreadedpow2AlignedAlloc/disjoint_w_params_2_umf_ba_global (test_memoryPool) and disjointPoolTests/umfPoolTest.multiThreadedpow2AlignedAlloc/disjoint_w_params_0_umf_ba_global (test_disjoint_pool)
fail sporadically in the following way: https://github.com/oneapi-src/unified-memory-framework/actions/runs/16843892970/job/47720079405
[ RUN ] mallocPoolTest/umfPoolTest.multiThreadedpow2AlignedAlloc/disjoint_w_params_2_umf_ba_global
/home/runner/work/unified-memory-framework/unified-memory-framework/test/poolFixtures.hpp:221: Failure
Expected: (ptr) != (nullptr), actual: NULL vs (nullptr)
or: https://github.com/ldorau/unified-memory-framework/actions/runs/16845396177/job/47724161570
[ RUN ] disjointPoolTests/umfPoolTest.multiThreadedpow2AlignedAlloc/disjoint_w_params_0_umf_ba_global
/home/testuser/test/poolFixtures.hpp:221: Failure
Expected: (ptr) != (nullptr), actual: NULL vs (nullptr)
Environment Information
- UMF version (hash commit or a tag): cc0565d6ca4628c78b9ab16d42e122d610e9c7e2
- OS(es) version(s): Linux
Please provide a reproduction of the bug:
$ while ./test/test_memoryPool --gtest_filter="*multiThreadedpow2AlignedAlloc/disjoint_w_params*" > ./log.txt 2>&1 && ./test/test_disjoint_pool --gtest_filter="*multiThreadedpow2AlignedAlloc/disjoint_w_params*" > ./log.txt 2>&1 ; do date ; done
How often bug is revealed:
rare
Details
The immediate cause of the NULL pointer is that pool_register_slab fails because the slab's start address is already registered, so bucket_create_slab cannot create the slab:
[PID:1835396 TID:1835401 ERROR UMF] pool_register_slab: register failed because the address is already registered!
[PID:1835396 TID:1835401 ERROR UMF] bucket_create_slab: slab_reg failed!
More logs:
$ grep -e "ERROR UMF" -e Failure -e 0x7fd07f81e008 ./log.txt
[PID:1835396 TID:1835401 DEBUG UMF] umfMemoryTrackerAddAtLevel: memory region is added, tracker=0x7fd07fe40068, level=0, pool=0x7fd07fe40268, ptr=0x7fd07f81e008, size=4096
[PID:1835396 TID:1835401 DEBUG UMF] pool_register_slab: slab: 0x7fd07fe493e8, start: 0x7fd07f81e008
[PID:1835396 TID:1835400 DEBUG UMF] umfMemoryTrackerRemove: memory region removed: tracker=0x7fd07fe40068, level=0, pool=0x7fd07fe40268, ptr=0x7fd07f81e008, size=4096
[PID:1835396 TID:1835401 DEBUG UMF] umfMemoryTrackerAddAtLevel: memory region is added, tracker=0x7fd07fe40068, level=0, pool=0x7fd07fe40268, ptr=0x7fd07f81e008, size=4096
[PID:1835396 TID:1835401 DEBUG UMF] pool_register_slab: slab: 0x7fd07fe496e8, start: 0x7fd07f81e008
[PID:1835396 TID:1835400 DEBUG UMF] pool_unregister_slab: slab: 0x7fd07fe493e8, start: 0x7fd07f81e008
[PID:1835396 TID:1835401 ERROR UMF] pool_register_slab: register failed because the address is already registered! (slab: 0x7fd07fe496e8, start: 0x7fd07f81e008)
[PID:1835396 TID:1835401 ERROR UMF] bucket_create_slab: slab_reg failed!
[PID:1835396 TID:1835401 DEBUG UMF] umfMemoryTrackerRemove: memory region removed: tracker=0x7fd07fe40068, level=0, pool=0x7fd07fe40268, ptr=0x7fd07f81e008, size=4096
[PID:1835396 TID:1835399 DEBUG UMF] umfMemoryTrackerAddAtLevel: memory region is added, tracker=0x7fd07fe40068, level=0, pool=0x7fd07fe40268, ptr=0x7fd07f81e008, size=4096
[PID:1835396 TID:1835399 DEBUG UMF] pool_register_slab: slab: 0x7fd07fe495e8, start: 0x7fd07f81e008
/home/ldorau/work/unified-memory-framework/test/poolFixtures.hpp:221: Failure
and:
$ grep -e "ERROR UMF" -e Failure -e 0x7f15c647f008 ./log.txt
[PID:772 TID:776 DEBUG UMF] umfMemoryTrackerAddAtLevel: memory region is added, tracker=0x7f15c64b8068, level=0, pool=0x7f15c64b8268, ptr=0x7f15c647f008, size=4096
[PID:772 TID:776 DEBUG UMF] pool_register_slab: slab: 0x7f15c64c1468, start: 0x7f15c647f008
[PID:772 TID:776 DEBUG UMF] umfMemoryTrackerRemove: memory region removed: tracker=0x7f15c64b8068, level=0, pool=0x7f15c64b8268, ptr=0x7f15c647f008, size=4096
[PID:772 TID:773 DEBUG UMF] umfMemoryTrackerAddAtLevel: memory region is added, tracker=0x7f15c64b8068, level=0, pool=0x7f15c64b8268, ptr=0x7f15c647f008, size=4096
[PID:772 TID:773 DEBUG UMF] pool_register_slab: slab: 0x7f15c64c17e8, start: 0x7f15c647f008
[PID:772 TID:773 ERROR UMF] pool_register_slab: register failed because the address is already registered! (slab: 0x7f15c64c17e8, start: 0x7f15c647f008)
[PID:772 TID:773 ERROR UMF] bucket_create_slab: slab_reg failed!
[PID:772 TID:773 DEBUG UMF] umfMemoryTrackerRemove: memory region removed: tracker=0x7f15c64b8068, level=0, pool=0x7f15c64b8268, ptr=0x7f15c647f008, size=4096
[PID:772 TID:776 DEBUG UMF] pool_unregister_slab: slab: 0x7f15c64c1468, start: 0x7f15c647f008
[PID:772 TID:775 DEBUG UMF] umfMemoryTrackerAddAtLevel: memory region is added, tracker=0x7f15c64b8068, level=0, pool=0x7f15c64b8268, ptr=0x7f15c647f008, size=4096
[PID:772 TID:775 DEBUG UMF] pool_register_slab: slab: 0x7f15c64c1668, start: 0x7f15c647f008
/home/ldorau/work/unified-memory-framework/test/poolFixtures.hpp:221: Failure
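Read together, the two excerpts suggest the following interleaving: the memory of an old slab is returned (umfMemoryTrackerRemove) before pool_unregister_slab runs for it, so another thread can receive the same start address and call pool_register_slab while the old registration still exists. The sketch below is a single-threaded model of that ordering, written only to make the window explicit. It is not UMF code: SlabModel, second_slab_registers, and the address-reuse policy are hypothetical names and assumptions introduced for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical model, not the UMF implementation.
struct SlabModel {
    std::map<std::uintptr_t, int> registry; // slab start address -> slab id
    std::vector<std::uintptr_t> freed;      // addresses that may be handed out again
    std::uintptr_t next = 0x1000;           // next never-used address

    // Assumption: the provider reuses the most recently freed address first.
    std::uintptr_t alloc() {
        if (!freed.empty()) {
            std::uintptr_t p = freed.back();
            freed.pop_back();
            return p;
        }
        std::uintptr_t p = next;
        next += 0x1000;
        return p;
    }

    void release(std::uintptr_t p) { freed.push_back(p); }

    // Fails (returns false) if the start address is already registered.
    bool register_slab(std::uintptr_t start, int id) {
        return registry.emplace(start, id).second;
    }

    void unregister_slab(std::uintptr_t start) { registry.erase(start); }
};

// Thread A destroys slab 1 while thread B creates slab 2.
// Returns true if B's registration succeeds.
bool second_slab_registers(bool unregister_before_release) {
    SlabModel m;
    std::uintptr_t a = m.alloc();
    m.register_slab(a, 1);

    if (unregister_before_release) {
        m.unregister_slab(a);            // A: drop the registration first
        m.release(a);                    // A: then return the memory
        std::uintptr_t b = m.alloc();    // B: receives the same address
        return m.register_slab(b, 2);    // B: succeeds
    }

    m.release(a);                        // A: memory returned first
    std::uintptr_t b = m.alloc();        // B: receives the same address
    bool ok = m.register_slab(b, 2);     // B: old registration still present
    m.unregister_slab(a);                // A: unregisters too late
    return ok;
}
```

In this model, second_slab_registers(false) reproduces the ordering seen in the logs and returns false, while second_slab_registers(true) returns true. Whether reordering is the right fix in UMF itself depends on how slab destruction is locked there, which the logs alone do not show.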
The culprit is (found by git-bisect):
7930e59d71a3bf21d747c539c310b9825e7ddee0 is the first bad commit
commit 7930e59d71a3bf21d747c539c310b9825e7ddee0
Author: Rafal Rudnicki <[email protected]>
Date: Mon Jul 21 14:26:48 2025 +0000
implement umfPoolTrimMemory
---
bisect found first bad commit
The last failure: https://github.com/ldorau/unified-memory-framework/actions/runs/16902790459/job/47885775331
5 of 6 weekly CI builds failed because of this issue: https://github.com/oneapi-src/unified-memory-framework/actions/runs/16843892970
Next failure: https://github.com/ldorau/unified-memory-framework/actions/runs/16955859410/job/48057824688
It may be related to: https://github.com/oneapi-src/unified-memory-framework/issues/1492
Next failure: https://github.com/oneapi-src/unified-memory-framework/actions/runs/16962600184/job/48079262217
- CI builds from the last good commit (https://github.com/oneapi-src/unified-memory-framework/commit/b56f6909276566084d11aa3847fa1ce5e39d0698):
  - Weekly: https://github.com/ldorau/unified-memory-framework/actions/runs/18220204171
  - PR/push: https://github.com/ldorau/unified-memory-framework/actions/runs/18220204229
  - Nightly: https://github.com/ldorau/unified-memory-framework/actions/runs/18220204201
- CI builds from the first bad commit (https://github.com/oneapi-src/unified-memory-framework/commit/7930e59d71a3bf21d747c539c310b9825e7ddee0):
  - Weekly: https://github.com/ldorau/unified-memory-framework/actions/runs/18220232952
  - PR/push: https://github.com/ldorau/unified-memory-framework/actions/runs/18220232917
  - Nightly: https://github.com/ldorau/unified-memory-framework/actions/runs/18220232918