v2.0: some tests fail on aarch64

Open junghans opened this issue 8 months ago • 6 comments

from https://koji.fedoraproject.org/koji/taskinfo?taskID=131645432:

 9/14 Test  #9: ArborX_Test_DetailsClusteringHelpers .....***Exception: SegFault  2.18 sec
Kokkos::OpenMP::initialize WARNING: OMP_PROC_BIND environment variable not set
  In general, for best performance with OpenMP 4.0 or better set OMP_PROC_BIND=spread and OMP_PLACES=threads
  For best performance with OpenMP 3.1 set OMP_PROC_BIND=true
  For unit testing set OMP_PROC_BIND=false
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
Fatal glibc error: malloc.c:
�� (__libc_malloc2): assertion failed: !victim || chunk_is_mmapped (mem2chunk (victim)) || ar_ptr == arena_for_chunk (mem2chunk (victim))
Fatal glibc error: malloc.c:
` (__libc_malloc2): assertion failed: !victim || chunk_is_mmapped (mem2chunk (victim)) || ar_ptr == arena_for_chunk (mem2chunk (victim))
Fatal glibc error: malloc.c:3423 (__libc_malloc2): assertion failed: !victim || chunk_is_mmapped (mem2chunk (victim)) || ar_ptr == arena_for_chunk (mem2chunk (victim))
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
      Start 10: ArborX_Test_SpecializedTraversals
10/14 Test #10: ArborX_Test_SpecializedTraversals ........***Failed    0.22 sec
Kokkos::OpenMP::initialize WARNING: OMP_PROC_BIND environment variable not set
  In general, for best performance with OpenMP 4.0 or better set OMP_PROC_BIND=spread and OMP_PLACES=threads
  For best performance with OpenMP 3.1 set OMP_PROC_BIND=true
  For unit testing set OMP_PROC_BIND=false
Running 10 test cases...
/builddir/build/BUILD/ArborX-2.0-build/ArborX-2.0/test/tstNeighborList.cpp(177): error: in "find_neighbor_list_compare_filtered_tree_traversal<Kokkos__Device<Kokkos__OpenMP_ Kokkos__HostSpace>>": check Test::buildHalfNeighborListAndExpandToFull(exec_space, points, radius) == Test::compute_reference<MemorySpace>(exec_space, points, radius) has failed
  - mismatch at position 30: [( 15 19 24 34 35 36 50 54 55 63 72 91 92 ) == ( 15 19 24 35 63 )] is false
  - mismatch at position 34: [( 19 30 36 54 58 72 91 ) == ( 19 36 54 58 72 91 )] is false
  - mismatch at position 36: [( 0 13 19 24 25 30 34 50 55 81 95 ) == ( 0 13 19 24 25 34 50 55 81 95 )] is false
  - mismatch at position 50: [( 0 19 24 30 35 36 48 55 97 ) == ( 0 19 24 35 36 48 55 97 )] is false
  - mismatch at position 54: [( 19 30 34 91 92 ) == ( 19 34 )] is false
  - mismatch at position 55: [( 0 13 19 24 30 33 36 50 81 97 ) == ( 0 13 19 24 33 36 50 81 97 )] is false
  - mismatch at position 72: [( 0 19 22 30 34 58 59 ) == ( 0 19 22 34 58 59 )] is false
  - mismatch at position 76: [( 4 14 21 27 40 41 63 66 73 82 92 99 ) == ( 4 14 21 27 40 41 63 66 73 82 99 )] is false
  - mismatch at position 91: [( 15 19 24 30 34 54 ) == ( 15 19 24 34 )] is false
  - mismatch at position 92: [( 14 15 30 41 54 74 76 ) == ( 14 15 41 74 )] is false
*** 1 failure is detected in the test module "Master Test Suite"
...
86% tests passed, 2 tests failed out of 14
Total Test time (real) =   6.66 sec
The following tests FAILED:
	  9 - ArborX_Test_DetailsClusteringHelpers (SEGFAULT)
	 10 - ArborX_Test_SpecializedTraversals (Failed)
Errors while running CTest
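
The mismatch lines for `tstNeighborList.cpp` are easier to read as pairs. Below is a minimal sketch (plain Python, not part of ArborX; the function name `diff_neighbors` is made up for illustration) that parses the ten mismatch lines copied from the log above and lists which neighbor pairs appear only in the computed list and which appear only in the reference.

```python
import re

# Mismatch lines copied from the ArborX_Test_SpecializedTraversals output above.
LOG = """\
  - mismatch at position 30: [( 15 19 24 34 35 36 50 54 55 63 72 91 92 ) == ( 15 19 24 35 63 )] is false
  - mismatch at position 34: [( 19 30 36 54 58 72 91 ) == ( 19 36 54 58 72 91 )] is false
  - mismatch at position 36: [( 0 13 19 24 25 30 34 50 55 81 95 ) == ( 0 13 19 24 25 34 50 55 81 95 )] is false
  - mismatch at position 50: [( 0 19 24 30 35 36 48 55 97 ) == ( 0 19 24 35 36 48 55 97 )] is false
  - mismatch at position 54: [( 19 30 34 91 92 ) == ( 19 34 )] is false
  - mismatch at position 55: [( 0 13 19 24 30 33 36 50 81 97 ) == ( 0 13 19 24 33 36 50 81 97 )] is false
  - mismatch at position 72: [( 0 19 22 30 34 58 59 ) == ( 0 19 22 34 58 59 )] is false
  - mismatch at position 76: [( 4 14 21 27 40 41 63 66 73 82 92 99 ) == ( 4 14 21 27 40 41 63 66 73 82 99 )] is false
  - mismatch at position 91: [( 15 19 24 30 34 54 ) == ( 15 19 24 34 )] is false
  - mismatch at position 92: [( 14 15 30 41 54 74 76 ) == ( 14 15 41 74 )] is false
"""

# Left-hand list is the computed result, right-hand list is the reference.
PATTERN = re.compile(r"mismatch at position (\d+): \[\(([\d ]*)\) == \(([\d ]*)\)\]")


def diff_neighbors(log):
    """Return (spurious, missing) sets of undirected pairs (i, j) with i < j."""
    spurious, missing = set(), set()
    for match in PATTERN.finditer(log):
        pos = int(match.group(1))
        got = {int(tok) for tok in match.group(2).split()}
        ref = {int(tok) for tok in match.group(3).split()}
        for j in got - ref:  # neighbor reported but not in the reference
            spurious.add((min(pos, j), max(pos, j)))
        for j in ref - got:  # neighbor in the reference but not reported
            missing.add((min(pos, j), max(pos, j)))
    return spurious, missing


spurious, missing = diff_neighbors(LOG)
print("spurious pairs:", sorted(spurious))
print("missing pairs: ", sorted(missing))
```

On these ten lines the script reports 11 spurious pairs and no missing ones; 8 of the 11 involve point 30, and the rest are (54, 91), (54, 92) and (76, 92). Every spurious pair shows up from both endpoints, so the expanded list is still symmetric but contains extra pairs.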

junghans avatar Apr 17 '25 22:04 junghans

On Fedora 42, I am getting:

2025-04-18T17:17:18.8335607Z  8/14 Test  #8: ArborX_Test_Clustering ...................***Failed    0.05 sec
2025-04-18T17:17:18.8336125Z Kokkos::OpenMP::initialize WARNING: OMP_PROC_BIND environment variable not set
2025-04-18T17:17:18.8336630Z   In general, for best performance with OpenMP 4.0 or better set OMP_PROC_BIND=spread and OMP_PLACES=threads
2025-04-18T17:17:18.8337264Z   For best performance with OpenMP 3.1 set OMP_PROC_BIND=true
2025-04-18T17:17:18.8337562Z   For unit testing set OMP_PROC_BIND=false
2025-04-18T17:17:18.8337721Z 
2025-04-18T17:17:18.8337789Z Running 10 test cases...
2025-04-18T17:17:18.8339429Z /__w/ArborX.spec/ArborX.spec/ArborX-2.0-build/ArborX-2.0/test/tstDBSCAN.cpp(175): error: in "DBSCAN/dbscan<Kokkos__Device<Kokkos__OpenMP_ Kokkos__HostSpace>>": check verifyDBSCAN(space, points, r - (Coordinate)0.1, 2, dbscan(space, points, r - (Coordinate)0.1, 2, params)) has failed
2025-04-18T17:17:18.8340458Z 
2025-04-18T17:17:18.8340707Z *** 1 failure is detected in the test module "Master Test Suite"
2025-04-18T17:17:18.8341006Z
2025-04-18T17:17:18.8341086Z

(See https://github.com/junghans/ArborX.spec/actions/runs/14536418986/job/40785602994.) You can use https://github.com/junghans/ArborX.spec/blob/rawhide/.github/workflows/continuous-integration-workflow.yml to reproduce.
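
Once a build tree exists on an affected machine, the failing tests could be rerun in isolation. The following is only a sketch under assumptions: the build directory path is hypothetical, the helper names `ctest_command`, `unit_test_env` and `rerun` are made up here, and `ctest --test-dir` needs CMake 3.20 or newer. The test names and the `OMP_PROC_BIND=false` setting come from the logs in this thread.

```python
import os
import subprocess

# Test names taken from the CTest output in this thread.
FAILING_TESTS = [
    "ArborX_Test_Clustering",
    "ArborX_Test_DetailsClusteringHelpers",
    "ArborX_Test_SpecializedTraversals",
]


def ctest_command(build_dir, tests):
    """Build a ctest invocation that runs only the named tests."""
    regex = "^(" + "|".join(tests) + ")$"
    return ["ctest", "--test-dir", build_dir, "-R", regex, "--output-on-failure"]


def unit_test_env(base=None):
    """Copy the environment and set OMP_PROC_BIND=false, as the Kokkos warning suggests for unit testing."""
    env = dict(os.environ if base is None else base)
    env["OMP_PROC_BIND"] = "false"
    return env


def rerun(build_dir, tests=FAILING_TESTS):
    """Run the selected tests and return ctest's exit code."""
    result = subprocess.run(ctest_command(build_dir, tests), env=unit_test_env(), check=False)
    return result.returncode
```

For example, `rerun("build")` would run the three tests from a build tree in `./build` (a placeholder path) and print the output of any that fail.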

junghans avatar Apr 18 '25 17:04 junghans

In theory, this should be reproducible with the following steps:

docker run --privileged -it registry.fedoraproject.org/fedora:latest bash
# Now inside the docker
dnf -y install fedpkg
git clone https://github.com/junghans/ArborX.spec
dnf -y builddep ArborX.spec
spectool -g ArborX.spec
fedpkg srpm
mock -r fedora-rawhide-ppc64le --forcearch ppc64le --init
mock -r fedora-rawhide-ppc64le --forcearch ppc64le --no-clean ArborX-2.0-1.fc43.src.rpm
. /etc/profile.d/modules.sh
module avail
fedpkg local

Currently, mock -r fedora-rawhide-ppc64le --forcearch ppc64le --init fails with:

[157/164] Installing python-srpm-macros 100% |   1.4 MiB/s |  53.0 KiB |  00m00s
[158/164] Installing util-linux-0:2.40. 100% |  66.8 MiB/s |   6.7 MiB |  00m00s
>>> Running post-install scriptlet: util-linux-0:2.40.4-7.fc43.ppc64le          
>>> Non-critical error in post-install scriptlet: util-linux-0:2.40.4-7.fc43.ppc
>>> [RPM] %post(util-linux-2.40.4-7.fc43.ppc64le) scriptlet failed, exit status
[159/164] Installing which-0:2.23-1.fc4 100% |   5.1 MiB/s | 125.5 KiB |  00m00s
[160/164] Installing shadow-utils-2:4.1 100% |  37.8 MiB/s |   5.0 MiB |  00m00s
Error: call to ldconfig failed.
[161/164] Installing info-0:7.2-3.fc42. 100% | 560.3 KiB/s | 485.8 KiB |  00m01s
>>> Running trigger-install scriptlet: glibc-common-0:2.41.9000-10.fc43.ppc64le
>>> Non-critical error in trigger-install scriptlet: glibc-common-0:2.41.9000-10
>>> [RPM] lua script failed: [string "%transfiletriggerin(glibc-common-2.41.9000
>>> Running trigger-install scriptlet: info-0:7.2-3.fc42.ppc64le                
>>> Non-critical error in trigger-install scriptlet: info-0:7.2-3.fc42.ppc64le  
>>> [RPM] %transfiletriggerin(info-7.2-3.fc42.ppc64le) scriptlet failed, exit st
Transaction failed: Rpm transaction failed.

aprokop avatar Apr 19 '25 22:04 aprokop

There is a bug in the cross-compiling mode of mock (https://github.com/rpm-software-management/mock/issues/1570), but for aarch64 I can reproduce it natively using GitHub's arm runner with https://github.com/junghans/ArborX.spec/blob/rawhide/.github/workflows/continuous-integration-workflow.yml.

junghans avatar Apr 20 '25 00:04 junghans

> I can reproduce it natively using GitHub's arm runner with junghans/ArborX.spec@rawhide/.github/workflows/continuous-integration-workflow.yml

How can I use it for debugging? I need something I can run locally to test. Or, at least, set up a workflow against the ArborX repo.

aprokop avatar Apr 20 '25 16:04 aprokop

I can make that into a workflow; give me a day.

junghans avatar Apr 20 '25 18:04 junghans

See #1244

junghans avatar Apr 22 '25 19:04 junghans