[Bazel CI] library tests are failing with Bazel@HEAD
CI: https://buildkite.com/bazel/bazel-at-head-plus-downstream/builds/3841#018f5b83-19fe-48b2-b301-77acb6e1c285
Platform: Ubuntu
Logs:
FAILED: //tests/library-empty:library-empty (Summary)
FAILED: //tests/library-with-static-cc-dep:library-with-static-cc-dep-dynamic (Summary)
Culprit: https://github.com/bazelbuild/bazel/commit/2482322fedb463e0bbecd29c5e8e6d0f087ed884
CC Greenteam @mai93 @Wyverald
CC @aherrmann @mboes
The error message is a bit puzzling:
==================== Test output for //tests/indirect-link:indirect-link-dynamic:
/var/lib/buildkite-agent/.cache/bazel/_bazel_buildkite-agent/2aebe93b1a8f9ac29a2a4c83872246a4/sandbox/linux-sandbox/1928/execroot/rules_haskell_tests/bazel-out/k8-fastbuild/bin/tests/indirect-link/indirect-link-dynamic.runfiles/rules_haskell_tests/tests/indirect-link/indirect-link-dynamic: error while loading shared libraries: libHSrts-1.0.2_thr-ghc9.4.6.so: cannot open shared object file: No such file or directory
================================================================================
==================== Test output for //tests/indirect-link:indirect-link-dynamic:
/var/lib/buildkite-agent/.cache/bazel/_bazel_buildkite-agent/2aebe93b1a8f9ac29a2a4c83872246a4/sandbox/linux-sandbox/1939/execroot/rules_haskell_tests/bazel-out/k8-fastbuild/bin/tests/indirect-link/indirect-link-dynamic.runfiles/rules_haskell_tests/tests/indirect-link/indirect-link-dynamic: error while loading shared libraries: libHSrts-1.0.2_thr-ghc9.4.6.so: cannot open shared object file: No such file or directory
================================================================================
==================== Test output for //tests/indirect-link:indirect-link-dynamic:
/var/lib/buildkite-agent/.cache/bazel/_bazel_buildkite-agent/2aebe93b1a8f9ac29a2a4c83872246a4/sandbox/linux-sandbox/1955/execroot/rules_haskell_tests/bazel-out/k8-fastbuild/bin/tests/indirect-link/indirect-link-dynamic.runfiles/rules_haskell_tests/tests/indirect-link/indirect-link-dynamic: error while loading shared libraries: libHSrts-1.0.2_thr-ghc9.4.6.so: cannot open shared object file: No such file or directory
================================================================================
IIUC the effective change of https://github.com/bazelbuild/bazel/commit/2482322fedb463e0bbecd29c5e8e6d0f087ed884 is to turn
-lstdc++ -lm
...
-Wl,-no-as-needed -no-as-needed
into
-Wl,--push-state,-as-needed -lstdc++ -Wl,--pop-state
-Wl,--push-state,-as-needed -lm -Wl,--pop-state
...
-Wl,-no-as-needed
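To make the before/after above concrete, here is a minimal sketch of that rewrite. It is not Bazel's actual implementation: the helper names `wrap_as_needed` and `rewrite_link_flags` are hypothetical, and it assumes that only plain `-l<name>` flags get wrapped while everything else passes through unchanged.

```python
def wrap_as_needed(flag: str) -> str:
    """Scope -as-needed to a single library flag via push-state/pop-state."""
    return f"-Wl,--push-state,-as-needed {flag} -Wl,--pop-state"


def rewrite_link_flags(flags):
    """Hypothetical model of the change: wrap each -l<name> flag, keep the rest."""
    return [wrap_as_needed(f) if f.startswith("-l") else f for f in flags]


# Flags as they looked before the commit (the "..." part is left out).
before = ["-lstdc++", "-lm", "-Wl,-no-as-needed"]
after = rewrite_link_flags(before)

for line in after:
    print(line)
```

If this reading is right, the `-as-needed` setting no longer leaks past each wrapped library, because `--pop-state` restores the linker state that was active before `--push-state`.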
@avdv is this something you could look into?
I tried to reproduce it inside of a Docker container (gcr.io/bazel-public/ubuntu1804-java11:latest) but could not:
# export USE_BAZEL_VERSION=last_green
# bazel --version
2024/05/15 06:58:59 Using unreleased version at commit 88a230f4cf28deec1455cb2caed4dc9f81e108c9
2024/05/15 06:58:59 Downloading https://storage.googleapis.com/bazel-builds/artifacts/centos7/88a230f4cf28deec1455cb2caed4dc9f81e108c9/bazel...
Downloading: 70 MB out of 70 MB (100%)
bazel no_version
# git show
git show
commit cbf57268dc222a5867fe2a578f0eed06875405ee (HEAD -> check_head_bazel)
Merge: 5e8a6bc2 b525e7bc
Author: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Date: Mon May 6 10:58:07 2024 +0000
Merge pull request #2183 from tweag/cb/fix-default-ghc-snapshot
rules_haskell_tests: Fix ghcide stack pointing to wrong snapshot file
# bazel build --show_progress_rate_limit=5 --curses=yes --color=yes --terminal_columns=143 --show_timestamps --verbose_failures --jobs=30 --announce_rc --experimental_repository_cache_hardlinks --disk_cache= --sandbox_tmpfs_path=/tmp --config=ci-common --config=linux-bindist --build_tag_filters=-requires_nix,-requires_lz4,-requires_shellcheck,-requires_threaded_rts,-dont_test_with_bindist,-dont_test_on_bazelci,-integration --test_env=HOME --test_env=BAZELISK_USER_AGENT --test_env=USE_BAZEL_VERSION --lockfile_mode=off -- //tests/...
Extracting Bazel installation...
...
(08:08:31) INFO: Build completed successfully, 798 total actions
# bazel test --flaky_test_attempts=3 --build_tests_only --local_test_jobs=12 --show_progress_rate_limit=5 --curses=yes --color=yes --terminal_columns=143 --show_timestamps --verbose_failures --jobs=30 --announce_rc --experimental_repository_cache_hardlinks --disk_cache= --sandbox_tmpfs_path=/tmp --experimental_build_event_json_file_path_conversion=false --config=ci-common --config=linux-bindist --test_tag_filters=-requires_nix,-requires_lz4,-requires_shellcheck,-requires_threaded_rts,-dont_test_with_bindist,-dont_test_on_bazelci,-integration --test_env=HOME --test_env=BAZELISK_USER_AGENT --test_env=USE_BAZEL_VERSION --lockfile_mode=off -- //tests/...
...
Executed 138 out of 138 tests: 138 tests pass.
Culprit: https://github.com/bazelbuild/bazel/commit/2482322fedb463e0bbecd29c5e8e6d0f087ed884
@sgowroji Why do you think this is the problem here? I would expect to see some linker errors if the standard C++ lib / libm is missing, but we actually see this:
/var/lib/buildkite-agent/.cache/bazel/_bazel_buildkite-agent/2aebe93b1a8f9ac29a2a4c83872246a4/sandbox/linux-sandbox/1744/execroot/rules_haskell_tests/bazel-out/k8-fastbuild/bin/tests/binary-with-lib-dynamic/binary-with-lib-dynamic.runfiles/rules_haskell_tests/tests/binary-with-lib-dynamic/binary-with-lib-dynamic: error while loading shared libraries: libHSrts-1.0.2_thr-ghc9.4.6.so: cannot open shared object file: No such file or directory
/var/lib/buildkite-agent/.cache/bazel/_bazel_buildkite-agent/2aebe93b1a8f9ac29a2a4c83872246a4/sandbox/linux-sandbox/1773/execroot/rules_haskell_tests/bazel-out/k8-fastbuild/bin/tests/library-empty/library-empty.runfiles/rules_haskell_tests/tests/library-empty/library-empty: error while loading shared libraries: libHSrts-1.0.2_thr-ghc9.4.6.so: cannot open shared object file: No such file or directory
This file should be available in the rules_haskell_ghc_linux_amd64 repository:
# ldd bazel-ci-bin/tests/library-empty/library-empty
linux-vdso.so.1 (0x00007ffe9b0a6000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f742006c000)
libHSbase-4.17.2.0-ghc9.4.6.so => /rules_haskell/rules_haskell_tests/bazel-ci-bin/tests/library-empty/../../_solib_k8/external_Srules_Uhaskell_Ughc_Ulinux_Uamd64/libHSbase-4.17.2.0-ghc9.4.6.so (0x00007f741f5dd000)
libHSghc-bignum-1.3-ghc9.4.6.so => /rules_haskell/rules_haskell_tests/bazel-ci-bin/tests/library-empty/../../_solib_k8/external_Srules_Uhaskell_Ughc_Ulinux_Uamd64/libHSghc-bignum-1.3-ghc9.4.6.so (0x00007f74205d4000)
libHSghc-prim-0.9.1-ghc9.4.6.so => /rules_haskell/rules_haskell_tests/bazel-ci-bin/tests/library-empty/../../_solib_k8/external_Srules_Uhaskell_Ughc_Ulinux_Uamd64/libHSghc-prim-0.9.1-ghc9.4.6.so (0x00007f741f0e2000)
libHSrts-1.0.2_thr-ghc9.4.6.so => /root/.cache/bazel/_bazel_root/16f1ee034ef5b66efb3b6cb7da500a21/execroot/rules_haskell_tests/external/rules_haskell_ghc_linux_amd64/bin/../lib/lib/../lib/x86_64-linux-ghc-9.4.6/libHSrts-1.0.2_thr-ghc9.4.6.so (0x00007f7420529000)
libffi.so.8 => /rules_haskell/rules_haskell_tests/bazel-ci-bin/tests/library-empty/../../_solib_k8/external_Srules_Uhaskell_Ughc_Ulinux_Uamd64/libffi.so.8 (0x00007f741eed4000)
libgmp.so.10 => /usr/lib/x86_64-linux-gnu/libgmp.so.10 (0x00007f741ec53000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f741e862000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f741e65a000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f741e456000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f741e237000)
/lib64/ld-linux-x86-64.so.2 (0x00007f742040a000)
Culprit: bazelbuild/bazel@2482322
@sgowroji Why do you think this is the problem here?
Just FYI: the culprit commit is identified by an automatic bisect tool, and is just the first commit that started failing. So it's not always accurate (i.e. not necessarily the commit that caused the current failure).
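As a rough illustration of why the reported culprit can mislead, here is a sketch of a first-failing-commit search. It assumes a plain binary search over an ordered commit list, which may not be how the Bazel CI bisect tool actually works; `first_failing`, the commit names and the failing set are all made up for the example.

```python
def first_failing(commits, is_failing):
    """Return the first commit for which is_failing(commit) is True.

    Assumes the history is monotonic: every commit before the result
    passes and every commit from the result onwards fails.
    """
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_failing(commits[mid]):
            hi = mid  # the first failure is at mid or earlier
        else:
            lo = mid + 1  # the first failure is after mid
    return commits[lo]


# Hypothetical history in which the test starts failing at c3.
commits = ["c1", "c2", "c3", "c4", "c5"]
failing = {"c3", "c4", "c5"}

culprit = first_failing(commits, lambda c: c in failing)
print(culprit)
```

The search only reports where the test result flips. If the failure depends on something outside the commit, such as the state of the CI machine, the reported commit need not be the cause.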
I debugged this here: https://buildkite.com/bazel/rules-haskell-haskell/builds/2005
Basically it works on ubuntu2004, but fails on ubuntu1804.
The binaries for the //tests/library-empty:library-empty target seem to be mostly identical:
# ubuntu 1804
$ readelf -d bazel-ci-bin/tests/library-empty/library-empty
Dynamic section at offset 0x1cc0 contains 38 entries:
Tag Type Name/Value
0x0000000000000003 (PLTGOT) 0x2fd0
0x0000000000000002 (PLTRELSZ) 72 (bytes)
0x0000000000000017 (JMPREL) 0xc18
0x0000000000000014 (PLTREL) RELA
0x0000000000000007 (RELA) 0x948
0x0000000000000008 (RELASZ) 720 (bytes)
0x0000000000000009 (RELAENT) 24 (bytes)
0x000000006ffffff9 (RELACOUNT) 10
0x0000000000000015 (DEBUG) 0x0
0x0000000000000006 (SYMTAB) 0x298
0x000000000000000b (SYMENT) 24 (bytes)
0x0000000000000005 (STRTAB) 0x4f0
0x000000000000000a (STRSZ) 964 (bytes)
0x000000006ffffef5 (GNU_HASH) 0x8b8
0x0000000000000001 (NEEDED) Shared library: [libm.so.6]
0x0000000000000001 (NEEDED) Shared library: [libHSbase-4.17.2.0-ghc9.4.6.so]
0x0000000000000001 (NEEDED) Shared library: [libHSghc-bignum-1.3-ghc9.4.6.so]
0x0000000000000001 (NEEDED) Shared library: [libHSghc-prim-0.9.1-ghc9.4.6.so]
0x0000000000000001 (NEEDED) Shared library: [libHSrts-1.0.2_thr-ghc9.4.6.so]
0x0000000000000001 (NEEDED) Shared library: [libffi.so.8]
0x0000000000000001 (NEEDED) Shared library: [libgmp.so.10]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x0000000000000001 (NEEDED) Shared library: [librt.so.1]
0x0000000000000001 (NEEDED) Shared library: [libdl.so.2]
0x0000000000000001 (NEEDED) Shared library: [libpthread.so.0]
0x000000000000000c (INIT) 0xc60
0x000000000000000d (FINI) 0x1014
0x000000000000001a (FINI_ARRAY) 0x2cb0
0x000000000000001c (FINI_ARRAYSZ) 8 (bytes)
0x0000000000000019 (INIT_ARRAY) 0x2cb8
0x000000000000001b (INIT_ARRAYSZ) 8 (bytes)
0x000000000000001d (RUNPATH) Library runpath: [$ORIGIN/../../_solib_k8/external_Srules_Uhaskell_Ughc_Ulinux_Uamd64:/var/lib/buildkite-agent/.cache/bazel/_bazel_buildkite-agent/2efebdb1cb14e111ec858014b0a163bd/execroot/rules_haskell_tests/external/rules_haskell_ghc_linux_amd64/bin/../lib/lib/../lib/x86_64-linux-ghc-9.4.6]
0x000000000000001e (FLAGS) BIND_NOW
0x000000006ffffffb (FLAGS_1) Flags: NOW
0x000000006ffffff0 (VERSYM) 0x8f0
0x000000006ffffffe (VERNEED) 0x924
0x000000006fffffff (VERNEEDNUM) 1
0x0000000000000000 (NULL) 0x0
# ubuntu 2004
$ readelf -d bazel-ci-bin/tests/library-empty/library-empty
Dynamic section at offset 0x1cc0 contains 38 entries:
Tag Type Name/Value
0x0000000000000003 (PLTGOT) 0x2fd0
0x0000000000000002 (PLTRELSZ) 72 (bytes)
0x0000000000000017 (JMPREL) 0xbc8
0x0000000000000014 (PLTREL) RELA
0x0000000000000007 (RELA) 0x8f8
0x0000000000000008 (RELASZ) 720 (bytes)
0x0000000000000009 (RELAENT) 24 (bytes)
0x000000006ffffff9 (RELACOUNT) 10
0x0000000000000015 (DEBUG) 0x0
0x0000000000000006 (SYMTAB) 0x298
0x000000000000000b (SYMENT) 24 (bytes)
0x0000000000000005 (STRTAB) 0x4c0
0x000000000000000a (STRSZ) 952 (bytes)
0x000000006ffffef5 (GNU_HASH) 0x878
0x0000000000000001 (NEEDED) Shared library: [libm.so.6]
0x0000000000000001 (NEEDED) Shared library: [libHSbase-4.17.2.0-ghc9.4.6.so]
0x0000000000000001 (NEEDED) Shared library: [libHSghc-bignum-1.3-ghc9.4.6.so]
0x0000000000000001 (NEEDED) Shared library: [libHSghc-prim-0.9.1-ghc9.4.6.so]
0x0000000000000001 (NEEDED) Shared library: [libHSrts-1.0.2_thr-ghc9.4.6.so]
0x0000000000000001 (NEEDED) Shared library: [libffi.so.8]
0x0000000000000001 (NEEDED) Shared library: [libgmp.so.10]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x0000000000000001 (NEEDED) Shared library: [librt.so.1]
0x0000000000000001 (NEEDED) Shared library: [libdl.so.2]
0x0000000000000001 (NEEDED) Shared library: [libpthread.so.0]
0x000000000000000c (INIT) 0xc10
0x000000000000000d (FINI) 0xfb8
0x000000000000001a (FINI_ARRAY) 0x2cb0
0x000000000000001c (FINI_ARRAYSZ) 8 (bytes)
0x0000000000000019 (INIT_ARRAY) 0x2cb8
0x000000000000001b (INIT_ARRAYSZ) 8 (bytes)
0x000000000000001d (RUNPATH) Library runpath: [$ORIGIN/../../_solib_k8/external_Srules_Uhaskell_Ughc_Ulinux_Uamd64:/var/lib/buildkite-agent/.cache/bazel/_bazel_buildkite-agent/b29c7eb0e49d1926dfb75bf93b64c07e/execroot/rules_haskell_tests/external/rules_haskell_ghc_linux_amd64/bin/../lib/lib/../lib/x86_64-linux-ghc-9.4.6]
0x000000000000001e (FLAGS) BIND_NOW
0x000000006ffffffb (FLAGS_1) Flags: NOW
0x000000006ffffff0 (VERSYM) 0x8a8
0x000000006ffffffe (VERNEED) 0x8d8
0x000000006fffffff (VERNEEDNUM) 1
0x0000000000000000 (NULL) 0x0
But the RUNPATH entries are different. Lo and behold, the path from the first run does not exist:
$ ls -lh $( readelf -d bazel-ci-bin/tests/library-empty/library-empty | sed -ne '/RUNPATH/s,.*[$]ORIGIN.*:\([^]]*\)\].*,\1,p' )
ls: cannot access '/var/lib/buildkite-agent/.cache/bazel/_bazel_buildkite-agent/2efebdb1cb14e111ec858014b0a163bd/execroot/rules_haskell_tests/external/rules_haskell_ghc_linux_amd64/bin/../lib/lib/../lib/x86_64-linux-ghc-9.4.6': No such file or directory
So that looks like a caching issue, right?
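The same check as the `readelf | sed | ls` pipeline can be sketched in Python, which makes it easier to run over many binaries. This is only a sketch: the function names are hypothetical, the sample line uses a made-up path rather than the real output base, and it assumes the `readelf -d` line format shown above.

```python
import os
import re


def runpath_entries(readelf_output: str):
    """Return the RUNPATH directories listed in `readelf -d` output."""
    entries = []
    for line in readelf_output.splitlines():
        match = re.search(r"\(RUNPATH\)\s+Library runpath: \[(.*)\]", line)
        if match:
            entries.extend(match.group(1).split(":"))
    return entries


def missing_absolute_entries(entries):
    """Absolute RUNPATH entries that do not exist on this machine.

    $ORIGIN-relative entries are skipped because they can only be
    resolved relative to the location of the binary itself.
    """
    return [e for e in entries if e.startswith("/") and not os.path.isdir(e)]


# Made-up sample in the same shape as the RUNPATH lines above.
sample = (
    " 0x000000000000001d (RUNPATH) Library runpath: "
    "[$ORIGIN/../../_solib_k8/example:/nonexistent/output_base/execroot/lib]"
)

entries = runpath_entries(sample)
missing = missing_absolute_entries(entries)
print(missing)
```

Any absolute entry reported here points into an output base that is not present on the machine running the test, which is the situation the `ls` error above shows.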
Hi @avdv, any update on the above issue?
I have created a PR that should fix the problem: https://github.com/tweag/rules_haskell/pull/2202
Seems like today's run also failed:
error while loading shared libraries: libffi.so.8: cannot open shared object file: No such file or directory
I pushed another fix that hopefully takes care of that.
\edit: It's green again :tada: