cirrus-ci-docs
Recent container change broke ThreadSanitizer builds
Expected Behavior
C++ builds using ThreadSanitizer should complete correctly.
Real Behavior
ThreadSanitizer reports the following error when trying to run any binary:
FATAL: ThreadSanitizer: unexpected memory mapping 0x5bb456972000-0x5bb456973000
Related Info
This is a (tick one of the following):
- [ ] Website issue
- Link to page:
- [x] Task issue
- OS: Docker
- Task name: https://cirrus-ci.com/task/5051526650003456
The log for the task above shows the configure script failing because it thinks that the OpenSSL headers and library differ. Manual investigation using terminal mode shows CMake failing for the reason above. This failure started recently (in the last few weeks). It doesn't happen with Docker containers started from the same Dockerfile on other systems; it only happens to us on the Cirrus infra. It appears similar to https://github.com/golang/go/issues/59418, which was caused by a kernel issue (fixed in https://go-review.googlesource.com/c/build/+/482195).
Yeah, Cirrus CI is using Container-Optimized OS version 105 for the x86 and Arm containers. You can set the experimental: true flag on the task that is failing. That way it will temporarily run on the old infrastructure.
Let's see if the next version of Container-Optimized OS will fix the issues.
Same result with the experimental tag. uname -a on that build says this:
Linux cirrus-ci-task-6181902449639424 5.15.120+ #1 SMP Fri Jul 21 03:39:30 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Is that correct?
Here's the task configuration:
tsan_sanitizer_task:
  experimental: true
  container:
    # Just uses a recent/common distro to run memory error/leak checks.
    dockerfile: ci/ubuntu-22.04/Dockerfile
    << : *SANITIZERS_RESOURCE_TEMPLATE
  << : *CI_TEMPLATE
  << : *SKIP_TASK_ON_PR
  env:
    ZEEK_CI_CONFIGURE_FLAGS: *TSAN_SANITIZER_CONFIG
    ZEEK_CI_DISABLE_SCRIPT_PROFILING: 1
    # If this is defined directly in the environment, configure fails to find
    # OpenSSL. Instead we define it with a different name and then give it
    # the correct name in the testing scripts.
    ZEEK_TSAN_OPTIONS: suppressions=/zeek/ci/tsan_suppressions.txt
I tried with the experimental tag in the container block too but that failed the same way. https://cirrus-ci.com/task/5715854843707392 has the last failure.
Could you please try privileged: true for your container instance then? That way a dedicated VM will be used for running your task. It will be a bit slower to schedule, but the task will run on an Ubuntu VM.
privileged: true in the outer task block and with experimental removed?
tsan_sanitizer_task:
  privileged: true
  container:
    # Just uses a recent/common distro to run memory error/leak checks.
    dockerfile: ci/ubuntu-22.04/Dockerfile
    << : *SANITIZERS_RESOURCE_TEMPLATE
  << : *CI_TEMPLATE
  << : *SKIP_TASK_ON_PR
  env:
    ZEEK_CI_CONFIGURE_FLAGS: *TSAN_SANITIZER_CONFIG
    ZEEK_CI_DISABLE_SCRIPT_PROFILING: 1
    # If this is defined directly in the environment, configure fails to find
    # OpenSSL. Instead we define it with a different name and then give it
    # the correct name in the testing scripts.
    ZEEK_TSAN_OPTIONS: suppressions=/zeek/ci/tsan_suppressions.txt
That gets me through the configure step, but the build fails for the same reason when it tries to run a binary as part of the build:
[ 15%] [BIFCL] Processing /zeek/auxil/zeek-af_packet-plugin/src/af_packet.bif
FATAL: ThreadSanitizer: unexpected memory mapping 0x5ad5e496e000-0x5ad5e4973000
https://cirrus-ci.com/task/5992372387971072?logs=build#L967
I re-ran the build this morning to double-check something and it failed during configure again.
If it still fails with ThreadSanitizer, then it might not be an issue with COS version 105. I found another old report of a similar issue, https://github.com/google/sanitizers/issues/806, where the problem was an old version of gcc.
If you have an x86 host with docker you might try to reproduce the issue using gcr.io/cirrus-ci-community/zeek/zeek/ci/ubuntu-2204/dockerfile:dae6979fc92dcba631e38ce7cf2335a7 container that is used in CI.
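For anyone trying this locally, a minimal sketch of what that could look like is below. The image tag is copied from the comment above; the helper name `repro_in_image` and the `DOCKER` override variable are made up for illustration. It assumes the image is pullable from your host and has gcc with libtsan installed, which I have not verified.

```shell
# Image tag copied verbatim from the comment above.
IMAGE='gcr.io/cirrus-ci-community/zeek/zeek/ci/ubuntu-2204/dockerfile:dae6979fc92dcba631e38ce7cf2335a7'

# Hypothetical override so the container CLI can be swapped (e.g. for podman).
DOCKER=${DOCKER:-docker}

# Hypothetical helper: build and run a trivial TSan-instrumented program
# inside the CI image, then print the exit status of the last step.
repro_in_image() {
    "$DOCKER" run --rm "$IMAGE" sh -c \
        'echo "int main(void){return 0;}" | gcc -pie -fPIE -fsanitize=thread -xc - -o /tmp/repro && /tmp/repro; echo "exit status: $?"'
}
```

Calling `repro_in_image` on a local x86 host should show whether the same image fails outside of the Cirrus infra.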
I've tried it with both gcc 11 (Ubuntu 22) and gcc 12 (Ubuntu 23), so I don't think an old gcc version is the cause.
I'll see if I can scrounge up an old system to test it with.
We are also running into this. It should be trivial to reproduce with: echo 'void main(void){}' | gcc -pie -fPIE -fsanitize=thread -xc - -ltsan && ./a.out:
FATAL: ThreadSanitizer: unexpected memory mapping 0x56ce963d3000-0x56ce963d4000
Exit status: 66
See https://cirrus-ci.com/task/6173534590861312?logs=test#L2
Using gcc-13 from Ubuntu 23.10 (beta).
I understand that this is likely possible to fix by using a full GCE VM, but it would be nice if tsan in containers was supported again on Cirrus CI, like before.
I checked for https://github.com/google/sanitizers/issues/877#issuecomment-343644727 but that didn't seem to be the cause here either.
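In case it helps others check a given runner, here is a rough sketch that wraps the one-liner above in a script and reports whether the failure is this specific mapping error. The function names `check_tsan_output` and `run_tsan_repro` are made up for illustration, and it assumes gcc with libtsan is installed. It uses `int main` rather than `void main` and relies on `-fsanitize=thread` to link the runtime, which should be equivalent for this purpose.

```shell
# Hypothetical helper: classify the output of a TSan-instrumented binary.
# Prints "mapping-error" if the text contains the fatal message from this
# issue, and "ok" otherwise.
check_tsan_output() {
    if printf '%s\n' "$1" | grep -q 'FATAL: ThreadSanitizer: unexpected memory mapping'; then
        echo "mapping-error"
    else
        echo "ok"
    fi
}

# Hypothetical helper: build the trivial reproducer in a temp directory,
# run it, and classify what it printed.
run_tsan_repro() {
    command -v gcc >/dev/null 2>&1 || { echo "gcc-missing"; return 0; }
    dir=$(mktemp -d) || return 1
    printf 'int main(void){return 0;}\n' > "$dir/repro.c"
    if gcc -pie -fPIE -fsanitize=thread "$dir/repro.c" -o "$dir/repro" 2>/dev/null; then
        out=$("$dir/repro" 2>&1)
        check_tsan_output "$out"
    else
        echo "build-failed"
    fi
    rm -rf "$dir"
}
```

Running `run_tsan_repro` in a task script prints one of `mapping-error`, `ok`, `build-failed`, or `gcc-missing`, which makes it easy to compare container instances against a full VM.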
I just wanted to check in and note that this is still broken.
As a temporary workaround, I think clang-18 from Ubuntu Noble 24.04 may work, instead of gcc.
We ended up moving to Ubuntu 24 as well, which resolved our problems.