backtrace-rs

Backtrace through signal handlers on alpine

Open r1viollet opened this issue 10 months ago • 13 comments

Description

When using the backtrace::resolve_frame_unsynchronized API, unwinding does not seem to go through signal handlers. Here is what I get when using the API:

Starting backtrace-rs unwinding...
Frame: IP: 0x58951a886bc6, Function: backtrace::backtrace::libunwind::trace::hd9a0af93696308ae, File: "/root/.cargo/git/checkouts/backtrace-rs-fb1f822361417489/f8cc6ac/src/backtrace/libunwind.rs", Line: 116
Frame: IP: 0x58951a886bc6, Function: backtrace::backtrace::trace_unsynchronized::hac1bf9acb19bced3, File: "/root/.cargo/git/checkouts/backtrace-rs-fb1f822361417489/f8cc6ac/src/backtrace/mod.rs", Line: 66
Frame: IP: 0x58951a887eb6, Function: unwind_example::unwind_with_backtrace::h8ed6b6267026e92f, File: "/opt/libdatadog/examples/rust/src/main.rs", Line: 101
Frame: IP: 0x58951a888034, Function: unwind_example::crash_handler::hd3592dcebb8db214, File: "/opt/libdatadog/examples/rust/src/main.rs", Line: 145
Frame: IP: 0x754eafdcf5a4, Function: sigwaitinfo, File: "/home/buildozer/aports/main/musl/src/musl-1.2.5/src/signal/x86_64/restore.s", Line: 1

As you can see, I get stuck on sigwaitinfo and do not get up to the main function. I wrote a small comparison with libunwind, where I successfully unwind through the signal handler and reach the main function.

The issue could be within the miri API. I did not dive into what was done within this function.

Reproducer

Here is the source code for my example. I ran it on an alpine image where I installed libunwind. It might be difficult to run, so please reach out if you want more precise instructions.

LD_PRELOAD=/usr/lib/libgcc_s.so ./target/debug/unwind_example

Thanks for considering 🙇

r1viollet avatar Jan 24 '25 16:01 r1viollet

Please describe exactly how you built the code. Exactly which rustc, installed from where, using what build configuration?

workingjubilee avatar Jan 27 '25 02:01 workingjubilee

Here is a dockerfile using a recent alpine.

ARG BASE_IMAGE="alpine:3.21.2"
FROM ${BASE_IMAGE} AS base

RUN apk update \
  && apk add --no-cache \
    build-base \
    cargo \
    cmake \
    curl \
    git \
    make \
    patchelf \
    protoc \
    pkgconf \
    unzip \
    bash \
  && mkdir /usr/local/src

# Install libunwind
RUN apk add automake autoconf libtool
RUN wget https://github.com/libunwind/libunwind/releases/download/v1.8.1/libunwind-1.8.1.tar.gz \
  && tar -xvf libunwind-1.8.1.tar.gz \
  && cd libunwind-1.8.1 \
  && autoreconf -i ./ \
  && ./configure CFLAGS="-g -O3" --disable-tests \
  && make -j 8 \
  && make install

Then you can copy the example I mentioned above and build it (nothing special here).

cargo build --target-dir target-alpine

Then I had to run it with libgcc_s preloaded. Somehow the build script was not linking libgcc_s.

LD_PRELOAD=/usr/lib/libgcc_s.so ./target-alpine/debug/unwind_example

If I understand correctly, both use libunwind. What version / distribution of libunwind is packaged into backtrace-rs?

r1viollet avatar Jan 27 '25 08:01 r1viollet

This was also visible on older alpine versions, using 0.3.74. The example uses 0.3.75, as can be seen in the example cargo files:

backtrace = { git = "https://github.com/rust-lang/backtrace-rs", tag = "0.3.75" }

r1viollet avatar Jan 27 '25 08:01 r1viollet

I wonder if the difference could come from the fact that libunwind is pulled from ubuntu, whereas I recompile libunwind on alpine.

r1viollet avatar Jan 27 '25 09:01 r1viollet

What does ldd target-alpine/debug/unwind_example show? Does it link against libunwind or libgcc_s? The former should be used when producing a statically linked executable while the latter should be used when producing a dynamically linked executable, but I don't know how exactly Alpine patches rustc. Just that they patch it to do dynamic linking by default for the musl targets.

bjorn3 avatar Jan 27 '25 18:01 bjorn3

So I had to split the examples as I had mixed C unwinding with libunwind along with backtrace-rs unwinding. New examples are here.

With the new setup, we have separate binaries that have backtrace unwinding on one side:

cargo build --bin unwind_backtrace --target-dir ./target-alpine-full --verbose
ldd ./target-alpine-full/debug/unwind_backtrace
	/lib/ld-musl-x86_64.so.1 (0x7d778c4fd000)
	libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x7d778c361000)
	libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7d778c4fd000)

libunwind is statically linked, but I still get a dynamic link to libgcc.

And I compile the C unwinding by dynamically linking to libunwind (I had to adjust the build script in the example above, un-commenting the lines that do the linking to libunwind):

cargo build --bin unwind_c --target-dir ./target-alpine-full --verbose
ldd ./target-alpine-full/debug/unwind_c
	/lib/ld-musl-x86_64.so.1 (0x71b106053000)
	libunwind.so.8 => /usr/local/lib/libunwind.so.8 (0x71b105fdc000)
	libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x71b105fb0000)
	libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x71b106053000)

As mentioned previously, using libunwind manually, I'm able to get from the crash handler up to main (even if the symbols are not great):

./target-alpine-full/debug/unwind_c
Running unwind_c example...
Crash detected! Unwinding stack...
Function: _ZN8unwind_c13crash_handler13crash_handler17hdf05f74edd9802c0E+0x5e
Function: <unknown>
Function: _ZN3std2rt10lang_start17h6d13f624c8f892faE+0x3a
Function: main+0x1e
Function: <unknown>
Function: <unknown>
Function: <unknown>
Function: <unknown>
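The mangled names in the output above use the legacy `_ZN...E` scheme, where each path segment is prefixed by its length. As a rough illustration only, here is a minimal sketch that splits those segments. `demangle_legacy` is a hypothetical helper written for this thread, not part of backtrace-rs or libunwind, and it does not decode `$LT$`-style escapes or the newer `_R` (v0) mangling.

```rust
/// Minimal sketch of a legacy (`_ZN...E`) Rust symbol demangler.
/// Hypothetical helper: it only splits the length-prefixed path segments.
fn demangle_legacy(symbol: &str) -> Option<String> {
    // Drop an optional "+0x.." offset suffix as printed by the unwinder.
    let name = symbol.split('+').next()?;
    let mut rest = name.strip_prefix("_ZN")?.strip_suffix('E')?;
    let mut segments = Vec::new();
    while !rest.is_empty() {
        // Each segment is "<decimal length><that many bytes>".
        let digits = rest.bytes().take_while(|b| b.is_ascii_digit()).count();
        let len: usize = rest.get(..digits)?.parse().ok()?;
        let end = digits.checked_add(len)?;
        segments.push(rest.get(digits..end)?);
        rest = &rest[end..];
    }
    Some(segments.join("::"))
}

fn main() {
    let frames = [
        "_ZN8unwind_c13crash_handler13crash_handler17hdf05f74edd9802c0E+0x5e",
        "_ZN3std2rt10lang_start17h6d13f624c8f892faE+0x3a",
        "main+0x1e",
    ];
    for frame in frames {
        // Fall back to the raw text when the name is not a legacy Rust symbol.
        let shown = demangle_legacy(frame).unwrap_or_else(|| frame.to_string());
        println!("Function: {}", shown);
    }
}
```

Names that do not match the scheme (such as `main+0x1e` or `<unknown>`) are returned as `None`, so the caller can print them unchanged.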

r1viollet avatar Jan 28 '25 09:01 r1viollet

Why are you trying to link against libunwind manually when rustc already links your program against libgcc_s? libunwind and libgcc_s are both exporting the same _Unwind_* symbols. There is no guarantee that the dynamic linker will not mix symbols provided by both shared libraries, which would cause UB.
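To make the concern concrete, here is a small sketch that scans `ldd` output for more than one library that provides `_Unwind_*` symbols. `linked_unwinders` and the prefix list are assumptions made for illustration (the list is not exhaustive), and the sample text is the `ldd` output for unwind_c quoted earlier in this thread.

```rust
/// Library name prefixes treated as unwinder runtimes.
/// Assumption for this sketch: only these two are considered.
const UNWINDER_PREFIXES: [&str; 2] = ["libunwind.so", "libgcc_s.so"];

/// Hypothetical helper: returns the unwinder libraries named in `ldd` output.
fn linked_unwinders(ldd_output: &str) -> Vec<&str> {
    ldd_output
        .lines()
        // The library name is the first whitespace-separated token on a line.
        .filter_map(|line| line.split_whitespace().next())
        .filter(|name| UNWINDER_PREFIXES.iter().any(|p| name.starts_with(p)))
        .collect()
}

fn main() {
    let ldd_output = "\
\t/lib/ld-musl-x86_64.so.1 (0x71b106053000)
\tlibunwind.so.8 => /usr/local/lib/libunwind.so.8 (0x71b105fdc000)
\tlibgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x71b105fb0000)
\tlibc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x71b106053000)";

    let found = linked_unwinders(ldd_output);
    if found.len() > 1 {
        println!("warning: multiple unwinders linked: {:?}", found);
    }
}
```

This only detects dynamically linked libraries. A statically linked unwinder would not show up in `ldd` output.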

bjorn3 avatar Jan 28 '25 10:01 bjorn3

So I think I confused you with my setup.

  • unwind_c: I do things manually to compare the unwinding, I link manually (and I do not need the backtrace-rs dependency).
  • unwind_backtrace: I do not do anything manually. I just pull the backtrace-rs dependency. This is the example that shows we do not unwind until the main. And I think the setup makes sense.

r1viollet avatar Jan 28 '25 10:01 r1viollet

At https://github.com/DataDog/libdatadog/commit/474b67b7727b47a1a8a704ee55c4754fbc2bc150#diff-ffb20d4b2b21cbbfd14872f5b864628343f71b89b55817525c08fad31ea3ae13R28-R29 you seem to try linking against both libunwind and libgcc_s. However, I just saw that at https://github.com/DataDog/libdatadog/commit/474b67b7727b47a1a8a704ee55c4754fbc2bc150#diff-ffb20d4b2b21cbbfd14872f5b864628343f71b89b55817525c08fad31ea3ae13R21 you are reading CARGO_BIN_NAME, which is not set for build scripts afaik. As such I don't get how you are linking to libunwind. The backtrace crate doesn't instruct rustc to link against libunwind either. The decision to link libgcc_s or libunwind is made by the standard library, which should only pick one or the other.

bjorn3 avatar Jan 28 '25 10:01 bjorn3

So unwind_c was essentially there to investigate the issue. You can remove the build.rs (and everything else that relates to the C reproducer) to focus on the example with backtrace-rs. I was counting on cargo to do the build magic (setting CARGO_BIN_NAME); however, that did not happen, so I forced it manually. Here is my build command:

CARGO_BIN_NAME="unwind_c" cargo build --bin unwind_c --target-dir ./target-alpine-full --verbose

And for the backtrace example, the linker logic does not need to apply:

cargo build --bin unwind_backtrace --target-dir ./target-alpine-full --verbose

libunwind requires libgcc_s from what I noticed building the C example.

r1viollet avatar Jan 28 '25 13:01 r1viollet

I think this issue explains the unwinding failure and has a good example using std functions.

r1viollet avatar Jan 29 '25 09:01 r1viollet

So the summary of the issue is that:

  • backtrace-rs does not explicitly use libunwind APIs (hence we do not have access to the step API). It would be hard to adjust the unwinding behaviour from backtrace-rs.
  • No CFI (call frame information) is provided to get through the signal handler on musl, which causes the failure. The issue mentioned above is relevant to this.
  • Using libunwind to manually unwind, we can get through the signal handler. There is a cool article here about this. It mentions the frame pointer fallback. Side note: colleagues mentioned that with libunwind's step function we can also use the context to force the unwinding to start from the signal's context.
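For readers unfamiliar with the frame pointer fallback mentioned above, here is a toy sketch of the idea over simulated memory. It assumes the x86_64 convention where `[fp]` holds the caller's frame pointer and `[fp + 8]` holds the return address. `Memory` and `walk_frame_pointers` are hypothetical names for this illustration and are not how libunwind or backtrace-rs implement it.

```rust
use std::collections::HashMap;

/// Toy stack memory: address -> 8-byte word.
/// A real unwinder would read the process's own stack instead.
type Memory = HashMap<u64, u64>;

/// Walks the frame-pointer chain, assuming `[fp]` is the caller's frame
/// pointer and `[fp + 8]` is the return address (x86_64 convention).
fn walk_frame_pointers(mem: &Memory, ip: u64, mut fp: u64, max_frames: usize) -> Vec<u64> {
    let mut ips = vec![ip];
    while ips.len() < max_frames {
        let (Some(&caller_fp), Some(&ret)) = (mem.get(&fp), mem.get(&fp.wrapping_add(8))) else {
            break; // unreadable memory: stop instead of guessing
        };
        if ret == 0 {
            break; // no return address: outermost frame reached
        }
        ips.push(ret);
        // The stack grows down, so a caller's frame pointer must be higher.
        if caller_fp <= fp {
            break;
        }
        fp = caller_fp;
    }
    ips
}

fn main() {
    let mut mem = Memory::new();
    // Innermost frame at 0x6000, called from code at 0x2000.
    mem.insert(0x6000, 0x6800);
    mem.insert(0x6008, 0x2000);
    // Its caller at 0x6800, called from code at 0x3000.
    mem.insert(0x6800, 0x7000);
    mem.insert(0x6808, 0x3000);
    // Outermost frame: no caller.
    mem.insert(0x7000, 0);
    mem.insert(0x7008, 0);

    for ip in walk_frame_pointers(&mem, 0x1000, 0x6000, 16) {
        println!("Frame: IP: {:#x}", ip);
    }
}
```

The walk needs no CFI, which is why it can serve as a fallback, but it only works when the code was compiled with frame pointers enabled.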

Does all of this sound reasonable?

r1viollet avatar Jan 29 '25 13:01 r1viollet

Using the context from the signal handler is indeed the correct solution. I get correct unwinding (obviously without the signal frames).

USE_CONTEXT=1 ./target-alpine-full/debug/unwind_c

I'll leave the example up in case this is of interest. I do not know if you would be interested in exposing backtrace capabilities from within signal handlers.

r1viollet avatar Jan 30 '25 09:01 r1viollet