Unable to record pipewire/ff interaction due to shared memory use

Open hlieberman opened this issue 1 year ago • 2 comments

Hello rr team!

I've written back and forth with a couple of people more experienced than I am about this issue (CC: @khuey), and they suggested that I open an issue here to see if one of you has a better idea of what steps we could take to debug it.

As part of trying to debug a crash in Firefox, I've been attempting to capture an rr trace. However, because of shared memory usage inside pipewire (or between pipewire and FF?), the resulting recordings are invalid and fail under rr replay -a.

I've tried running pipewire (and pipewire-pulse) in the same rr session as Firefox by having rr execute a shell script that forks them into the background before running FF[1], but that fails too.

The full log is attached, but the relevant snippet is:

[ERROR ./src/ReplaySession.cc:789:guard_overshoot()] Replay diverged; target registers mismatched:
[FATAL ./src/ReplaySession.cc:793:guard_overshoot()]
 (task 11716 (rec:6979) at time 104963)
 -> Assertion `false' failed to hold. overshot target ticks=43302536 by 309

[snip]

=== Start rr backtrace:
rr(_ZN2rr13dump_rr_stackEv+0x41)[0x5595af5b9311]
rr(_ZN2rr15notifying_abortEv+0xe)[0x5595af5b936e]
rr(_ZN2rr12FatalOstreamD1Ev+0x4f)[0x5595af49bdff]
rr(_ZN2rr21EmergencyDebugOstreamD2Ev+0x499)[0x5595af49c2d9]
rr(+0x17125d)[0x5595af53825d]
rr(_ZN2rr13ReplaySession20emulate_async_signalEPNS_10ReplayTaskERKNS0_15StepConstraintsElNS_15remote_code_ptrE+0x62d)[0x5595af53e04d]
rr(_ZN2rr13ReplaySession18try_one_trace_stepEPNS_10ReplayTaskERKNS0_15StepConstraintsE+0x1dc)[0x5595af53f68c]
rr(_ZN2rr13ReplaySession11replay_stepERKNS0_15StepConstraintsE+0x19e)[0x5595af53fa7e]
rr(+0x16b56e)[0x5595af53256e]
rr(_ZN2rr13ReplayCommand3runERSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EE+0x69d)[0x5595af533a7d]
rr(main+0x1af)[0x5595af42748f]
/lib/x86_64-linux-gnu/libc.so.6(+0x2920a)[0x7f32f4e2920a]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x7c)[0x7f32f4e292bc]
rr(_start+0x21)[0x5595af4275e1]
=== End rr backtrace

rr.log

Any suggestions for how we could capture this info?

[1]: This essentially boils down to pipewire & pipewire-pulse & firefox

hlieberman, Sep 19 '22 23:09

The previous issue was pipewire (which was outside the recording) sharing memory with Firefox (which was inside the recording).

When you say it fails, do you mean that you can successfully record what you care about but replay then doesn't work, or do you mean that recording disturbs the program's behavior and things fail in a different way?

khuey, Sep 19 '22 23:09

The capture succeeds, but cannot be replayed. The log above is from a run with: rr record --disable-cpuid-features-ext 0xdc230000,0x2c42,0xc -- ./foo.sh, where the script was:

#!/bin/bash

pipewire &
pipewire-pulse &
~/.firefox/firefox-bin --profile /tmp/ff_profile

Note that this happens even when all I do is start Firefox and close it as soon as it has finished opening.
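For anyone reproducing this, here is a minimal sketch of the same launcher written as a reusable function that also stops the background helpers once the main program exits. This is not the script that produced the log above, run_with_helpers is a hypothetical name, and nothing here is known to change the replay divergence; it only keeps leftover background processes from outliving the main program.

```shell
#!/bin/bash
# Sketch only: run_with_helpers is a hypothetical name, not part of rr or pipewire.
# Usage: run_with_helpers "helper1 helper2 ..." main-command [args...]
run_with_helpers() {
    local helpers=$1
    shift
    local pids=""
    local helper
    # Start each helper in the background and remember its PID.
    for helper in $helpers; do
        "$helper" &
        pids="$pids $!"
    done
    # Run the main program in the foreground and keep its exit status.
    local status=0
    "$@" || status=$?
    # Stop the helpers so no background process outlives the main program.
    if [ -n "$pids" ]; then
        # Word splitting of $pids is intended here.
        kill $pids 2>/dev/null || true
        wait $pids 2>/dev/null || true
    fi
    return "$status"
}
```

With the commands from this thread, the last line of foo.sh would then be: run_with_helpers "pipewire pipewire-pulse" ~/.firefox/firefox-bin --profile /tmp/ff_profile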

hlieberman, Sep 19 '22 23:09

If you run rr replay -a without writing the results to a log file, you'll get instructions for starting the emergency debugger. Try following those instructions and then running "where" in the resulting gdb session.

rocallahan, Sep 23 '22 22:09