dd-trace-py
ci: display backtrace from core dumps
This change makes it easier to debug segmentation faults and other crashes in CI that generate a core dump. It relies on gdb to display the complete backtrace from every core dump generated during test runs in Circle CI.
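As a rough illustration of the idea (not the code in this PR), the sketch below runs gdb in batch mode over any core files it finds and prints the backtraces of all threads. The `core.*` glob pattern, the helper names, and the use of `sys.executable` as the crashed binary are assumptions made for the example.

```python
import glob
import shutil
import subprocess
import sys


def gdb_backtrace_command(core_path, executable=sys.executable):
    """Build a non-interactive gdb invocation for one core file."""
    return [
        "gdb",
        "-q",                           # suppress the startup banner
        "-batch",                       # run the -ex commands, then exit
        "-ex", "thread apply all bt",   # backtrace of every thread
        executable,                     # binary that produced the core (assumed)
        core_path,
    ]


def print_backtraces(pattern="core.*"):
    """Run gdb on every file matching the (assumed) core file pattern.

    Returns the number of core files that were processed.
    """
    if shutil.which("gdb") is None:
        print("gdb not found; skipping core dump analysis", file=sys.stderr)
        return 0
    cores = sorted(glob.glob(pattern))
    for core in cores:
        print(f"=== backtrace for {core} ===", flush=True)
        # check=False: a gdb failure should not hide the original test failure
        subprocess.run(gdb_backtrace_command(core), check=False)
    return len(cores)
```

Calling `print_backtraces()` in a step that runs after the tests would print one backtrace section per core file, and nothing when no core file was produced.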
Checklist
- [ ] Change(s) are motivated and described in the PR description
- [ ] Testing strategy is described if automated tests are not included in the PR
- [ ] Risks are described (performance impact, potential for breakage, maintainability)
- [ ] Change is maintainable (easy to change, telemetry, documentation)
- [ ] Library release note guidelines are followed or label changelog/no-changelog is set
- [ ] Documentation is included (in-code, generated user docs, public corp docs)
- [ ] Backport labels are set (if applicable)
- [ ] If this PR changes the public interface, I've notified @DataDog/apm-tees.
- [ ] If change touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
Reviewer Checklist
- [ ] Title is accurate
- [ ] All changes are related to the pull request's stated goal
- [ ] Description motivates each change
- [ ] Avoids breaking API changes
- [ ] Testing strategy adequately addresses listed risks
- [ ] Change is maintainable (easy to change, telemetry, documentation)
- [ ] Release note makes sense to a user of the library
- [ ] Author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
- [ ] Backport labels are set in a manner that is consistent with the release branch maintenance policy
Quick question: does ulimit -c work across our CI runners? It can be tricky to guarantee the generation of corefiles (indeed, that they even land in a consistent place). If you feel pretty good about this, please ignore the next part--I'm trying to be helpful, but I am ignorant.
I have an alternative workflow I've been using lately with some success. It has several disadvantages compared to corefile analysis--it doesn't use a debugger, so you can't interactively inspect anything. I also think its backtrace capabilities are a little less powerful than gdb's. It does have a few nice features though:
- Prints the backtrace, mappings, and registers to stderr
- Can be instrumented to trigger not just on SIGSEGV, but also SIGABRT (nice for catching those nasty libc errors like double-free)
- Is straightforward to translate to any glibc-based system (for customer incidents)
- Prints immediately, without having to wait for other forks or children to terminate (usually not a problem in CI)
I describe this workflow in a recent issue. It's annoying because the glibc maintainers have stopped providing this utility, but I have a suggestion for how to get it from old .deb archives. No pressure to use it or anything--I love GDB and I think the proposal will work great barring anything unlucky with how the CI runners work. Just throwing out an idea for consideration.
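For comparison, Python's standard library has a loosely similar facility, `faulthandler`. It is a different technique from the glibc utility described above: it prints only Python-level tracebacks (no native frames, mappings, or registers), but it also writes to stderr immediately on SIGSEGV and SIGABRT. A minimal sketch, where the helper name `enable_crash_tracebacks` is hypothetical:

```python
import faulthandler
import sys


def enable_crash_tracebacks(stream=None):
    """Dump Python tracebacks of all threads when a fatal signal arrives.

    faulthandler installs handlers for SIGSEGV, SIGFPE, SIGABRT, SIGBUS
    and SIGILL. The stream must be a real file (it needs a file descriptor).
    """
    if stream is None:
        stream = sys.stderr
    faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()
```

The same behaviour is available without code changes by running the interpreter with `python -X faulthandler` or by setting the `PYTHONFAULTHANDLER` environment variable.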
> Quick question: does ulimit -c work across our CI runners? It can be tricky to guarantee the generation of corefiles (indeed, that they even land in a consistent place). If you feel pretty good about this, please ignore the next part--I'm trying to be helpful, but I am ignorant.
Yep, I have already tested this in #7659 (in fact, this is where the code comes from). The Circle CI docs have more on this.
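For anyone who wants to check a given runner themselves, the two things raised in the question (whether the core size limit allows dumps, and where they land) can be inspected from Python. This is only a sketch under the assumption of a Linux runner; the function names are made up for the example.

```python
import resource


def core_dumps_enabled():
    """Return True if the soft RLIMIT_CORE allows core files to be written.

    This is the limit that `ulimit -c` reports and changes in a shell.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    # A soft limit of 0 disables core files; RLIM_INFINITY means no size cap.
    return soft == resource.RLIM_INFINITY or soft > 0


def core_pattern(path="/proc/sys/kernel/core_pattern"):
    """Read the Linux core_pattern, which controls where core files land.

    Returns None when the file is not readable (for example on non-Linux hosts).
    """
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None
```

A pattern that starts with `|` means the kernel pipes the dump to a helper program rather than writing a file to disk, which is one way core files end up somewhere unexpected.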