MDEV-32363 Shut down Galera networking and logging on fatal signal
- [x] The Jira issue number for this PR is: MDEV-32363
Description
When handling a fatal signal, shut down Galera networking before printing the stack trace and writing the core file. This achieves fail-silent semantics on crashes, where the process may otherwise keep running for a long time without fully responding, e.g. due to core dumping or symbol resolution.
Also suppress all Galera/wsrep logging, so that log output from background threads does not garble the crash information printed by the signal handler.
Note that a fully fail-silent crash requires Galera 26.4.19.
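The ordering described above can be illustrated with a minimal, standalone sketch. It is not the code from this PR: the namespace, function names, and the single tracked socket are all hypothetical, and the real server and Galera provider use their own interfaces. It only shows the idea of silencing background logging and closing the replication socket first, using async-signal-safe calls, and only then proceeding to the slow crash reporting.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>
#include <cstddef>
#include <cstring>
#include <sys/socket.h>
#include <unistd.h>

// All names below are hypothetical; they are not MariaDB or Galera symbols.
namespace sketch {

// Set once a fatal signal is being handled; checked by the logging path.
std::atomic<bool> logging_suppressed{false};

// Descriptor of the (single, hypothetical) replication socket, -1 if none.
// Assumes std::atomic<int> is lock-free, so it is usable in a signal handler.
std::atomic<int> replication_fd{-1};

// Register the socket during normal operation (not from a signal handler).
void set_replication_fd(int fd) {
  assert(fd >= 0);
  replication_fd.store(fd);
}

// Write a string literal using only write(2), which is async-signal-safe.
template <std::size_t N>
void safe_write(const char (&msg)[N]) {
  ssize_t ignored = write(STDERR_FILENO, msg, N - 1);
  (void) ignored;
}

// Logging entry point for background threads.
// Returns false when the message was dropped because logging is suppressed.
bool log_message(const char* msg) {
  if (logging_suppressed.load(std::memory_order_acquire)) return false;
  ssize_t ignored = write(STDERR_FILENO, msg, std::strlen(msg));
  (void) ignored;
  return true;
}

// First step of fatal signal handling: go silent towards log and network.
void fail_silent_prologue() {
  safe_write("WSREP: Suppressing further logging\n");
  logging_suppressed.store(true, std::memory_order_release);

  safe_write("WSREP: Shutting down network communications\n");
  int fd = replication_fd.exchange(-1);
  if (fd >= 0) {
    shutdown(fd, SHUT_RDWR);  // peers see the connection drop right away
    close(fd);
  }
}

// Handler sketch: prologue first, slow crash reporting afterwards.
extern "C" void fatal_signal_handler(int sig) {
  fail_silent_prologue();
  // A real server would print the stack trace and write the core file here.
  signal(sig, SIG_DFL);  // restore the default action and re-raise
  raise(sig);
}

}  // namespace sketch
```

In this sketch the handler would be installed with something like `signal(SIGABRT, sketch::fatal_signal_handler)`. The point of the design is that peers observe the closed connection before the potentially long stack trace and core dump phase begins, rather than waiting for a timeout on a process that is still alive but unresponsive.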
How can this PR be tested?
A deterministic test case does not exist. To test, start a 3-node cluster and kill one of the nodes with kill -SIGABRT. Observe that the killed node logs
WSREP: Suppressing further logging
WSREP: Shutting down network communications
before printing the stack trace. The other nodes should report the lost connection almost immediately.
Basing the PR against the correct MariaDB version
- [ ] This is a new feature or a refactoring, and the PR is based against the latest MariaDB development branch.
- [x] This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.
PR quality check
- [x] I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
- [x] For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.
Thanks, the fix has been merged with the head revision: https://github.com/MariaDB/server/commit/54a10a429334a9579558a5d284c510d6f8b5bc97