MDEV-32363 Shut down Galera networking and logging on fatal signal

Open temeo opened this issue 1 year ago • 1 comments
  • [x] The Jira issue number for this PR is: MDEV-32363

Description

When handling a fatal signal, shut down Galera networking before printing the stack trace and writing the core file. This achieves fail-silent semantics for crashes in which the process may keep running for a long time without fully responding, e.g. because of core dumping or symbol resolution.

Also suppress all Galera/wsrep logging, so that log output from background threads does not garble the crash information written by the signal handler.

Note that a fully fail-silent crash requires Galera 26.4.19.
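The ordering described above can be illustrated with a small Python model. This is only a sketch of the sequence (suppress logging, close the network, then print the stack trace), not the server's actual C++ code. The names `log_suppressed`, `cluster_sockets`, `wsrep_log`, `on_fatal_signal` and `install_handler` are hypothetical and do not come from the PR; only the two `WSREP:` messages are taken from it.

```python
import signal
import socket
import sys
import threading
import traceback

# Hypothetical stand-ins for Galera state; none of these names come from the PR.
log_suppressed = threading.Event()
cluster_sockets = []


def wsrep_log(message):
    """Logging used by background threads; goes quiet once suppression is set."""
    if not log_suppressed.is_set():
        print("WSREP: " + message, file=sys.stderr)


def on_fatal_signal(signum, frame):
    # 1. Announce, then suppress further logging so that other threads
    #    cannot interleave output with the crash report.
    wsrep_log("Suppressing further logging")
    log_suppressed.set()
    # 2. Close cluster sockets so that peers see the connection drop at once.
    print("WSREP: Shutting down network communications", file=sys.stderr)
    for sock in cluster_sockets:
        try:
            sock.shutdown(socket.SHUT_RDWR)
        except OSError:
            pass  # socket was already disconnected
        sock.close()
    # 3. Only now do the potentially slow part: print the stack trace.
    traceback.print_stack(frame, file=sys.stderr)


def install_handler():
    """Register the handler for SIGABRT (assumed here as the fatal signal)."""
    signal.signal(signal.SIGABRT, on_fatal_signal)
```

The point of the ordering is that a peer blocked on a read sees end-of-file as soon as step 2 completes, regardless of how long step 3 takes.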

How can this PR be tested?

No deterministic test case exists. To test manually, start a 3-node cluster and kill one of the nodes with kill -SIGABRT. Observe that the killed node logs

WSREP: Suppressing further logging
WSREP: Shutting down network communications

before printing the stack trace. The other nodes should report a lost connection almost immediately.
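The manual check of the log order could be scripted roughly as follows. This is a sketch, assuming the error log of the killed node has been read into a string. The function name `crash_log_is_ordered` is hypothetical, and `trace_marker` is a caller-supplied string (for example, the first line of the stack trace as printed by your build); the PR does not specify it.

```python
SUPPRESS_LINE = "WSREP: Suppressing further logging"
SHUTDOWN_LINE = "WSREP: Shutting down network communications"


def crash_log_is_ordered(log_text, trace_marker):
    """Return True if both WSREP lines appear, in order, before trace_marker."""
    suppress_at = log_text.find(SUPPRESS_LINE)
    shutdown_at = log_text.find(SHUTDOWN_LINE)
    trace_at = log_text.find(trace_marker)
    # str.find returns -1 when the text is missing; treat that as a failure.
    if min(suppress_at, shutdown_at, trace_at) < 0:
        return False
    return suppress_at < shutdown_at < trace_at
```

The check only looks at the first occurrence of each string, so it is best run on the portion of the log written after the node was killed.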

Basing the PR against the correct MariaDB version

  • [ ] This is a new feature or a refactoring, and the PR is based against the latest MariaDB development branch.
  • [x] This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.

PR quality check

  • [x] I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • [x] For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

temeo avatar Jul 30 '24 14:07 temeo

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

CLAassistant avatar Jul 30 '24 14:07 CLAassistant

Thanks, the fix has been merged with the head revision: https://github.com/MariaDB/server/commit/54a10a429334a9579558a5d284c510d6f8b5bc97

sysprg avatar Sep 01 '24 13:09 sysprg