scylla-machine-image
Delay coredump.service shutdown after scylla.service shutdown
Issue description
- [ ] This issue is a regression.
- [ ] It is unknown if this issue is a regression.
Recently, in a test that performs a soft reboot of a node, scylla hit an 'aborting on shard' error. It didn't create a coredump due to:
!ERR | systemd-coredump[8354]: Failed to connect to coredump service: Connection refused
It looks like the reboot triggers shutdown of the coredump service without waiting for the scylla service to stop. Can we make it stop only after scylla is down?
Impact
No coredump, which makes issues harder to investigate.
Installation details
Kernel Version: 5.15.0-1028-aws
Scylla version (or git commit hash): 5.2.0~rc1-20230207.8ff4717fd010 with build-id 78fbb2c25e9244a62f57988313388a0260084528
Cluster size: 3 nodes (i4i.large)
Scylla Nodes used in this run:
- longevity-5gb-1h-SoftRebootNodeMonk-db-node-249f30ed-3 (3.252.166.103 | 10.4.3.176) (shards: 2)
- longevity-5gb-1h-SoftRebootNodeMonk-db-node-249f30ed-2 (34.245.61.44 | 10.4.1.56) (shards: 2)
- longevity-5gb-1h-SoftRebootNodeMonk-db-node-249f30ed-1 (34.240.130.155 | 10.4.1.211) (shards: 2)
OS / Image: ami-05e1d6aa4f71f3f25 (aws: eu-west-1)
Test: longevity-5gb-1h-SoftRebootNodeMonkey-aws-test
Test id: 249f30ed-7007-4b8e-a320-1207ebca5e5d
Test name: scylla-5.2/nemesis/longevity-5gb-1h-SoftRebootNodeMonkey-aws-test
Test config file(s):
Logs and commands
- Restore Monitor Stack command:
$ hydra investigate show-monitor 249f30ed-7007-4b8e-a320-1207ebca5e5d
- Restore monitor on AWS instance using Jenkins job
- Show all stored logs command:
$ hydra investigate show-logs 249f30ed-7007-4b8e-a320-1207ebca5e5d
Logs:
- db-cluster-249f30ed.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/249f30ed-7007-4b8e-a320-1207ebca5e5d/20230214_215652/db-cluster-249f30ed.tar.gz
- sct-runner-249f30ed.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/249f30ed-7007-4b8e-a320-1207ebca5e5d/20230214_215652/sct-runner-249f30ed.tar.gz
- monitor-set-249f30ed.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/249f30ed-7007-4b8e-a320-1207ebca5e5d/20230214_215652/monitor-set-249f30ed.tar.gz
- loader-set-249f30ed.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/249f30ed-7007-4b8e-a320-1207ebca5e5d/20230214_215652/loader-set-249f30ed.tar.gz
- parallel-timelines-report-249f30ed.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/249f30ed-7007-4b8e-a320-1207ebca5e5d/20230214_215652/parallel-timelines-report-249f30ed.tar.gz
Is this different than https://github.com/scylladb/scylla-enterprise/issues/2648 which is solved via https://github.com/scylladb/scylladb/pull/12757 ?
No, this case is a coredump happening during shutdown of a node, after systemd-coredump.socket is closed.
Luckily for us this coredump happened in more cases.
I was finally able to reproduce this after patching scylla to delay shutdown and raise SIGSEGV:
diff --git a/main.cc b/main.cc
index 35c4c25caa..1d167e3152 100644
--- a/main.cc
+++ b/main.cc
@@ -103,6 +103,8 @@
#include <boost/algorithm/string/join.hpp>
+#include <signal.h>
+
namespace fs = std::filesystem;
seastar::metrics::metric_groups app_metrics;
@@ -471,6 +473,8 @@ static auto defer_verbose_shutdown(const char* what, Func&& func) {
startlog.info("Shutting down {}", what);
try {
func();
+ seastar::sleep(std::chrono::minutes(1)).get();
+ raise(SIGSEGV);
startlog.info("Shutting down {} was successful", what);
} catch (...) {
auto ex = std::current_exception();
So I tried to delay the coredump.service shutdown until after scylla-server.service, using the following drop-in conf:
$ cat /etc/systemd/system/scylla-server.service.d/dependencies.conf
[Unit]
After=local-fs.target network-online.target systemd-coredump.socket var-lib-systemd-coredump.mount
Requires=local-fs.target network-online.target systemd-coredump.socket var-lib-systemd-coredump.mount
Also, I found a GH issue which says systemd-coredump@.service may get terminated during shutdown (https://github.com/systemd/systemd/issues/7176), so I also added a workaround for this:
$ cat /etc/systemd/system/systemd-coredump@.service.d/killsignal.conf
[Service]
KillSignal=SIGCONT
I thought we could now capture the coredump correctly with systemd-coredump, but we can't. systemd-coredump still fails, even though systemd-coredump.socket is shut down after scylla-server.service:
Jul 01 00:27:41 ubuntu-jammy scylla[1054]: Segmentation fault on shard 0.
Jul 01 00:27:41 ubuntu-jammy scylla[1054]: Backtrace:
Jul 01 00:27:41 ubuntu-jammy scylla[1054]: 0x56b8e88
Jul 01 00:27:41 ubuntu-jammy scylla[1054]: 0x56ed416
Jul 01 00:27:41 ubuntu-jammy scylla[1054]: /opt/scylladb/libreloc/libc.so.6+0x3cb1f
Jul 01 00:27:41 ubuntu-jammy scylla[1054]: /opt/scylladb/libreloc/libc.so.6+0x8ce5b
Jul 01 00:27:41 ubuntu-jammy scylla[1054]: /opt/scylladb/libreloc/libc.so.6+0x3ca75
Jul 01 00:27:41 ubuntu-jammy scylla[1054]: 0x13c1bb0
Jul 01 00:27:41 ubuntu-jammy scylla[1054]: 0x13c21b3
Jul 01 00:27:41 ubuntu-jammy scylla[1054]: 0x127f256
Jul 01 00:27:41 ubuntu-jammy scylla[1054]: 0x1272375
Jul 01 00:27:41 ubuntu-jammy scylla[1054]: 0x59440f6
Jul 01 00:27:41 ubuntu-jammy systemd-coredump[2271]: Failed to send coredump fd: Broken pipe
Jul 01 00:28:07 ubuntu-jammy systemd[1]: scylla-server.service: Main process exited, code=dumped, status=11/SEGV
Jul 01 00:28:07 ubuntu-jammy systemd[1]: scylla-server.service: Failed with result 'core-dump'.
Jul 01 00:28:07 ubuntu-jammy systemd[1]: Stopped Scylla Server.
Jul 01 00:28:07 ubuntu-jammy systemd[1]: scylla-server.service: Consumed 14min 45.846s CPU time.
...
Jul 01 00:28:07 ubuntu-jammy systemd[1]: systemd-coredump.socket: Deactivated successfully.
Jul 01 00:28:07 ubuntu-jammy systemd[1]: Closed Process Core Dump Socket.
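For anyone checking their own nodes, here is a minimal sketch for spotting this failure in a saved journal excerpt. It only looks for the two systemd-coredump error messages quoted in this issue ("Failed to connect to coredump service" and "Failed to send coredump fd"); the function name `count_coredump_failures` is made up for illustration, and other failure messages would not be matched.

```shell
# Hypothetical helper: count systemd-coredump failure lines in a saved log file.
# $1 = path to a text file holding journal output.
count_coredump_failures() {
    # Match only the two error messages seen in this issue.
    grep -cE 'systemd-coredump\[[0-9]+\]: Failed to (connect to coredump service|send coredump fd)' "$1"
}
```

Assuming the journal is persistent, something like `journalctl -b -1 > boot.log` followed by `count_coredump_failures boot.log` would check the boot that ended with the reboot.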
I tried again and again with slightly different configurations, but systemd-coredump never worked during shutdown. So I decided the only possible workaround is to stop using the systemd-coredump handler in kernel.core_pattern and set a file path in kernel.core_pattern directly. I will send a workaround for that.
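To illustrate the idea (this is only a sketch, not the actual workaround patch): a sysctl drop-in can point kernel.core_pattern at a plain file path, so the kernel writes the core itself instead of piping it to systemd-coredump. The helper name, the drop-in file name `99-scylla-coredump.conf`, and the dump directory used in the example are all assumptions made for illustration.

```shell
# Hypothetical helper: write a sysctl drop-in that sets kernel.core_pattern to a file path.
# $1 = sysctl.d directory, $2 = directory that should receive core files (both assumed).
write_core_pattern_conf() {
    conf_dir=$1
    dump_dir=$2
    # %e = executable name, %p = PID, %t = time of dump (see core(5)).
    printf 'kernel.core_pattern=%s/core.%%e.%%p.%%t\n' "$dump_dir" \
        > "$conf_dir/99-scylla-coredump.conf"
}

# Example (needs root, paths are assumptions):
#   write_core_pattern_conf /etc/sysctl.d /var/lib/scylla/coredump && sysctl --system
```

Note that with a file path in core_pattern the dump directory must already exist, and the kernel only writes a core file if the process's RLIMIT_CORE allows it.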
@avikivity Do you have any ideas about this issue?
Opened an issue on the systemd GH: https://github.com/systemd/systemd/issues/28338
@avikivity ping, do you have any idea?
Sorry for missing the issue. I often skip over scylla-machine-image because I don't maintain it. I'll look over it now.
I guess the problem is that, even with the dependency, systemd thinks the process is done (not sure why, since the PID still exists while dumping core), so it stops systemd-coredumpd while the core dump is in progress. Very funky.
@avikivity since we decided not to merge the workaround, what else can we do about this? Maybe we should document it?