scylla-cluster-tests
dockerd crashed on loader during gemini run
Issue description
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime: gp: gp=0xc0006f4300, goid=0, gp->atomicstatus=0
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime: g: g=0xc000001380, goid=0, g->atomicstatus=0
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: fatal error: bad g->status in ready
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime stack:
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.throw(0x55cf7aa9c446, 0x16)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/panic.go:774 +0x74
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.ready(0xc0006f4300, 0x4, 0xc000015e01)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/proc.go:659 +0x2bd
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.goready.func1()
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/proc.go:315 +0x3a
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.systemstack(0x0)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/asm_amd64.s:370 +0x63
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.mstart()
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/proc.go:1146
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: goroutine 36 [running]:
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.systemstack_switch()
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/asm_amd64.s:330 fp=0xc000015e08 sp=0xc000015e00 pc=0x55cf79201140
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.goready(0xc0006f4300, 0x4)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/proc.go:314 +0x5e fp=0xc000015e38 sp=0xc000015e08 pc=0x55cf791d461e
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.send(0xc000094720, 0xc000094000, 0xc000015f38, 0xc000015ec8, 0x3)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/chan.go:299 +0x7e fp=0xc000015e68 sp=0xc000015e38 pc=0x55cf791a7fce
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.chansend(0xc000094720, 0xc000015f38, 0x63885c00, 0x55cf79269b0e, 0x6a76552ec12)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/chan.go:193 +0x51e fp=0xc000015ee8 sp=0xc000015e68 pc=0x55cf791a7e2e
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.selectnbsend(0xc000094720, 0xc000015f38, 0x55cf7cceb740)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/chan.go:615 +0x46 fp=0xc000015f20 sp=0xc000015ee8 pc=0x55cf791a8d56
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: time.sendTime(0x55cf7b622960, 0xc000094720, 0x0)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/time/sleep.go:137 +0x6e fp=0xc000015f60 sp=0xc000015f20 pc=0x55cf79269b0e
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.timerproc(0x55cf7cceffe0)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/time.go:297 +0x72 fp=0xc000015fd8 sp=0xc000015f60 pc=0x55cf791f1af2
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.goexit()
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/asm_amd64.s:1357 +0x1 fp=0xc000015fe0 sp=0xc000015fd8 pc=0x55cf79203241
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: created by runtime.(*timersBucket).addtimerLocked
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/time.go:169 +0x110
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: goroutine 1 [chan receive, 120 minutes]:
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: main.(*DaemonCli).start(0xc0007b4240, 0xc0000958c0, 0x0, 0x0)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/cmd/dockerd/daemon.go:253 +0xc03
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: main.runDaemon(...)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/cmd/dockerd/docker_unix.go:13
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 systemd[1]: docker.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: main.newDaemonCommand.func1(0xc000730f00, 0xc0007b41e0, 0x0, 0x3, 0x0, 0x0)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/cmd/dockerd/docker.go:34 +0x7c
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: github.com/docker/docker/vendor/github.com/spf13/cobra.(*Command).execute(0xc000730f00, 0xc0000d4010, 0x3, 0x3, 0xc000730f00, 0xc0000d4010)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/github.com/spf13/cobra/command.go:762 +0x462
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: github.com/docker/docker/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xc000730f00, 0x0, 0x0, 0x10)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/github.com/spf13/cobra/command.go:852 +0x2ec
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: github.com/docker/docker/vendor/github.com/spf13/cobra.(*Command).Execute(...)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/vendor/github.com/spf13/cobra/command.go:800
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: main.main()
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/cmd/dockerd/docker.go:97 +0x191
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: goroutine 19 [syscall, 91 minutes]:
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: os/signal.signal_recv(0x55cf7bb413a0)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/sigqueue.go:147 +0x9e
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: os/signal.loop()
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/os/signal/signal_unix.go:23 +0x24
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: created by os/signal.init.0
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/os/signal/signal_unix.go:29 +0x43
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: goroutine 0 [idle]:
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: fatal error: unexpected signal during runtime execution
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: panic during panic
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x55cf791f5e7a]
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime stack:
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.throw(0x55cf7aad6c78, 0x2a)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/panic.go:774 +0x74
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.sigpanic()
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/signal_unix.go:378 +0x480
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.gentraceback(0xffffffffffffffff, 0xffffffffffffffff, 0x0, 0xc0006f4000, 0x0, 0x0, 0x64, 0x0, 0x0, 0x0, ...)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/traceback.go:159 +0x15a
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.traceback1(0xffffffffffffffff, 0xffffffffffffffff, 0x0, 0xc0006f4000, 0x0)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/traceback.go:722 +0xf2
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.traceback(0xffffffffffffffff, 0xffffffffffffffff, 0x0, 0xc0006f4000)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/traceback.go:676 +0x54
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 systemd[1]: Unit docker.service entered failed state.
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.tracebackothers(0xc000001380)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/traceback.go:929 +0x1ac
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.dopanic_m(0xc000001380, 0x55cf791d27f4, 0x7f10903ebcb0, 0x1)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/panic.go:974 +0x2a4
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.fatalthrow.func1()
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/panic.go:829 +0x61
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.fatalthrow()
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/panic.go:826 +0x59
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.throw(0x55cf7aa9c446, 0x16)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/panic.go:774 +0x74
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.ready(0xc0006f4300, 0x4, 0xc000015e01)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/proc.go:659 +0x2bd
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.goready.func1()
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/proc.go:315 +0x3a
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.systemstack(0x0)
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/asm_amd64.s:370 +0x63
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: runtime.mstart()
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 dockerd[1366]: /usr/local/go/src/runtime/proc.go:1146
Dec 01 07:50:26 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 systemd[1]: docker.service failed.
Dec 01 07:50:28 gemini-with-nemesis-3h-normal-5-1-loader-node-9046ca31-1 systemd[1]: docker.service holdoff time over, scheduling restart.
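For triaging further reproductions, a small script along these lines could pull the relevant facts out of a saved loader journal. This is only a sketch: the function name `summarize_dockerd_crash` and both regular expressions are hypothetical, not part of SCT, and they assume journal lines shaped like the ones pasted above.

```python
import re

# Assumed line shape, taken from the journal excerpt above:
# "<Mon DD HH:MM:SS> <host> <unit>[<pid>]: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<unit>[\w.-]+)\[(?P<pid>\d+)\]: (?P<msg>.*)$"
)
# systemd's report of how the docker.service main process ended.
EXIT_RE = re.compile(
    r"docker\.service: main process exited, code=(\w+), status=(\S+)"
)
FATAL_PREFIX = "fatal error: "


def summarize_dockerd_crash(lines):
    """Collect Go runtime fatal errors from dockerd and systemd exit reports."""
    fatal_errors = []
    exits = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # not a journal line in the assumed shape
        unit, msg = match.group("unit"), match.group("msg")
        if unit == "dockerd" and msg.startswith(FATAL_PREFIX):
            fatal_errors.append(msg[len(FATAL_PREFIX):])
        elif unit == "systemd":
            exit_match = EXIT_RE.search(msg)
            if exit_match:
                exits.append((exit_match.group(1), exit_match.group(2)))
    return {"fatal_errors": fatal_errors, "exits": exits}
```

Feeding it the lines of the loader journal (for example, read from a file extracted from the loader-set archive) would give the list of `fatal error:` messages and the `code=`/`status=` pairs, which makes it easier to compare one run against another.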
Installation details
Kernel Version: 5.15.0-1026-aws
Scylla version (or git commit hash): 5.1.0-20221201.fde4a6e92d83 with build-id 3f51aa5a5121e5f42755a4cc669ae3dfc2e3b2dd
Relocatable Package: http://downloads.scylladb.com/unstable/scylla/branch-5.1/relocatable/2022-12-01T02:29:23Z/scylla-x86_64-package.tar.gz
Cluster size: 3 nodes (i3.large)
Scylla Nodes used in this run:
- gemini-with-nemesis-3h-normal-5-1-oracle-db-node-9046ca31-1 (35.171.133.90 | 10.12.3.208) (shards: 2)
- gemini-with-nemesis-3h-normal-5-1-db-node-9046ca31-3 (3.234.250.228 | 10.12.3.218) (shards: 2)
- gemini-with-nemesis-3h-normal-5-1-db-node-9046ca31-2 (44.195.37.130 | 10.12.2.75) (shards: 2)
- gemini-with-nemesis-3h-normal-5-1-db-node-9046ca31-1 (3.222.185.166 | 10.12.0.78) (shards: 2)
OS / Image: ami-00c21f4517ae026a4 (aws: us-east-1)
Test: gemini-3h-with-nemesis-test
Test id: 9046ca31-2d20-4cdc-9c16-993a8561b282
Test name: scylla-5.1/gemini-/gemini-3h-with-nemesis-test
Test config file(s):
- Restore Monitor Stack command:
$ hydra investigate show-monitor 9046ca31-2d20-4cdc-9c16-993a8561b282
- Restore monitor on AWS instance using Jenkins job
- Show all stored logs command:
$ hydra investigate show-logs 9046ca31-2d20-4cdc-9c16-993a8561b282
Logs:
- db-cluster-9046ca31.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/9046ca31-2d20-4cdc-9c16-993a8561b282/20221201_081015/db-cluster-9046ca31.tar.gz
- monitor-set-9046ca31.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/9046ca31-2d20-4cdc-9c16-993a8561b282/20221201_081015/monitor-set-9046ca31.tar.gz
- loader-set-9046ca31.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/9046ca31-2d20-4cdc-9c16-993a8561b282/20221201_081015/loader-set-9046ca31.tar.gz
- sct-runner-9046ca31.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/9046ca31-2d20-4cdc-9c16-993a8561b282/20221201_081015/sct-runner-9046ca31.tar.gz
- parallel-timelines-report-9046ca31.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/9046ca31-2d20-4cdc-9c16-993a8561b282/20221201_081015/parallel-timelines-report-9046ca31.tar.gz
- We probably need to backport the new versions of the loader AMIs, which have a specific docker version in them.
- It might also be advisable to move to Ubuntu-based images (instead of the old CentOS 7 ones) across the board, to get a somewhat more stable distro.
We will need to adjust all places we install things on the loader. We can maybe progress to CentOS8 more easily if you think it's relevant to this issue.
Why not move to a more stable distro, for the same reason we did for the Scylla images? Also, for a docker-based loader we only care that docker is installed, nothing else.
Just to avoid having to catch all the places where we assumed it's CentOS. But with the docker backend it should be easier. The only thing left is what to do about future performance branches.
I don't think we have any such assumption for the loader, and it would be easy to flush out.
The perf branches can be "calibrated" with Ubuntu+docker, or keep the old CentOS 7 images.
Anyhow, let's wait for more incidents of this before deciding whether we should move in either direction.
The only downside of waiting is that it might affect the next release (5.2). So far we have had it in for two weeks, and we have only seen this once.
Another reproduction in run:
Installation details
Kernel Version: 5.15.0-1028-aws
Scylla version (or git commit hash): 5.3.0~dev-20230131.5d914adcef1f with build-id c0fd94703025292798832fd91f1b88ffe64025d7
Cluster size: 3 nodes (i3.large)
Scylla Nodes used in this run:
- gemini-with-nemesis-3h-normal-updat-oracle-db-node-08d92363-1 (3.91.2.110 | 10.12.0.253) (shards: 2)
- gemini-with-nemesis-3h-normal-updat-db-node-08d92363-3 (44.200.62.116 | 10.12.2.214) (shards: 2)
- gemini-with-nemesis-3h-normal-updat-db-node-08d92363-2 (3.215.134.17 | 10.12.2.161) (shards: 2)
- gemini-with-nemesis-3h-normal-updat-db-node-08d92363-1 (44.192.62.79 | 10.12.3.219) (shards: 2)
OS / Image: ami-07a90f071421efaed (aws: us-east-1)
Test: gemini-3h-with-nemesis-test
Test id: 08d92363-4120-4119-9410-ecfcc25d4739
Test name: scylla-staging/lukasz/gemini-3h-with-nemesis-test
Test config file(s):
Logs and commands
- Restore Monitor Stack command:
$ hydra investigate show-monitor 08d92363-4120-4119-9410-ecfcc25d4739
- Restore monitor on AWS instance using Jenkins job
- Show all stored logs command:
$ hydra investigate show-logs 08d92363-4120-4119-9410-ecfcc25d4739
Logs:
- db-cluster-08d92363.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/08d92363-4120-4119-9410-ecfcc25d4739/20230202_123655/db-cluster-08d92363.tar.gz
- sct-runner-08d92363.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/08d92363-4120-4119-9410-ecfcc25d4739/20230202_123655/sct-runner-08d92363.tar.gz
- monitor-set-08d92363.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/08d92363-4120-4119-9410-ecfcc25d4739/20230202_123655/monitor-set-08d92363.tar.gz
- loader-set-08d92363.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/08d92363-4120-4119-9410-ecfcc25d4739/20230202_123655/loader-set-08d92363.tar.gz
- parallel-timelines-report-08d92363.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/08d92363-4120-4119-9410-ecfcc25d4739/20230202_123655/parallel-timelines-report-08d92363.tar.gz
I'm guessing kind of related:
docker service, coredumped:
Mar 27 22:50:25 gemini-with-nemesis-3h-normal-maste-loader-node-59828a1f-1 systemd-coredump[24575]: Failed to create coredump file /var/lib/systemd/coredump/.#core.dockerd.0.f407a0f9fb9244698afc85d58475c463.1364.167995742400000010c8b3daaeed3160: No such file or directory
Mar 27 22:50:25 gemini-with-nemesis-3h-normal-maste-loader-node-59828a1f-1 systemd-coredump[24575]: Process 1364 (dockerd) of user 0 dumped core.
Mar 27 22:50:25 gemini-with-nemesis-3h-normal-maste-loader-node-59828a1f-1 systemd[1]: docker.service: main process exited, code=killed, status=11/SEGV
We should change the loader AMIs to a newer distro and a newer docker version.
Installation details
Kernel Version: 5.15.0-1031-aws
Scylla version (or git commit hash): 5.3.0~dev-20230325.e8fb718e4ad4 with build-id 6eed28a1ac2addc02aceea60af4d6ee4acd56955
Cluster size: 3 nodes (i3.large)
Scylla Nodes used in this run:
- gemini-with-nemesis-3h-normal-maste-oracle-db-node-59828a1f-1 (3.252.123.174 | 10.4.1.102) (shards: 2)
- gemini-with-nemesis-3h-normal-maste-db-node-59828a1f-3 (18.203.237.229 | 10.4.3.144) (shards: 2)
- gemini-with-nemesis-3h-normal-maste-db-node-59828a1f-2 (176.34.80.33 | 10.4.0.46) (shards: 2)
- gemini-with-nemesis-3h-normal-maste-db-node-59828a1f-1 (3.250.31.15 | 10.4.3.27) (shards: 2)
OS / Image: ami-04226bde2b30a3d2d (aws: eu-west-1)
Test: gemini-3h-with-nemesis-test
Test id: 59828a1f-6c16-47cd-a63b-9ceb43a0c844
Test name: scylla-master/gemini-/gemini-3h-with-nemesis-test
Test config file(s):
Logs and commands
- Restore Monitor Stack command:
$ hydra investigate show-monitor 59828a1f-6c16-47cd-a63b-9ceb43a0c844
- Restore monitor on AWS instance using Jenkins job
- Show all stored logs command:
$ hydra investigate show-logs 59828a1f-6c16-47cd-a63b-9ceb43a0c844
Logs:
- db-cluster-59828a1f.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/59828a1f-6c16-47cd-a63b-9ceb43a0c844/20230327_232636/db-cluster-59828a1f.tar.gz
- sct-runner-events-59828a1f.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/59828a1f-6c16-47cd-a63b-9ceb43a0c844/20230327_232636/sct-runner-events-59828a1f.tar.gz
- sct-59828a1f.log.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/59828a1f-6c16-47cd-a63b-9ceb43a0c844/20230327_232636/sct-59828a1f.log.tar.gz
- monitor-set-59828a1f.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/59828a1f-6c16-47cd-a63b-9ceb43a0c844/20230327_232636/monitor-set-59828a1f.tar.gz
- loader-set-59828a1f.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/59828a1f-6c16-47cd-a63b-9ceb43a0c844/20230327_232636/loader-set-59828a1f.tar.gz
- parallel-timelines-report-59828a1f.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/59828a1f-6c16-47cd-a63b-9ceb43a0c844/20230327_232636/parallel-timelines-report-59828a1f.tar.gz
This issue is stale because it has been open 2 years with no activity. Remove stale label or comment or this will be closed in 2 days.
This issue was closed because it has been stalled for 2 days with no activity.