gemini crashes with: "fatal error: non in-use span found with specials bit set"
During a 1 TB test, gemini crashed with the following error and backtrace:
fatal error: non in-use span found with specials bit set
Full backtrace
```
s.state = 0
fatal error: non in-use span found with specials bit set

runtime stack:
runtime.throw({0xb11c78?, 0x8?})
        /usr/local/go/src/runtime/panic.go:992 +0x71
runtime.markrootSpans(0xb3d5f8?, 0xc000000001?)
        /usr/local/go/src/runtime/mgcmark.go:368 +0x2ed
runtime.markroot(0xc00002d738, 0xa3b, 0x1)
        /usr/local/go/src/runtime/mgcmark.go:193 +0xf7
runtime.gcDrain(0xc00002d738, 0x2)
        /usr/local/go/src/runtime/mgcmark.go:1047 +0x39f
runtime.gcBgMarkWorker.func2()
        /usr/local/go/src/runtime/mgc.go:1291 +0x154
runtime.systemstack()
        /usr/local/go/src/runtime/asm_amd64.s:469 +0x49
```
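For context, the throw comes from a sanity check in `runtime.markrootSpans` (the `mgcmark.go:368` frame above): while scanning spans that carry "specials" (finalizer and memory-profile records), the GC found a span whose specials bit is set although its state is 0, i.e. `mSpanDead`. Below is a minimal, paraphrased sketch of that check with stubbed types, only to show what the `s.state = 0` line in the log refers to; it is not the actual runtime code, and the reading that this points at corrupted GC metadata rather than ordinary application code is an assumption.

```go
// Paraphrased sketch of the sanity check in runtime.markrootSpans (Go 1.18,
// runtime/mgcmark.go). Types and constants are stubbed so the shape of the
// check is visible outside the runtime; the values mirror the real mSpanState enum.
package main

import "fmt"

type mSpanState uint8

const (
    mSpanDead  mSpanState = iota // 0 -- what the "s.state = 0" print reports
    mSpanInUse                   // 1 -- the only valid state for a span with specials
)

type mspan struct{ state mSpanState }

// checkSpanWithSpecials mirrors the check that threw the fatal error: any span
// whose specials bit is set in the arena bitmap must still be in use; finding
// a dead (already freed) span there means the GC's own bookkeeping is inconsistent.
func checkSpanWithSpecials(s *mspan) {
    if s.state != mSpanInUse {
        fmt.Println("s.state =", s.state)
        panic("non in-use span found with specials bit set")
    }
}

func main() {
    checkSpanWithSpecials(&mspan{state: mSpanDead}) // reproduces the message shape
}
```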
```
goroutine 377 [GC worker (idle)]:
runtime.systemstack_switch()
        /usr/local/go/src/runtime/asm_amd64.s:436 fp=0xc00041e758 sp=0xc00041e750 pc=0x4669a0
runtime.gcBgMarkWorker()
        /usr/local/go/src/runtime/mgc.go:1263 +0x1b1 fp=0xc00041e7e0 sp=0xc00041e758 pc=0x41bcb1
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc00041e7e8 sp=0xc00041e7e0 pc=0x468a81
created by runtime.gcBgMarkStartWorkers
        /usr/local/go/src/runtime/mgc.go:1131 +0x25

goroutine 1 [semacquire, 175 minutes]:
sync.runtime_Semacquire(0xc0000021a0?)
        /usr/local/go/src/runtime/sema.go:56 +0x25
sync.(*WaitGroup).Wait(0xc0002dbd98?)
        /usr/local/go/src/sync/waitgroup.go:136 +0x52
golang.org/x/sync/errgroup.(*Group).Wait(0xc000104f40)
        /home/ls/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:40 +0x27
main.run(0xf5a8a0?, {0xafa1b9?, 0x23?, 0x23?})
        /home/ls/repos/gemini_upstream/cmd/gemini/root.go:308 +0x17aa
github.com/spf13/cobra.(*Command).execute(0xf5a8a0, {0xc00001e250, 0x23, 0x23})
        /home/ls/go/pkg/mod/github.com/spf13/[email protected]/command.go:762 +0x67c
github.com/spf13/cobra.(*Command).ExecuteC(0xf5a8a0)
        /home/ls/go/pkg/mod/github.com/spf13/[email protected]/command.go:852 +0x2dc
github.com/spf13/cobra.(*Command).Execute(...)
        /home/ls/go/pkg/mod/github.com/spf13/[email protected]/command.go:800
main.main()
        /home/ls/repos/gemini_upstream/cmd/gemini/main.go:29 +0x25

goroutine 22 [select, 295 minutes]:
github.com/gocql/gocql.(*eventDebouncer).flusher(0xc0001967d0)
        /home/ls/go/pkg/mod/github.com/scylladb/[email protected]/events.go:39 +0x65
created by github.com/gocql/gocql.newEventDebouncer
        /home/ls/go/pkg/mod/github.com/scylladb/[email protected]/events.go:27 +0x11a

goroutine 206 [select]:
github.com/gocql/gocql.(*writeCoalescer).writeFlusher(0xc00036d6e0, 0xc000398080?)
        /home/ls/go/pkg/mod/github.com/scylladb/[email protected]/conn.go:834 +0x138
created by github.com/gocql/gocql.newWriteCoalescer
        /home/ls/go/pkg/mod/github.com/scylladb/[email protected]/conn.go:739 +0x173

goroutine 20 [IO wait, 295 minutes]:
internal/poll.runtime_pollWait(0x7ff130115fc0, 0x72)
        /usr/local/go/src/runtime/netpoll.go:302 +0x89
internal/poll.(*pollDesc).wait(0xc0000aa500?, 0xc00002c500?, 0x0)
        /usr/local/go/src/internal/poll/fd_poll_runtime.go:83 +0x32
internal/poll.(*pollDesc).waitRead(...)
        /usr/local/go/src/internal/poll/fd_poll_runtime.go:88
internal/poll.(*FD).Accept(0xc0000aa500)
        /usr/local/go/src/internal/poll/fd_unix.go:614 +0x22c
net.(*netFD).accept(0xc0000aa500)
        /usr/local/go/src/net/fd_unix.go:172 +0x35
net.(*TCPListener).accept(0xc0000c84c8)
        /usr/local/go/src/net/tcpsock_posix.go:139 +0x28
net.(*TCPListener).Accept(0xc0000c84c8)
        /usr/local/go/src/net/tcpsock.go:288 +0x3d
net/http.(*Server).Serve(0xc000204000, {0xbe8580, 0xc0000c84c8})
        /usr/local/go/src/net/http/server.go:3039 +0x385
net/http.(*Server).ListenAndServe(0xc000204000)
        /usr/local/go/src/net/http/server.go:2968 +0x7d
net/http.ListenAndServe(...)
        /usr/local/go/src/net/http/server.go:3222
main.run.func1()
        /home/ls/repos/gemini_upstream/cmd/gemini/root.go:174 +0x99
created by main.run
        /home/ls/repos/gemini_upstream/cmd/gemini/root.go:172 +0x285
```
... the rest of the goroutine dump can be found in the loader-set logs.
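For orientation when reading the dump above: goroutine 1 is `main.run` blocked in `errgroup.(*Group).Wait`, and goroutine 20 is the HTTP endpoint started at `root.go:172`. The sketch below is a hypothetical reconstruction of that pattern, not gemini's actual code; the port, handler, and worker body are illustrative assumptions.

```go
// Hypothetical sketch of the structure the trace suggests: an HTTP endpoint
// started in its own goroutine while the main goroutine waits on an errgroup.
package main

import (
    "log"
    "net/http"

    "golang.org/x/sync/errgroup"
)

func run() error {
    // Corresponds to main.run.func1 / goroutine 20 in the dump: blocked in
    // Accept inside net/http.ListenAndServe.
    go func() {
        if err := http.ListenAndServe(":2112", nil); err != nil {
            log.Println("http endpoint stopped:", err)
        }
    }()

    g := new(errgroup.Group)
    g.Go(func() error {
        // ... generator/worker goroutines would run the actual load here ...
        return nil
    })

    // Corresponds to goroutine 1: blocked in (*errgroup.Group).Wait for
    // 175 minutes before the runtime aborted the process. A fatal runtime
    // error is not a panic, so nothing here can recover from it.
    return g.Wait()
}

func main() {
    if err := run(); err != nil {
        log.Fatal(err)
    }
}
```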
Installation details
Kernel Version: 5.15.0-1026-aws
Scylla version (or git commit hash): 5.2.0~dev-20221201.c5121cf27365
with build-id d25d298da80af2e947d1728190d8f4f26978e44a
Cluster size: 3 nodes (i3.4xlarge)
Scylla Nodes used in this run:
- gemini-1tb-10h-master-oracle-db-node-3ef5ed88-1 (18.207.116.245 | 10.12.11.149) (shards: 14)
- gemini-1tb-10h-master-db-node-3ef5ed88-3 (34.224.101.215 | 10.12.9.216) (shards: 14)
- gemini-1tb-10h-master-db-node-3ef5ed88-2 (3.85.37.40 | 10.12.9.20) (shards: 14)
- gemini-1tb-10h-master-db-node-3ef5ed88-1 (3.94.159.101 | 10.12.11.226) (shards: 14)
OS / Image: ami-058aaa3c56c451303
(aws: us-east-1)
Test: gemini-1tb-10h
Test id: 3ef5ed88-fc39-4489-b6d5-f762b38d65b1
Test name: scylla-master/gemini-/gemini-1tb-10h
Test config file(s):
-
Restore Monitor Stack command:
$ hydra investigate show-monitor 3ef5ed88-fc39-4489-b6d5-f762b38d65b1
-
Restore monitor on AWS instance using Jenkins job
-
Show all stored logs command:
$ hydra investigate show-logs 3ef5ed88-fc39-4489-b6d5-f762b38d65b1
Logs:
- db-cluster-3ef5ed88.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/3ef5ed88-fc39-4489-b6d5-f762b38d65b1/20221202_102421/db-cluster-3ef5ed88.tar.gz
- monitor-set-3ef5ed88.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/3ef5ed88-fc39-4489-b6d5-f762b38d65b1/20221202_102421/monitor-set-3ef5ed88.tar.gz
- loader-set-3ef5ed88.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/3ef5ed88-fc39-4489-b6d5-f762b38d65b1/20221202_102421/loader-set-3ef5ed88.tar.gz
- sct-runner-3ef5ed88.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/3ef5ed88-fc39-4489-b6d5-f762b38d65b1/20221202_102421/sct-runner-3ef5ed88.tar.gz