
panic: runtime error: index out of range [2] with length 0

Open vponomaryov opened this issue 2 years ago • 2 comments

Issue description

  • [ ] This issue is a regression.
  • [x] It is unknown if this issue is a regression.

Describe your issue in detail and steps it took to produce it.

Impact

It is unclear what a user might be doing wrong or what they could do about it.

How frequently does it reproduce?

First time observed.

Installation details

Kernel Version: 5.15.0-1028-aws

Scylla version (or git commit hash): 2022.2.0-20230112.4f0f82ff2e1d with build-id 6bb071079708f1e48d1faeee6ef3cc3ef93733af

Cluster size: 4 nodes (is4gen.4xlarge)

Scylla Nodes used in this run:

  • longevity-large-partitions-4d-vp-br-db-node-b4fd1150-8 (54.246.26.242 | 10.4.0.15) (shards: 15)
  • longevity-large-partitions-4d-vp-br-db-node-b4fd1150-7 (3.251.80.191 | 10.4.2.60) (shards: 15)
  • longevity-large-partitions-4d-vp-br-db-node-b4fd1150-6 (3.253.32.157 | 10.4.0.237) (shards: 15)
  • longevity-large-partitions-4d-vp-br-db-node-b4fd1150-5 (3.253.42.60 | 10.4.1.171) (shards: 15)
  • longevity-large-partitions-4d-vp-br-db-node-b4fd1150-4 (34.243.215.150 | 10.4.1.50) (shards: 15)
  • longevity-large-partitions-4d-vp-br-db-node-b4fd1150-3 (54.73.239.104 | 10.4.1.3) (shards: 15)
  • longevity-large-partitions-4d-vp-br-db-node-b4fd1150-2 (52.51.73.98 | 10.4.3.254) (shards: 15)
  • longevity-large-partitions-4d-vp-br-db-node-b4fd1150-1 (34.240.149.141 | 10.4.2.223) (shards: 15)

OS / Image: ami-0a018bb5789118823 (aws: eu-west-1)

Test: longevity-large-partition-4days-arm-test

Test id: b4fd1150-5888-4a5c-b35a-6f0e3fdbdad9

Test name: scylla-5.2/longevity/longevity-large-partition-4days-arm-test

Test config file(s):

Details:

While running the large-partitions test, in which we see huge latencies, the following error appeared:

panic: runtime error: index out of range [2] with length 0

goroutine 1 [running]:
github.com/HdrHistogram/hdrhistogram-go.(*Histogram).getCountAtIndex(...)
	/go/pkg/mod/github.com/!hdr!histogram/[email protected]/hdr.go:595
github.com/HdrHistogram/hdrhistogram-go.(*iterator).nextCountAtIdx(0x10000203073?, 0x7ff01ad9d748?)
	/go/pkg/mod/github.com/!hdr!histogram/[email protected]/hdr.go:662 +0xc5
github.com/HdrHistogram/hdrhistogram-go.(*iterator).next(0xc0000ef020)
	/go/pkg/mod/github.com/!hdr!histogram/[email protected]/hdr.go:670 +0x25
github.com/HdrHistogram/hdrhistogram-go.(*rIterator).next(...)
	/go/pkg/mod/github.com/!hdr!histogram/[email protected]/hdr.go:683
github.com/HdrHistogram/hdrhistogram-go.(*Histogram).Merge(0xf0000000e?, 0x4000000000a?)
	/go/pkg/mod/github.com/!hdr!histogram/[email protected]/hdr.go:177 +0x8d
github.com/scylladb/scylla-bench/pkg/results.(*MergedResult).AddResult(0xc1cfb01bc0, {0x0, 0x0, 0x1, 0x7d0, 0x0, {0x0, 0x0, 0x0}, 0xc2f5d83300, ...})
	/go/scylla-bench-0.1.16/pkg/results/merged_result.go:54 +0x1b5
github.com/scylladb/scylla-bench/pkg/results.(*TestResults).GetResultsFromThreadsAndMerge(0xc000162840)
	/go/scylla-bench-0.1.16/pkg/results/thread_results.go:60 +0x89
github.com/scylladb/scylla-bench/pkg/results.(*TestResults).GetTotalResults(0xc000162840)
	/go/scylla-bench-0.1.16/pkg/results/thread_results.go:82 +0xcc
main.main()
	/go/scylla-bench-0.1.16/main.go:640 +0x3b1e

It probably appears when the load on the loader is too high, as in the following bug: https://github.com/scylladb/scylla-bench/issues/121
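For context, hdrhistogram-go histograms are not synchronized internally, so one possible explanation for the iterator seeing a counts slice of length 0 during Merge is a data race between a worker goroutine that is still recording latencies and the reporter merging the same histogram. Whether that is what happened here would need confirmation from the scylla-bench code and loader logs; the panic only shows the histogram handed to Merge in an inconsistent state. A minimal sketch of that unsafe pattern (hypothetical, not scylla-bench code; bounds and loop counts are made up for illustration):

// Hypothetical sketch, not scylla-bench code: hdrhistogram-go histograms are
// not safe for concurrent use, so recording values in one goroutine while
// another goroutine merges the same histogram is a data race and can expose
// inconsistent internal state to the merge iterator.
package main

import (
	"math/rand"
	"time"

	"github.com/HdrHistogram/hdrhistogram-go"
)

func main() {
	// Bounds and significant figures are arbitrary, for illustration only.
	perThread := hdrhistogram.New(1, int64(time.Minute), 3)
	total := hdrhistogram.New(1, int64(time.Minute), 3)

	// Worker goroutine keeps recording latencies without any locking.
	go func() {
		for {
			_ = perThread.RecordValue(rand.Int63n(int64(time.Second)))
		}
	}()

	// Reporter merges the live histogram concurrently. `go run -race`
	// flags this; without synchronization the merge may iterate a
	// histogram whose internals are mid-update.
	for i := 0; i < 1000; i++ {
		total.Merge(perThread)
	}
}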

Monitoring:

  • Restore Monitor Stack command: $ hydra investigate show-monitor b4fd1150-5888-4a5c-b35a-6f0e3fdbdad9
  • Restore monitor on AWS instance using Jenkins job
  • Show all stored logs command: $ hydra investigate show-logs b4fd1150-5888-4a5c-b35a-6f0e3fdbdad9

Logs:

  • https://cloudius-jenkins-test.s3.amazonaws.com/b4fd1150-5888-4a5c-b35a-6f0e3fdbdad9/20230322_105913/sct-b4fd1150.log.tar.gz

  • https://cloudius-jenkins-test.s3.amazonaws.com/b4fd1150-5888-4a5c-b35a-6f0e3fdbdad9/20230322_105913/db-cluster-b4fd1150.tar.gz

  • https://cloudius-jenkins-test.s3.amazonaws.com/b4fd1150-5888-4a5c-b35a-6f0e3fdbdad9/20230322_105913/loader-set-b4fd1150.tar.gz

  • https://cloudius-jenkins-test.s3.amazonaws.com/b4fd1150-5888-4a5c-b35a-6f0e3fdbdad9/20230322_105913/monitor-set-b4fd1150.tar.gz

  • https://cloudius-jenkins-test.s3.amazonaws.com/b4fd1150-5888-4a5c-b35a-6f0e3fdbdad9/20230322_105913/summary-b4fd1150.log.tar.gz

  • https://cloudius-jenkins-test.s3.amazonaws.com/b4fd1150-5888-4a5c-b35a-6f0e3fdbdad9/20230322_105913/raw_events-b4fd1150.log.tar.gz

  • https://cloudius-jenkins-test.s3.amazonaws.com/b4fd1150-5888-4a5c-b35a-6f0e3fdbdad9/20230322_105913/events-b4fd1150.log.tar.gz

  • https://cloudius-jenkins-test.s3.amazonaws.com/b4fd1150-5888-4a5c-b35a-6f0e3fdbdad9/20230322_105913/normal-b4fd1150.log.tar.gz

  • https://cloudius-jenkins-test.s3.amazonaws.com/b4fd1150-5888-4a5c-b35a-6f0e3fdbdad9/20230322_105913/debug-b4fd1150.log.tar.gz

  • https://cloudius-jenkins-test.s3.amazonaws.com/b4fd1150-5888-4a5c-b35a-6f0e3fdbdad9/20230322_105913/warning-b4fd1150.log.tar.gz

  • https://cloudius-jenkins-test.s3.amazonaws.com/b4fd1150-5888-4a5c-b35a-6f0e3fdbdad9/20230322_105913/error-b4fd1150.log.tar.gz

  • https://cloudius-jenkins-test.s3.amazonaws.com/b4fd1150-5888-4a5c-b35a-6f0e3fdbdad9/20230322_105913/critical-b4fd1150.log.tar.gz

  • https://cloudius-jenkins-test.s3.amazonaws.com/b4fd1150-5888-4a5c-b35a-6f0e3fdbdad9/20230322_105913/output-b4fd1150.log.tar.gz

  • https://cloudius-jenkins-test.s3.amazonaws.com/b4fd1150-5888-4a5c-b35a-6f0e3fdbdad9/20230322_105913/argus-b4fd1150.log.tar.gz

  • https://cloudius-jenkins-test.s3.amazonaws.com/b4fd1150-5888-4a5c-b35a-6f0e3fdbdad9/20230322_105913/left_processes-b4fd1150.log.tar.gz

  • https://cloudius-jenkins-test.s3.amazonaws.com/b4fd1150-5888-4a5c-b35a-6f0e3fdbdad9/20230322_105913/email_data-b4fd1150.json.tar.gz

Jenkins job URL

vponomaryov (Mar 22 '23 14:03)

It probably appears when the load on the loader is too high, as in the following bug: https://github.com/scylladb/scylla-bench/issues/121

You mean we need a bigger loader? I would rather collect some stats/graphs to show that this is actually the case before picking bigger loader nodes.

fruch (Mar 22 '23 14:03)

It probably appears when the load on the loader is too high, as in the following bug: #121

You mean we need a bigger loader? I would rather collect some stats/graphs to show that this is actually the case before picking bigger loader nodes.

Ideally, the s-b bugs should be fixed. I just described the observation that bigger instance types reduce the probability of such s-b crashes.

vponomaryov (Mar 22 '23 15:03)
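If unsynchronized access to the per-thread histograms turns out to be the cause, the straightforward fix would be to take a lock around both recording and merging. A minimal sketch under that assumption (the lockedHistogram wrapper and all names are hypothetical, not the actual scylla-bench fix):

// Hypothetical mitigation sketch: wrap each per-thread histogram in a mutex
// so the reporter never merges a histogram that a worker is still mutating.
package main

import (
	"sync"

	"github.com/HdrHistogram/hdrhistogram-go"
)

// lockedHistogram guards an hdrhistogram-go histogram with a mutex.
type lockedHistogram struct {
	mu sync.Mutex
	h  *hdrhistogram.Histogram
}

func newLockedHistogram() *lockedHistogram {
	// Bounds: 1 ns to 1 minute in nanoseconds, 3 significant figures.
	return &lockedHistogram{h: hdrhistogram.New(1, 60_000_000_000, 3)}
}

// Record is called by a worker goroutine for every measured latency.
func (l *lockedHistogram) Record(latencyNs int64) {
	l.mu.Lock()
	defer l.mu.Unlock()
	_ = l.h.RecordValue(latencyNs)
}

// MergeInto is called by the reporter; merging happens under the same lock.
func (l *lockedHistogram) MergeInto(total *hdrhistogram.Histogram) {
	l.mu.Lock()
	defer l.mu.Unlock()
	total.Merge(l.h)
}

func main() {
	worker := newLockedHistogram()
	total := hdrhistogram.New(1, 60_000_000_000, 3)

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for i := int64(0); i < 100000; i++ {
			worker.Record(i % 1000000)
		}
	}()

	// Merging while the worker is still running is now safe because both
	// sides take the same mutex.
	worker.MergeInto(total)
	wg.Wait()
}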