sarama
BalanceStrategySticky panic with concurrent map writes error
Versions
Please specify real version numbers or git SHAs, not just "Latest" since that changes fairly regularly.
| Sarama | Kafka | Go |
|---|---|---|
| v1.37.2 | 2.2.0 | 1.18 |
Configuration
s.config = sarama.NewConfig()
s.config.Version = sarama.V2_2_0_0
s.config.ChannelBufferSize = 10000
s.config.ClientID = "server"
if s.tlsConfig != nil {
	s.config.Net.TLS.Enable = true
	s.config.Net.TLS.Config = s.tlsConfig
}
s.config.Net.SASL.User = s.username
s.config.Net.SASL.Password = s.password
s.config.Metadata.RefreshFrequency = time.Second * 30
s.config.Consumer.Retry.BackoffFunc = func(retries int) time.Duration {
	if retries >= 3 {
		return time.Second * 3
	}
	return time.Second
}
s.config.Consumer.Fetch.Default = 1024 * 1024
s.config.Consumer.MaxWaitTime = 100 * time.Millisecond
s.config.Consumer.Return.Errors = false
s.config.Consumer.Group.Rebalance.GroupStrategies = []sarama.BalanceStrategy{sarama.BalanceStrategySticky}
s.config.Consumer.Offsets.AutoCommit.Enable = s.autoCommit
s.config.Consumer.Offsets.Retention = 24 * time.Hour
s.config.Consumer.Group.Session.Timeout = 15 * time.Second
s.config.Consumer.Group.Heartbeat.Interval = 5 * time.Second
Logs
When filing an issue please provide logs from Sarama and Kafka if at all possible. You can set sarama.Logger to a log.Logger to capture Sarama debug output.
logs:
fatal error: concurrent map writes
goroutine 291 [running]:
runtime.throw({0x36fb26e?, 0xc000000d28?})
/src/runtime/panic.go:992 +0x76 fp=0xc0025d37f0 sp=0xc0025d37c0 pc=0xb80576
runtime.mapassign(0x3368c20, 0xc002ea8f60, 0xc002d418a0?)
/src/runtime/map.go:686 +0x5ab fp=0xc0025d3888 sp=0xc0025d37f0 pc=0xb50f6b
github.com/Shopify/sarama.(*partitionMovements).addPartitionMovementRecord(0x4f8e410, {{0xc002d418e0, 0x1d}, 0x2}, {{0xc0042973e0, 0x2b}, {0xc0042973b0, 0x2b}})
/go/pkg/mod/github.com/!shopify/sarama@v1.37.2/balance_strategy.go:922 +0xba fp=0xc0025d3958 sp=0xc0025d3888 pc=0x16942fa
github.com/Shopify/sarama.(*partitionMovements).movePartition(0x4f8e410, {{0xc002d418e0, 0x1d}, 0x2}, {0xc0042973e0, 0x2b}, {0xc0042973b0, 0x2b})
/go/pkg/mod/github.com/!shopify/sarama@v1.37.2/balance_strategy.go:952 +0x3ad fp=0xc0025d3a80 sp=0xc0025d3958 pc=0x1694a0d
github.com/Shopify/sarama.(*stickyBalanceStrategy).processPartitionMovement(0x4f8e410, {{0xc002d418e0, 0x1d}, 0x2}, {0xc0042973b0, 0x2b}, 0xc002d7c540, {0xc002d35d00, 0x2, 0x2}, ...)
/go/pkg/mod/github.com/!shopify/sarama@v1.37.2/balance_strategy.go:564 +0x11b fp=0xc0025d3b80 sp=0xc0025d3a80 pc=0x168f19b
github.com/Shopify/sarama.(*stickyBalanceStrategy).reassignPartition(0x4f8e410, {{0xc002d418e0, 0x1d}, 0x2}, 0xc002d7c540, {0xc002d35d00, 0x2, 0x2}, 0xc0025d4b60, {0xc0042973b0, ...})
/go/pkg/mod/github.com/!shopify/sarama@v1.37.2/balance_strategy.go:558 +0x191 fp=0xc0025d3c60 sp=0xc0025d3b80 pc=0x168efd1
github.com/Shopify/sarama.(*stickyBalanceStrategy).reassignPartitionToNewConsumer(0x4f8e410, {{0xc002d418e0, 0x1d}, 0x2}, 0xc002d7c540, {0xc002d35d00, 0x2, 0x2}, 0xc0025d4b60, 0xc0025d4b90)
/go/pkg/mod/github.com/!shopify/sarama@v1.37.2/balance_strategy.go:547 +0x217 fp=0xc0025d3d58 sp=0xc0025d3c60 pc=0x168ed57
github.com/Shopify/sarama.(*stickyBalanceStrategy).performReassignments(0x4f8e410, {0xc003cdf260, 0x3, 0x4}, 0xc002d7c540, 0xc002d7c570, {0xc002d35d00, 0x2, 0x2}, 0xc0025d4b90, ...)
/go/pkg/mod/github.com/!shopify/sarama@v1.37.2/balance_strategy.go:530 +0x825 fp=0xc0025d3f08 sp=0xc0025d3d58 pc=0x168ea65
github.com/Shopify/sarama.(*stickyBalanceStrategy).balance(0x4f8e410, 0xc002d7c540, 0xc002d7c570, {0xc003cdf260, 0x3, 0x4}, {0x0, 0x0, 0x0}, {0xc002d35d00, ...}, ...)
/go/pkg/mod/github.com/!shopify/sarama@v1.37.2/balance_strategy.go:322 +0x77d fp=0xc0025d46b0 sp=0xc0025d3f08 pc=0x168d03d
github.com/Shopify/sarama.(*stickyBalanceStrategy).Plan(0x4f8e410, 0xc002d7c480, 0xc002d7c4b0)
/go/pkg/mod/github.com/!shopify/sarama@v1.37.2/balance_strategy.go:248 +0xc14 fp=0xc0025d5338 sp=0xc0025d46b0 pc=0x168b794
github.com/Shopify/sarama.(*consumerGroup).balance(0xc0000b6840, {0x3db0870, 0x4f8e410}, 0xc002d7c480)
/go/pkg/mod/github.com/!shopify/sarama@v1.37.2/consumer_group.go:540 +0x39c fp=0xc0025d5578 sp=0xc0025d5338 pc=0x16c58fc
github.com/Shopify/sarama.(*consumerGroup).newSession(0xc0000b6840, {0x3db8e38, 0xc00049ba00}, {0xc002e3fad0, 0x1, 0x1}, {0x3db0480, 0xc000144698}, 0x4)
/go/pkg/mod/github.com/!shopify/sarama@v1.37.2/consumer_group.go:351 +0xb74 fp=0xc0025d5aa8 sp=0xc0025d5578 pc=0x16c3014
github.com/Shopify/sarama.(*consumerGroup).Consume(0xc0000b6840, {0x3db8e38, 0xc00049ba00}, {0xc002e3fad0, 0x1, 0x1}, {0x3db0480, 0xc000144698})
/go/pkg/mod/github.com/!shopify/sarama@v1.37.2/consumer_group.go:206 +0x2d3 fp=0xc0025d5c18 sp=0xc0025d5aa8 pc=0x16c1ab3
Problem Description
@insulaire are you configuring multiple consumers within the same Go process using the same config? Looking at the code, sarama.BalanceStrategySticky is a global singleton, so it would end up incorrectly sharing its movements partitionMovements state between the separate clients.
It's also possible that concurrent access occurred even with a single client, but I thought I'd ask to verify up front.
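To illustrate the sharing described above, here is a minimal sketch that uses only the standard library. The `stickyPlanner` type, its `movements` map, `newStickyPlanner`, `globalPlanner`, and `config` are all hypothetical stand-ins, not Sarama's actual types. The only point is that a package-level pointer placed into a config gives every consumer group built from it the same mutable map, whereas a constructor gives each one its own.

```go
package main

import "fmt"

// stickyPlanner is a hypothetical stand-in for a balance strategy that keeps
// per-rebalance bookkeeping in a plain map (not safe for concurrent use).
type stickyPlanner struct {
	movements map[string]string // partition -> consumer it moved to
}

// newStickyPlanner returns a fresh planner with its own map.
func newStickyPlanner() *stickyPlanner {
	return &stickyPlanner{movements: make(map[string]string)}
}

// record notes a partition movement. It must not be called concurrently.
func (p *stickyPlanner) record(partition, consumer string) {
	p.movements[partition] = consumer
}

// globalPlanner mimics a package-level singleton strategy value.
var globalPlanner = newStickyPlanner()

// config is a hypothetical stand-in for a client config holding a strategy.
type config struct {
	strategy *stickyPlanner
}

func main() {
	// Two consumer groups configured with the singleton share one map.
	a := config{strategy: globalPlanner}
	b := config{strategy: globalPlanner}
	a.strategy.record("topic/0", "consumer-a")
	fmt.Println("entries visible through b:", len(b.strategy.movements))

	// Per-group instances keep their bookkeeping separate.
	c := config{strategy: newStickyPlanner()}
	d := config{strategy: newStickyPlanner()}
	c.strategy.record("topic/0", "consumer-c")
	fmt.Println("entries visible through d:", len(d.strategy.movements))
}
```

If the two groups in the first half rebalanced at the same time on separate goroutines, both would write to the same map, which is the kind of access the Go runtime reports as `concurrent map writes`.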
Yes, using the same configuration item for multiple consumers.
@dnwe This is the log before the exception:
{"level":"INFO","time":"2023-01-18T15:13:19.597+0800","message":"client/metadata fetching metadata for all topics from broker 127.0.0.1:9092"}
{"level":"INFO","time":"2023-01-18T15:13:19.597+0800","message":"client/metadata fetching metadata for all topics from broker 127.0.0.1:9092"}
{"level":"INFO","time":"2023-01-18T15:13:19.597+0800","message":"client/metadata fetching metadata for all topics from broker 127.0.0.1:9092"}
{"level":"INFO","time":"2023-01-18T15:13:19.602+0800","message":"client/metadata fetching metadata for all topics from broker 127.0.0.1:9092"}
{"level":"INFO","time":"2023-01-18T15:13:29.767+0800","message":"consumergroup/MAKER-ORDER-EVENT-REQUEST-MT5 loop check partition number coroutine will exit, topics [MAKER-ORDER-EVENT-REQUEST-MT5]"}
{"level":"INFO","time":"2023-01-18T15:13:29.783+0800","message":"consumer/broker/1 closed dead subscription to MAKER-ORDER-EVENT-REQUEST-MT5/0"}
{"level":"INFO","time":"2023-01-18T15:13:29.784+0800","message":"consumergroup/MAKER-ORDER-EVENT-REQUEST-MT5-DELAY loop check partition number coroutine will exit, topics [MAKER-ORDER-EVENT-REQUEST-MT5-DELAY]"}
{"level":"INFO","time":"2023-01-18T15:13:29.799+0800","message":"consumer/broker/3 closed dead subscription to MAKER-ORDER-EVENT-REQUEST-MT5-DELAY/0"}
{"level":"INFO","time":"2023-01-18T15:13:29.822+0800","message":"consumer/broker/2 closed dead subscription to MAKER-ORDER-EVENT-REQUEST-MT5-DELAY/2"}
{"level":"INFO","time":"2023-01-18T15:13:29.873+0800","message":"consumer/broker/3 closed dead subscription to MAKER-ORDER-EVENT-REQUEST-MT5/2"}
{"level":"INFO","time":"2023-01-18T15:13:29.889+0800","message":"consumer/broker/2 closed dead subscription to MAKER-ORDER-EVENT-REQUEST-MT5/1"}
{"level":"INFO","time":"2023-01-18T15:13:29.889+0800","message":"consumergroup/session/server-1b613e0c-c02b-4d99-bf6e-c5517334046f/49 heartbeat loop stopped"}
{"level":"INFO","time":"2023-01-18T15:13:29.889+0800","message":"consumergroup/session/server-1b613e0c-c02b-4d99-bf6e-c5517334046f/49 released"}
{"level":"INFO","time":"2023-01-18T15:13:29.909+0800","message":"consumer/broker/1 closed dead subscription to MAKER-ORDER-EVENT-REQUEST-MT5-DELAY/1"}
{"level":"INFO","time":"2023-01-18T15:13:29.910+0800","message":"consumergroup/session/server-f6de5133-1571-4d29-8eed-9ec4261568f7/1 heartbeat loop stopped"}
{"level":"INFO","time":"2023-01-18T15:13:29.910+0800","message":"consumergroup/session/server-f6de5133-1571-4d29-8eed-9ec4261568f7/1 released"}
{"level":"INFO","time":"2023-01-18T15:13:30.894+0800","message":"client/metadata fetching metadata for [MAKER-ORDER-EVENT-REQUEST-MT5] from broker 127.0.0.1:9092"}
{"level":"INFO","time":"2023-01-18T15:13:30.916+0800","message":"client/metadata fetching metadata for [MAKER-ORDER-EVENT-REQUEST-MT5-DELAY] from broker 127.0.0.1:9092"}
We've been bitten by this pretty frequently lately as well. Just to make it clearer: is this a bug, or are we not supposed to share the config between different consumer instances?
@fabiomargarido I think at the moment you cannot share this global singleton across multiple consumer groups, and there also isn’t currently a way of initialising a separate instance of the strategy for each of your consumer groups.
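One possible stopgap, sketched here under stated assumptions, is to wrap the shared strategy so that planning calls are serialized by a mutex. The `planner` interface, `lockedPlanner`, `countingPlanner`, and `runConcurrently` are hypothetical and deliberately simplified. Sarama's real `BalanceStrategy` interface has more methods, and whether serializing `Plan` alone is enough there is an assumption drawn from the stack trace, not something confirmed in this thread.

```go
package main

import (
	"fmt"
	"sync"
)

// planner is a hypothetical, simplified strategy interface.
type planner interface {
	Plan(member string, partitions []int32) map[int32]string
}

// countingPlanner is a toy planner that writes to an unguarded map on every
// call, similar in spirit to the bookkeeping seen in the stack trace.
type countingPlanner struct {
	movements map[int32]string
}

func (p *countingPlanner) Plan(member string, partitions []int32) map[int32]string {
	out := make(map[int32]string, len(partitions))
	for _, id := range partitions {
		p.movements[id] = member // unsafe if called concurrently
		out[id] = member
	}
	return out
}

// lockedPlanner serializes calls to the wrapped planner with a mutex.
type lockedPlanner struct {
	mu    sync.Mutex
	inner planner
}

func (l *lockedPlanner) Plan(member string, partitions []int32) map[int32]string {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.inner.Plan(member, partitions)
}

// runConcurrently calls Plan from n goroutines and returns how many
// complete plans were produced.
func runConcurrently(p planner, n int) int {
	var wg sync.WaitGroup
	results := make([]map[int32]string, n)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = p.Plan(fmt.Sprintf("member-%d", i), []int32{0, 1, 2})
		}(i)
	}
	wg.Wait()
	done := 0
	for _, r := range results {
		if len(r) == 3 {
			done++
		}
	}
	return done
}

func main() {
	shared := &lockedPlanner{inner: &countingPlanner{movements: map[int32]string{}}}
	fmt.Println("plans completed:", runConcurrently(shared, 8))
}
```

For this to help, every consumer group has to go through the same `lockedPlanner` value. Wrapping the singleton separately for each group would give each wrapper its own mutex and protect nothing.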