sarama
sarama copied to clipboard
Sarama has stuck on the batch with zero records length.
Versions
Please specify real version numbers or git SHAs, not just "Latest" since that changes fairly regularly.
Sarama | Kafka | Go |
---|---|---|
Version 1.29.1 | AWS Managed Kafka | 1.17 |
Configuration
What configuration values are you using for Sarama and Kafka?
cfg.Net.MaxOpenRequests = 1
// consume all messages
cfg.Consumer.IsolationLevel = sarama.ReadUncommitted
// require producer idempotency
cfg.Producer.Idempotent = true
// require consistency on Kafka side before considering the message produced
cfg.Producer.RequiredAcks = sarama.WaitForAll
// specify network keep-alive to reduce 'broken pipe' errors on non-KMS environments
cfg.Net.KeepAlive = time.Second * 30
// increase processing time from 1s (default) to 5s
cfg.Consumer.MaxProcessingTime = time.Second * 5
Logs
When filing an issue please provide logs from Sarama and Kafka if at all
possible. You can set sarama.Logger
to a log.Logger
to capture Sarama debug
output.
logs: CLICK ME
2021/10/22 10:38:26.384689 lvl=INF component=kafka records batch &{5805968 49 2 snappy 0 false false 0 2021-09-30 11:45:34.617 +0200 CEST 2021-09-30 11:45:34.617 +0200 CEST 82609 0 1563 [0xc00030e150] false false [] 1326}
2021/10/22 10:38:26.384705 lvl=INF component=kafka records batch &{5806083 49 2 snappy 0 false false 0 2021-09-30 12:00:01.188 +0200 CEST 2021-09-30 12:00:01.188 +0200 CEST 82609 0 1678 [0xc00030e1c0] false false [] 733}
2021/10/22 10:38:26.384728 lvl=INF component=kafka records batch &{5806313 49 2 snappy 0 false false 0 2021-09-30 12:29:55.542 +0200 CEST 2021-09-30 12:29:55.542 +0200 CEST 82609 0 1908 [0xc00030e230] false false [] 10546}
2021/10/22 10:38:26.384751 lvl=INF component=kafka records batch &{5806399 49 2 snappy 0 false false 0 2021-09-30 12:42:07.755 +0200 CEST 2021-09-30 12:42:07.755 +0200 CEST 82609 0 1994 [0xc00030e2a0] false false [] 1130}
2021/10/22 10:38:26.384772 lvl=INF component=kafka records batch &{5806689 49 2 snappy 0 false false 0 2021-09-30 13:25:25.119 +0200 CEST 2021-09-30 13:25:25.119 +0200 CEST 82609 0 2284 [0xc00030e380] false false [] 2934}
2021/10/22 10:38:26.384790 lvl=INF component=kafka records batch &{5807057 49 2 snappy 0 false false 0 2021-09-30 14:24:12.229 +0200 CEST 2021-09-30 14:24:12.229 +0200 CEST 82609 0 2652 [0xc00030e3f0] false false [] 3124}
2021/10/22 10:38:26.384807 lvl=INF component=kafka records batch &{5807106 49 2 snappy 0 false false 0 2021-09-30 14:31:33.337 +0200 CEST 2021-09-30 14:31:33.337 +0200 CEST 82609 0 2701 [0xc00030e460] false false [] 2335}
2021/10/22 10:38:26.384824 lvl=INF component=kafka records batch &{5807150 49 2 snappy 0 false false 0 2021-09-30 14:38:49.18 +0200 CEST 2021-09-30 14:38:49.18 +0200 CEST 82609 0 2745 [0xc00030e4d0] false false [] 2917}
2021/10/22 10:38:26.384858 lvl=INF component=kafka records batch &{5807374 49 2 snappy 0 false false 0 2021-09-30 15:12:07.587 +0200 CEST 2021-09-30 15:12:07.587 +0200 CEST 82609 0 2969 [0xc00030e540] false false [] 5744}
2021/10/22 10:38:26.384933 lvl=INF component=kafka records batch &{5807388 49 2 snappy 0 false false 0 2021-09-30 15:14:10.358 +0200 CEST 2021-09-30 15:14:10.358 +0200 CEST 82609 0 2983 [0xc00030e5b0] false false [] 31539}
2021/10/22 10:38:26.384958 lvl=INF component=kafka records batch &{5807391 49 2 snappy 0 false false 0 2021-09-30 15:14:44.664 +0200 CEST 2021-09-30 15:14:44.664 +0200 CEST 82609 0 2986 [0xc00030e620] false false [] 2135}
2021/10/22 10:38:26.385027 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.418552 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.450115 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.481958 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.522805 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.557499 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.593690 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.625917 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.657295 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.775601 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.806612 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.840360 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.871158 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
Problem Description
I'm using Goka (Sarama based library) and have some problems with reading from the one partition of 4. The consumer has stuck on the one particular batch and rereads it again and again.
Here is some info: This is batch struct:
type RecordBatch struct {
FirstOffset int64
PartitionLeaderEpoch int32
Version int8
Codec CompressionCodec
CompressionLevel int
Control bool
LogAppendTime bool
LastOffsetDelta int32
FirstTimestamp time.Time
MaxTimestamp time.Time
ProducerID int64
ProducerEpoch int16
FirstSequence int32
Records []*Record
PartialTrailingRecord bool
IsTransactional bool
compressedRecords []byte
recordsLen int // uncompressed records size
}
I added some logs for all decoded batches and firstly I see this:
2021/10/22 10:38:26.384671 lvl=INF component=kafka records batch &{5805856 49 2 snappy 0 false false 0 2021-09-30 11:29:46.662 +0200 CEST 2021-09-30 11:29:46.662 +0200 CEST 82609 0 1451 [0xc00030e0e0] false false [] 1735}
2021/10/22 10:38:26.384689 lvl=INF component=kafka records batch &{5805968 49 2 snappy 0 false false 0 2021-09-30 11:45:34.617 +0200 CEST 2021-09-30 11:45:34.617 +0200 CEST 82609 0 1563 [0xc00030e150] false false [] 1326}
2021/10/22 10:38:26.384705 lvl=INF component=kafka records batch &{5806083 49 2 snappy 0 false false 0 2021-09-30 12:00:01.188 +0200 CEST 2021-09-30 12:00:01.188 +0200 CEST 82609 0 1678 [0xc00030e1c0] false false [] 733}
2021/10/22 10:38:26.384728 lvl=INF component=kafka records batch &{5806313 49 2 snappy 0 false false 0 2021-09-30 12:29:55.542 +0200 CEST 2021-09-30 12:29:55.542 +0200 CEST 82609 0 1908 [0xc00030e230] false false [] 10546}
2021/10/22 10:38:26.384751 lvl=INF component=kafka records batch &{5806399 49 2 snappy 0 false false 0 2021-09-30 12:42:07.755 +0200 CEST 2021-09-30 12:42:07.755 +0200 CEST 82609 0 1994 [0xc00030e2a0] false false [] 1130}
2021/10/22 10:38:26.384772 lvl=INF component=kafka records batch &{5806689 49 2 snappy 0 false false 0 2021-09-30 13:25:25.119 +0200 CEST 2021-09-30 13:25:25.119 +0200 CEST 82609 0 2284 [0xc00030e380] false false [] 2934}
2021/10/22 10:38:26.384790 lvl=INF component=kafka records batch &{5807057 49 2 snappy 0 false false 0 2021-09-30 14:24:12.229 +0200 CEST 2021-09-30 14:24:12.229 +0200 CEST 82609 0 2652 [0xc00030e3f0] false false [] 3124}
2021/10/22 10:38:26.384807 lvl=INF component=kafka records batch &{5807106 49 2 snappy 0 false false 0 2021-09-30 14:31:33.337 +0200 CEST 2021-09-30 14:31:33.337 +0200 CEST 82609 0 2701 [0xc00030e460] false false [] 2335}
2021/10/22 10:38:26.384824 lvl=INF component=kafka records batch &{5807150 49 2 snappy 0 false false 0 2021-09-30 14:38:49.18 +0200 CEST 2021-09-30 14:38:49.18 +0200 CEST 82609 0 2745 [0xc00030e4d0] false false [] 2917}
2021/10/22 10:38:26.384858 lvl=INF component=kafka records batch &{5807374 49 2 snappy 0 false false 0 2021-09-30 15:12:07.587 +0200 CEST 2021-09-30 15:12:07.587 +0200 CEST 82609 0 2969 [0xc00030e540] false false [] 5744}
2021/10/22 10:38:26.384933 lvl=INF component=kafka records batch &{5807388 49 2 snappy 0 false false 0 2021-09-30 15:14:10.358 +0200 CEST 2021-09-30 15:14:10.358 +0200 CEST 82609 0 2983 [0xc00030e5b0] false false [] 31539}
2021/10/22 10:38:26.384958 lvl=INF component=kafka records batch &{5807391 49 2 snappy 0 false false 0 2021-09-30 15:14:44.664 +0200 CEST 2021-09-30 15:14:44.664 +0200 CEST 82609 0 2986 [0xc00030e620] false false [] 2135}
Looks ok. But then I see:
2021/10/22 10:38:26.385027 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.418552 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.450115 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.481958 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.522805 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.557499 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.593690 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.625917 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.657295 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.775601 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.806612 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.840360 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.871158 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.901792 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.934249 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
and it keeps going indefinitely. I dug into the code and figured out that there is no exit from such situation when the records len is 0. Sarama just reads this batch again and again. A high watermark was not reached. We use AWS Managed Kafka in this case. Please help. We are firing.
Hello! I just hit the same error as you, but your solution doesn't seem to work as the app is stuck in a loop and the offset isn't changing... Any idea how to fix this? :)
2021-12-31T10:32:39.030 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.031 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.031 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.032 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.033 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.034 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.035 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.036 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.037 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.038 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.039 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.040 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.041 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.042 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.042 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.043 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.044 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.045 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.046 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.046 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.047 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.048 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.049 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.050 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.051 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.051 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.052 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.053 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.054 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.055 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.055 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.056 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.057 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.058 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.059 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.060 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.061 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.062 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.062 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.063 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.064 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.065 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
2021-12-31T10:32:39.066 consumer/broker/1004 received batch with zero records but high watermark was not reached, topic ***, partition 34, offset 15138131
Thank you for taking the time to raise this issue. However, it has not had any activity on it in the past 90 days and will be closed in 30 days if no updates occur. Please check if the main branch has already resolved the issue since it was raised. If you believe the issue is still valid and you would like input from the maintainers then please comment to ask for it to be reviewed.
Thank you for taking the time to raise this issue. However, it has not had any activity on it in the past 90 days and will be closed in 30 days if no updates occur. Please check if the main branch has already resolved the issue since it was raised. If you believe the issue is still valid and you would like input from the maintainers then please comment to ask for it to be reviewed.
I experienced the same issue with Sarama 1.42.1. Apparently we had some issues with one of our Kafka nodes in the clusters that triggered this behaviour in our Sarama consumers. We have Kafka consumers in both Scala and Go, and only the Sarama client got into this endless loop, the Scala client handled the issue just fine. We are not sure what the issue with Kafka itself was yet, but apparently there is an edge case here with Sarama that make it enter into an endless loop with the error above and stops consuming messages.
Thank you for taking the time to raise this issue. However, it has not had any activity on it in the past 90 days and will be closed in 30 days if no updates occur. Please check if the main branch has already resolved the issue since it was raised. If you believe the issue is still valid and you would like input from the maintainers then please comment to ask for it to be reviewed.