librdkafka icon indicating copy to clipboard operation
librdkafka copied to clipboard

unexcepted reset offsets occasionally

Open intfish123 opened this issue 2 years ago • 4 comments

Description

unexcepted reset offsets occasionally

How to reproduce

it's just occasionally.

IMPORTANT: Always try to reproduce the issue on the latest released version (see https://github.com/edenhill/librdkafka/releases), if it can't be reproduced on the latest version the issue has been fixed.

Checklist

  • [x] librdkafka version (release number or git tag): v1.8.2
  • [x] Apache Kafka version: v2.2.0
  • [x] librdkafka client configuration:
c, err := kafka.NewConsumer(&kafka.ConfigMap{
		"bootstrap.servers":        strings.Join(address, ","),
		"group.id":                 groupId,
		"auto.offset.reset":        "latest",
		"enable.auto.offset.store": "false",
		"enable.auto.commit":       true,
		"auto.commit.interval.ms":  1000,
		"fetch.wait.max.ms":        1000,
	}) 
  • [x] Operating system: alpine:3.14
  • [ ] Provide logs (with debug=.. as necessary) from librdkafka
%4|1645709989.739|OFFSET|rdkafka#consumer-1| [thrd:main]: raw_box_group_01 [37]: offset reset (at offset 2067600870) to END: fetch failed due to requested offset not available on the broker: Broker: Offset out of range
%4|1645709989.744|OFFSET|rdkafka#consumer-1| [thrd:main]: raw_box_group_01 [32]: offset reset (at offset 2067607114) to END: fetch failed due to requested offset not available on the broker: Broker: Offset out of range
%4|1645709989.745|OFFSET|rdkafka#consumer-1| [thrd:main]: raw_box_group_01 [29]: offset reset (at offset 2067828573) to END: fetch failed due to requested offset not available on the broker: Broker: Offset out of range
%4|1645709989.746|OFFSET|rdkafka#consumer-1| [thrd:main]: raw_box_group_01 [31]: offset reset (at offset 2067937001) to END: fetch failed due to requested offset not available on the broker: Broker: Offset out of range
%4|1645709989.747|OFFSET|rdkafka#consumer-1| [thrd:main]: raw_box_group_01 [36]: offset reset (at offset 2067584343) to END: fetch failed due to requested offset not available on the broker: Broker: Offset out of range
%4|1645709989.747|OFFSET|rdkafka#consumer-1| [thrd:main]: raw_box_group_01 [39]: offset reset (at offset 2067679921) to END: fetch failed due to requested offset not available on the broker: Broker: Offset out of range
%4|1645709989.747|OFFSET|rdkafka#consumer-1| [thrd:main]: raw_box_group_01 [34]: offset reset (at offset 2067609393) to END: fetch failed due to requested offset not available on the broker: Broker: Offset out of range
%4|1645709989.748|OFFSET|rdkafka#consumer-1| [thrd:main]: raw_box_group_01 [30]: offset reset (at offset 2067497718) to END: fetch failed due to requested offset not available on the broker: Broker: Offset out of range
%4|1645709989.748|OFFSET|rdkafka#consumer-1| [thrd:main]: raw_box_group_01 [33]: offset reset (at offset 2067670796) to END: fetch failed due to requested offset not available on the broker: Broker: Offset out of range
%4|1645709989.748|OFFSET|rdkafka#consumer-1| [thrd:main]: raw_box_group_01 [38]: offset reset (at offset 2068422807) to END: fetch failed due to requested offset not available on the broker: Broker: Offset out of range
%4|1645709989.748|OFFSET|rdkafka#consumer-1| [thrd:main]: raw_box_group_01 [35]: offset reset (at offset 2067594265) to END: fetch failed due to requested offset not available on the broker: Broker: Offset out of range
%4|1645709989.752|OFFSET|rdkafka#consumer-1| [thrd:main]: raw_box_group_01 [28]: offset reset (at offset 2067619494) to END: fetch failed due to requested offset not available on the broker: Broker: Offset out of range
  • [x] Provide broker log excerpts
  • [x] Critical issue

intfish123 avatar Feb 25 '22 03:02 intfish123

Interesting.

Can you check the broker logs for any hints? Are the topics being deleted and re-created?

edenhill avatar Mar 01 '22 10:03 edenhill

Interesting.

Can you check the broker logs for any hints? Are the topics being deleted and re-created?

broker logs has no any hints. the topics is not re-recreated.

intfish123 avatar Mar 10 '22 09:03 intfish123

Interesting.

Can you check the broker logs for any hints? Are the topics being deleted and re-created?

It happened again yesterday. And I used [email protected] Topics: realtime_02, groupId: uv2_01 Broker logs:

[2022-03-09 15:15:38,428] INFO Completed request:RequestHeader(apiKey=OFFSET_COMMIT, apiVersion=6, clientId=rdkafka, correlationId=404587) -- {group_id=uv2_01,generation_id=22,member_id=rdkafka-8a664177-c8f6-4766-b1b9-eec5d6d336a1,topics=[{topic=realtime_02,partitions=[{partition=20,offset=542783570,leader_epoch=-1,metadata=},{partition=21,offset=542568800,leader_epoch=-1,metadata=},{partition=22,offset=542369991,leader_epoch=-1,metadata=},{partition=23,offset=542231452,leader_epoch=-1,metadata=},{partition=24,offset=542501008,leader_epoch=-1,metadata=},{partition=25,offset=545875779,leader_epoch=-1,metadata=},{partition=26,offset=542787895,leader_epoch=-1,metadata=},{partition=27,offset=542365863,leader_epoch=-1,metadata=},{partition=28,offset=542470553,leader_epoch=-1,metadata=},{partition=29,offset=542370054,leader_epoch=-1,metadata=}]}]},response:none from connection 10.31.254.128:9092-10.29.160.137:38514-219844;totalTime:1.689,requestQueueTime:0.029,localTime:0.3,remoteTime:1.288,throttleTime:0.024,responseQueueTime:0.018,sendTime:0.027,securityProtocol:PLAINTEXT,principal:User:ANONYMOUS,listener:PLAINTEXT (kafka.request.logger)
[2022-03-09 15:19:08,430] INFO Completed request:RequestHeader(apiKey=OFFSET_COMMIT, apiVersion=6, clientId=rdkafka, correlationId=404664) -- {group_id=uv2_01,generation_id=22,member_id=rdkafka-8a664177-c8f6-4766-b1b9-eec5d6d336a1,topics=[{topic=realtime_02,partitions=[{partition=20,offset=542785186,leader_epoch=-1,metadata=},{partition=21,offset=542570360,leader_epoch=-1,metadata=},{partition=22,offset=542371626,leader_epoch=-1,metadata=},{partition=23,offset=542233063,leader_epoch=-1,metadata=},{partition=24,offset=542502564,leader_epoch=-1,metadata=},{partition=25,offset=545877323,leader_epoch=-1,metadata=},{partition=26,offset=542789435,leader_epoch=-1,metadata=},{partition=27,offset=542367450,leader_epoch=-1,metadata=},{partition=28,offset=542472192,leader_epoch=-1,metadata=},{partition=29,offset=542371597,leader_epoch=-1,metadata=}]}]},response:none from connection 10.31.254.128:9092-10.29.160.137:38514-219844;totalTime:1.553,requestQueueTime:0.018,localTime:0.317,remoteTime:1.168,throttleTime:0.019,responseQueueTime:0.012,sendTime:0.016,securityProtocol:PLAINTEXT,principal:User:ANONYMOUS,listener:PLAINTEXT (kafka.request.logger)
[2022-03-09 15:21:08,432] INFO Completed request:RequestHeader(apiKey=OFFSET_COMMIT, apiVersion=6, clientId=rdkafka, correlationId=404708) -- {group_id=uv2_01,generation_id=22,member_id=rdkafka-8a664177-c8f6-4766-b1b9-eec5d6d336a1,topics=[{topic=realtime_02,partitions=[{partition=20,offset=542786056,leader_epoch=-1,metadata=},{partition=21,offset=542571216,leader_epoch=-1,metadata=},{partition=22,offset=542372490,leader_epoch=-1,metadata=},{partition=23,offset=542233961,leader_epoch=-1,metadata=},{partition=24,offset=542503468,leader_epoch=-1,metadata=},{partition=25,offset=545878215,leader_epoch=-1,metadata=},{partition=26,offset=542790295,leader_epoch=-1,metadata=},{partition=27,offset=542368299,leader_epoch=-1,metadata=},{partition=28,offset=542473059,leader_epoch=-1,metadata=},{partition=29,offset=542372504,leader_epoch=-1,metadata=}]}]},response:none from connection 10.31.254.128:9092-10.29.160.137:38514-219844;totalTime:1.766,requestQueueTime:0.03,localTime:0.341,remoteTime:1.278,throttleTime:0.022,responseQueueTime:0.038,sendTime:0.053,securityProtocol:PLAINTEXT,principal:User:ANONYMOUS,listener:PLAINTEXT (kafka.request.logger)
[2022-03-09 15:25:38,439] INFO Completed request:RequestHeader(apiKey=OFFSET_COMMIT, apiVersion=6, clientId=rdkafka, correlationId=404807) -- {group_id=uv2_01,generation_id=22,member_id=rdkafka-8a664177-c8f6-4766-b1b9-eec5d6d336a1,topics=[{topic=realtime_02,partitions=[{partition=20,offset=542788128,leader_epoch=-1,metadata=},{partition=21,offset=542573241,leader_epoch=-1,metadata=},{partition=22,offset=542374491,leader_epoch=-1,metadata=},{partition=23,offset=542236053,leader_epoch=-1,metadata=},{partition=24,offset=542505555,leader_epoch=-1,metadata=},{partition=25,offset=545880264,leader_epoch=-1,metadata=},{partition=26,offset=542792333,leader_epoch=-1,metadata=},{partition=27,offset=542370404,leader_epoch=-1,metadata=},{partition=28,offset=542475169,leader_epoch=-1,metadata=},{partition=29,offset=542374574,leader_epoch=-1,metadata=}]}]},response:none from connection 10.31.254.128:9092-10.29.160.137:38514-219844;totalTime:2.043,requestQueueTime:0.044,localTime:0.518,remoteTime:1.406,throttleTime:0.018,responseQueueTime:0.021,sendTime:0.033,securityProtocol:PLAINTEXT,principal:User:ANONYMOUS,listener:PLAINTEXT (kafka.request.logger)
[2022-03-09 15:34:47,605] INFO Completed request:RequestHeader(apiKey=OFFSET_COMMIT, apiVersion=6, clientId=rdkafka, correlationId=150) -- {group_id=uv2_01,generation_id=24,member_id=rdkafka-9fac2776-27a6-4a06-8032-dad0459278cd,topics=[{topic=realtime_01,partitions=[{partition=20,offset=544106978,leader_epoch=-1,metadata=},{partition=21,offset=543571870,leader_epoch=-1,metadata=},{partition=22,offset=544053783,leader_epoch=-1,metadata=},{partition=23,offset=543530339,leader_epoch=-1,metadata=},{partition=24,offset=543785413,leader_epoch=-1,metadata=},{partition=25,offset=547256772,leader_epoch=-1,metadata=},{partition=26,offset=544164973,leader_epoch=-1,metadata=},{partition=27,offset=543728996,leader_epoch=-1,metadata=},{partition=28,offset=543823262,leader_epoch=-1,metadata=},{partition=29,offset=543614626,leader_epoch=-1,metadata=}]}]},response:none from connection 10.31.254.128:9092-10.30.161.133:54582-312856;totalTime:1.676,requestQueueTime:0.025,localTime:0.329,remoteTime:1.259,throttleTime:0.019,responseQueueTime:0.017,sendTime:0.024,securityProtocol:PLAINTEXT,principal:User:ANONYMOUS,listener:PLAINTEXT (kafka.request.logger)
[2022-03-09 15:37:17,608] INFO Completed request:RequestHeader(apiKey=OFFSET_COMMIT, apiVersion=6, clientId=rdkafka, correlationId=205) -- {group_id=uv2_01,generation_id=24,member_id=rdkafka-9fac2776-27a6-4a06-8032-dad0459278cd,topics=[{topic=realtime_01,partitions=[{partition=20,offset=544108175,leader_epoch=-1,metadata=},{partition=21,offset=543572999,leader_epoch=-1,metadata=},{partition=22,offset=544054955,leader_epoch=-1,metadata=},{partition=23,offset=543531493,leader_epoch=-1,metadata=},{partition=24,offset=543786565,leader_epoch=-1,metadata=},{partition=25,offset=547257880,leader_epoch=-1,metadata=},{partition=26,offset=544166095,leader_epoch=-1,metadata=},{partition=27,offset=543730164,leader_epoch=-1,metadata=},{partition=28,offset=543824350,leader_epoch=-1,metadata=},{partition=29,offset=543615710,leader_epoch=-1,metadata=}]}]},response:none from connection 10.31.254.128:9092-10.30.161.133:54582-312856;totalTime:1.624,requestQueueTime:0.04,localTime:0.29,remoteTime:1.232,throttleTime:0.022,responseQueueTime:0.015,sendTime:0.022,securityProtocol:PLAINTEXT,principal:User:ANONYMOUS,listener:PLAINTEXT (kafka.request.logger)
[2022-03-09 15:43:17,613] INFO Completed request:RequestHeader(apiKey=OFFSET_COMMIT, apiVersion=6, clientId=rdkafka, correlationId=337) -- {group_id=uv2_01,generation_id=24,member_id=rdkafka-9fac2776-27a6-4a06-8032-dad0459278cd,topics=[{topic=realtime_01,partitions=[{partition=20,offset=544110993,leader_epoch=-1,metadata=},{partition=21,offset=543575798,leader_epoch=-1,metadata=},{partition=22,offset=544057725,leader_epoch=-1,metadata=},{partition=23,offset=543534357,leader_epoch=-1,metadata=},{partition=24,offset=543789356,leader_epoch=-1,metadata=},{partition=25,offset=547260682,leader_epoch=-1,metadata=},{partition=26,offset=544168843,leader_epoch=-1,metadata=},{partition=27,offset=543733008,leader_epoch=-1,metadata=},{partition=28,offset=543827254,leader_epoch=-1,metadata=},{partition=29,offset=543618575,leader_epoch=-1,metadata=}]}]},response:none from connection 10.31.254.128:9092-10.30.161.133:54582-312856;totalTime:1.49,requestQueueTime:0.025,localTime:0.262,remoteTime:1.141,throttleTime:0.017,responseQueueTime:0.021,sendTime:0.022,securityProtocol:PLAINTEXT,principal:User:ANONYMOUS,listener:PLAINTEXT (kafka.request.logger)
...
...
...
[2022-03-09 15:28:13,827] INFO [GroupCoordinator 101]: Member rdkafka-95e6cf07-039a-4797-82ad-d19c9b6a1fcc in group uv2_01 has left, removing it from the group (kafka.coordinator.group.GroupCoordinator)
[2022-03-09 15:28:13,827] INFO [GroupCoordinator 101]: Preparing to rebalance group uv2_01 in state PreparingRebalance with old generation 22 (__consumer_offsets-19) (reason: removing member rdkafka-95e6cf07-039a-4797-82ad-d19c9b6a1fcc on LeaveGroup) (kafka.coordinator.group.GroupCoordinator)
[2022-03-09 15:28:14,075] INFO [GroupCoordinator 101]: Member rdkafka-e79a3e2a-c0d7-4669-a194-dbf8a60fceec in group uv2_01 has left, removing it from the group (kafka.coordinator.group.GroupCoordinator)
[2022-03-09 15:28:15,354] INFO [GroupCoordinator 101]: Member rdkafka-8a664177-c8f6-4766-b1b9-eec5d6d336a1 in group uv2_01 has left, removing it from the group (kafka.coordinator.group.GroupCoordinator)
[2022-03-09 15:28:15,356] INFO [GroupCoordinator 101]: Member rdkafka-2267f43a-1bf2-4865-b0ff-5d60383a7a6b in group uv2_01 has left, removing it from the group (kafka.coordinator.group.GroupCoordinator)
[2022-03-09 15:28:15,891] INFO [GroupCoordinator 101]: Stabilized group uv2_01 generation 23 (__consumer_offsets-19) (kafka.coordinator.group.GroupCoordinator)
[2022-03-09 15:28:15,895] INFO [GroupCoordinator 101]: Assignment received from leader for group uv2_01 for generation 23 (kafka.coordinator.group.GroupCoordinator)
[2022-03-09 15:28:16,152] INFO [GroupCoordinator 101]: Member rdkafka-37fd0677-5e6a-4645-b2e6-e4e53372a445 in group uv2_01 has left, removing it from the group (kafka.coordinator.group.GroupCoordinator)
[2022-03-09 15:28:16,152] INFO [GroupCoordinator 101]: Preparing to rebalance group uv2_01 in state PreparingRebalance with old generation 23 (__consumer_offsets-19) (reason: removing member rdkafka-37fd0677-5e6a-4645-b2e6-e4e53372a445 on LeaveGroup) (kafka.coordinator.group.GroupCoordinator)
[2022-03-09 15:28:16,158] INFO [GroupCoordinator 101]: Member rdkafka-1960da51-64b0-4340-b766-296e16f54055 in group uv2_01 has left, removing it from the group (kafka.coordinator.group.GroupCoordinator)
[2022-03-09 15:28:17,001] INFO [GroupCoordinator 101]: Member rdkafka-e53cf696-f9dc-4a73-b2fe-b7692fbb2b2f in group uv2_01 has left, removing it from the group (kafka.coordinator.group.GroupCoordinator)
[2022-03-09 15:28:17,014] INFO [GroupCoordinator 101]: Member rdkafka-76a45391-589d-4fc6-bb58-7db2aa63cb42 in group uv2_01 has left, removing it from the group (kafka.coordinator.group.GroupCoordinator)
[2022-03-09 15:28:18,920] INFO [GroupCoordinator 101]: Stabilized group uv2_01 generation 24 (__consumer_offsets-19) (kafka.coordinator.group.GroupCoordinator)
[2022-03-09 15:28:18,932] INFO [GroupCoordinator 101]: Assignment received from leader for group uv2_01 for generation 24 (kafka.coordinator.group.GroupCoordinator)

client logs(2022-03-09 15:28:18):

%4|1646810899.013|OFFSET|rdkafka#consumer-2| [thrd:main]: realtime_02 [21]: offset reset (at offset 516326517) to END: fetch failed due to requested offset not available on the broker: Broker: Offset out of range
%4|1646810899.066|OFFSET|rdkafka#consumer-2| [thrd:main]: realtime_02 [20]: offset reset (at offset 516510622) to END: fetch failed due to requested offset not available on the broker: Broker: Offset out of range
%4|1646810899.077|OFFSET|rdkafka#consumer-2| [thrd:main]: realtime_02 [23]: offset reset (at offset 515995382) to END: fetch failed due to requested offset not available on the broker: Broker: Offset out of range
%4|1646810899.086|OFFSET|rdkafka#consumer-2| [thrd:main]: realtime_02 [22]: offset reset (at offset 516149395) to END: fetch failed due to requested offset not available on the broker: Broker: Offset out of range

intfish123 avatar Mar 10 '22 09:03 intfish123

What's the topic compaction and retention config? Looks like it might just be deleting old segments.

edenhill avatar Mar 29 '22 14:03 edenhill

Experiencing the same: it sounds pretty scary but apparently the consumer is not reconsuming the topic, and new messages are consumed as usual Using confluent-kafka-python 1.9.2 against broker cp-kafka:5.1.2

ffissore avatar Feb 20 '23 14:02 ffissore