
IllegalGeneration exception from SimpleConsumer

Open • lovemfchen opened this issue 9 years ago • 9 comments

INFO:pykafka.balancedconsumer:Rebalancing consumer "pykafka-88ab9f20-80db-493c-9253-998a14cfb0d6" for topic "my-replicated-topic".
INFO:pykafka.managedbalancedconsumer:Sending JoinGroupRequest for consumer id 'pykafka-88ab9f20-80db-493c-9253-998a14cfb0d6'
INFO:pykafka.balancedconsumer:pykafka-88ab9f20-80db-493c-9253-998a14cfb0d6: Balancing 2 participants for 10 partitions. Owning 5 partitions.
DEBUG:pykafka.balancedconsumer:My partitions: ['my-replicated-topic-0-2', 'my-replicated-topic-0-5', 'my-replicated-topic-0-8', 'my-replicated-topic-1-3', 'my-replicated-topic-1-0']
INFO:pykafka.balancedconsumer:pykafka-88ab9f20-80db-493c-9253-998a14cfb0d6: Balancing 2 participants for 10 partitions. Owning 5 partitions.
DEBUG:pykafka.balancedconsumer:My partitions: ['my-replicated-topic-2-7', 'my-replicated-topic-1-6', 'my-replicated-topic-1-9', 'my-replicated-topic-2-4', 'my-replicated-topic-2-1']
INFO:pykafka.managedbalancedconsumer:Sending SyncGroupRequest for consumer id 'pykafka-88ab9f20-80db-493c-9253-998a14cfb0d6'
DEBUG:pykafka.simpleconsumer:Committing offsets for 10 partitions to broker id 2
INFO:pykafka.simpleconsumer:Continuing in response to IllegalGeneration
ERROR:pykafka.simpleconsumer:Error committing offsets for topic 'my-replicated-topic' from consumer id 'pykafka-88ab9f20-80db-493c-9253-998a14cfb0d6'(errors: {<class 'pykafka.exceptions.IllegalGeneration'>: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]})
DEBUG:pykafka.simpleconsumer:Retrying
DEBUG:pykafka.simpleconsumer:Fetcher thread exiting
INFO:pykafka.simpleconsumer:Continuing in response to IllegalGeneration
ERROR:pykafka.simpleconsumer:Error committing offsets for topic 'my-replicated-topic' from consumer id 'pykafka-88ab9f20-80db-493c-9253-998a14cfb0d6'(errors: {<class 'pykafka.exceptions.IllegalGeneration'>: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]})
DEBUG:pykafka.simpleconsumer:Retrying
INFO:pykafka.simpleconsumer:Continuing in response to IllegalGeneration
ERROR:pykafka.simpleconsumer:Error committing offsets for topic 'my-replicated-topic' from consumer id 'pykafka-88ab9f20-80db-493c-9253-998a14cfb0d6'(errors: {<class 'pykafka.exceptions.IllegalGeneration'>: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]})
DEBUG:pykafka.simpleconsumer:Retrying
INFO:pykafka.simpleconsumer:Continuing in response to IllegalGeneration
ERROR:pykafka.simpleconsumer:Error committing offsets for topic 'my-replicated-topic' from consumer id 'pykafka-88ab9f20-80db-493c-9253-998a14cfb0d6'(errors: {<class 'pykafka.exceptions.IllegalGeneration'>: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]})
DEBUG:pykafka.simpleconsumer:Retrying
INFO:pykafka.simpleconsumer:Continuing in response to IllegalGeneration
ERROR:pykafka.simpleconsumer:Error committing offsets for topic 'my-replicated-topic' from consumer id 'pykafka-88ab9f20-80db-493c-9253-998a14cfb0d6'(errors: {<class 'pykafka.exceptions.IllegalGeneration'>: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]})
INFO:pykafka.cluster:Attempting to discover offset manager for consumer group 'testgroup'
INFO:pykafka.cluster:Found coordinator broker with id 2
DEBUG:pykafka.simpleconsumer:Fetching offsets for 5 partitions from broker id 2
DEBUG:pykafka.simpleconsumer:Set offset for partition 1 to 282963
DEBUG:pykafka.simpleconsumer:Set offset for partition 4 to 283507
DEBUG:pykafka.simpleconsumer:Set offset for partition 9 to 284755
DEBUG:pykafka.simpleconsumer:Set offset for partition 6 to 284410
DEBUG:pykafka.simpleconsumer:Set offset for partition 7 to 282724

I use topic.get_balanced_consumer(consumer_group='testgroup', auto_commit_enable=True, managed=True, consumer_timeout_ms=1500) to get a ManagedBalancedConsumer. This is a self-balancing consumer, but I get the error above.
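For reference, a minimal sketch of this setup in Python with pykafka. The broker host string is an assumption; the topic name, group name, and consumer options are taken from the report and log above:

from pykafka import KafkaClient

# Broker host is an assumption; topic and group names are from the log above.
client = KafkaClient(hosts="127.0.0.1:9092")
topic = client.topics[b"my-replicated-topic"]

# managed=True returns a ManagedBalancedConsumer, which uses Kafka's group
# management API (instead of ZooKeeper) to assign partitions.
consumer = topic.get_balanced_consumer(
    consumer_group=b"testgroup",
    auto_commit_enable=True,
    managed=True,
    consumer_timeout_ms=1500,
)

# consume() returns None once consumer_timeout_ms elapses with no messages.
while True:
    message = consumer.consume()
    if message is None:
        break
    print(message.offset, message.value)

consumer.stop()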

lovemfchen commented Oct 25 '16

Thanks @lovemfchen. Does this error happen immediately upon consumer initialization, or during a rebalance that occurs after it's been running for a while?

emmettbutler commented Oct 25 '16

It happens during a rebalance that occurs when I create another consumer with the same group id. I think the new SimpleConsumer has been created, but the old one still commits offsets with the old generation id.
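A rough sketch of the trigger described here, reusing the same get_balanced_consumer call as in the block above. The second client would run in another process or thread, and the broker host remains an assumption:

from pykafka import KafkaClient

# Joining the same consumer group from a second client forces a rebalance.
# The hypothesis above is that the first consumer's offset-commit thread keeps
# using the previous generation id for a short window, so the broker rejects
# its commits with IllegalGeneration.
client = KafkaClient(hosts="127.0.0.1:9092")  # host is an assumption
topic = client.topics[b"my-replicated-topic"]
second_consumer = topic.get_balanced_consumer(
    consumer_group=b"testgroup",
    auto_commit_enable=True,
    managed=True,
    consumer_timeout_ms=1500,
)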

lovemfchen commented Oct 30 '16

We see a number of these upon initialization of a managed balanced consumer

razor-1 commented May 11 '17

I'm experiencing this same issue

athal7 commented Jul 06 '17

Help is definitely wanted on this issue, since it concerns an API that we're not currently using in production at Parse.ly.

emmettbutler commented Sep 21 '17

@razor-1 If you're able, it would be very helpful to provide some more information about these errors on initialization. Perhaps you can post some log output that shows the problem, or some example code that can be used to replicate it?

emmettbutler commented Feb 26 '18

I see things like this:

2018-03-14 23:17:18,814 pykafka.simpleconsumer ERROR Error committing offsets for topic 'b'topic_name'' from consumer id 'b'pykafka-20acb16b-ea5f-44a7-a46d-456273907ea2''(errors: {<class 'pykafka.exceptions.IllegalGeneration'>: [1, 2]})

I can try to find the combination of Kafka configs and pykafka code that makes this reproducible, but it might take some time.

razor-1 commented Mar 15 '18

I think this is related:

Sometimes, when using managed=True and I am in the middle of a rebalance, I get:

ERROR (pykafka.simpleconsumer): Error committing offsets for topic 'b'segments'' from consumer id 'b'pykafka-4db4a00f-c3e8-405c-b455-5acc0efb1206''(errors: {<class 'pykafka.exceptions.RebalanceInProgress'>: [0, 1, 2, 3, 4, 5]})

while other times in the same situation I get:

INFO (pykafka.simpleconsumer): Continuing in response to IllegalGeneration
2018-03-29 19:43:33 [24353] ERROR (pykafka.simpleconsumer): Error committing offsets for topic 'b'segments'' from consumer id 'b'pykafka-25ea9350-c0d4-4a8a-862e-457d9ae218b5''(errors: {<class 'pykafka.exceptions.IllegalGeneration'>: [0, 1, 2, 3, 4, 5]})

Yet other times, everything works out cleanly.

My consumers always handle the rebalance cleanly if I use managed=False

Note that my broker is very slow (it runs on a Raspberry Pi), which perhaps makes it easier to trigger the bug: I see it quite reproducibly.

I trigger the rebalance by shutting down some consumers by calling stop() on them.
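A rough reproduction sketch of this scenario: several consumers in one group, then stop() on one of them to force a rebalance. The broker host and group name are assumptions; the topic name 'segments' comes from the log lines above, and passing managed=False instead is what reportedly avoids the errors:

import threading
import time

from pykafka import KafkaClient


def start_consumer(managed=True):
    # Broker host and group name are assumptions; topic name is from the logs.
    client = KafkaClient(hosts="127.0.0.1:9092")
    topic = client.topics[b"segments"]
    consumer = topic.get_balanced_consumer(
        consumer_group=b"testgroup",
        managed=managed,
        consumer_timeout_ms=1500,
    )

    def loop():
        try:
            while True:
                if consumer.consume() is None:
                    break
        except Exception:
            # consume() may raise once the consumer is stopped externally
            pass

    threading.Thread(target=loop, daemon=True).start()
    return consumer


consumers = [start_consumer(managed=True) for _ in range(3)]
time.sleep(30)       # let the group settle and consume for a while
consumers[0].stop()  # forces a rebalance; watch the remaining consumers' logs
                     # for RebalanceInProgress / IllegalGeneration errors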

jmoraleda commented Mar 30 '18

@jmoraleda It's really interesting that you find these failures occur more reliably on a slow processor. That might be key to some improvements we could make to the integration tests so they shake out these types of timing issues. By the way, that's my hunch about what this might be: synchronization between the various internal threads in a consumer works a bit differently when using managed=True, so I'm not surprised that different execution models/patterns exhibit the problem differently.

emmettbutler commented Apr 02 '18