pulsar-client-go

Flaky test: TestConsumerSeekByTimeOnPartitionedTopic

Open BewareMyPower opened this issue 2 years ago • 1 comments

See example failure

logs: logs.txt

stacks: stacks.txt

BewareMyPower avatar Feb 28 '23 15:02 BewareMyPower

The root cause is that messages delivered after the seek may be discarded when messageCh is cleaned up. These messages are lost, so the test blocks in the Receive method:

https://github.com/apache/pulsar-client-go/blob/42ded0d59c46fd3fdaad45f045f7e8bf091131a5/pulsar/consumer_test.go#L3684

In short, once SeekByTime succeeds on the sub-consumers, they start sending messages to messageCh again. If that happens before messageCh is cleaned up, the messages delivered after the seek are discarded along with the stale ones and are lost.

https://github.com/apache/pulsar-client-go/blob/d98c4f17c6f8927072d146f4a10c8df73e21be6e/pulsar/consumer_impl.go#L668-L686
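The ordering problem can be reproduced in isolation with a plain buffered channel. The following is only a minimal sketch of the pattern, not the client's actual code: `drain` and `seekThenDrain` are hypothetical helpers, and the message names are borrowed from the logs for illustration.

```go
package main

import "fmt"

// drain discards everything currently buffered in ch without blocking
// and returns the discarded values. (Hypothetical helper.)
func drain(ch chan string) []string {
	var dropped []string
	for {
		select {
		case m := <-ch:
			dropped = append(dropped, m)
		default:
			return dropped
		}
	}
}

// seekThenDrain mimics the problematic order: stale messages are already
// buffered, the seek completes and post-seek messages arrive on the same
// shared channel, and only then is the channel drained.
func seekThenDrain(stale, fresh []string) (dropped []string, remaining int) {
	ch := make(chan string, len(stale)+len(fresh))
	for _, m := range stale {
		ch <- m // buffered before the seek
	}
	for _, m := range fresh {
		ch <- m // delivered after the seek, before the cleanup
	}
	dropped = drain(ch)
	return dropped, len(ch)
}

func main() {
	dropped, remaining := seekThenDrain(
		[]string{"hello-890", "hello-891"},
		[]string{"hello-0", "hello-99"},
	)
	// The post-seek messages are discarded together with the stale ones.
	fmt.Println(dropped, remaining)
}
```

Because the drain cannot tell stale messages from post-seek ones, nothing is left in the channel, and a later blocking receive would wait forever unless more messages arrive.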

See the logs below: hello-0 and hello-99 are messages delivered after the seek.

time="2023-03-09T21:15:26+08:00" level=info msg="+++ clear messageCh: 10" topic="persistent://public/default/my-topic-432510000"
time="2023-03-09T21:15:26+08:00" level=info msg="+++ clear messages: hello-890 publish time: <nil>" topic="persistent://public/default/my-topic-432510000"
time="2023-03-09T21:15:26+08:00" level=info msg="+++ clear messages: hello-891 publish time: <nil>" topic="persistent://public/default/my-topic-432510000"
time="2023-03-09T21:15:26+08:00" level=info msg="+++ clear messages: hello-892 publish time: <nil>" topic="persistent://public/default/my-topic-432510000"
time="2023-03-09T21:15:26+08:00" level=info msg="+++ clear messages: hello-893 publish time: <nil>" topic="persistent://public/default/my-topic-432510000"
time="2023-03-09T21:15:26+08:00" level=info msg="+++ clear messages: hello-0 publish time: <nil>" topic="persistent://public/default/my-topic-432510000"
time="2023-03-09T21:15:26+08:00" level=info msg="+++ clear messages: hello-99 publish time: <nil>" topic="persistent://public/default/my-topic-432510000"

This issue also exists in the Java client. For more information, refer to the PIP: https://github.com/apache/pulsar/issues/16757

This flaky test does not fail very often. I think we can wait for this PIP to be completed.

shibd avatar Mar 09 '23 13:03 shibd