sarama

Remove duplicates from topics

Open Smriti-OSS opened this issue 3 years ago • 4 comments

This PR is intended to fix the issues observed in https://github.com/Shopify/sarama/issues/2264.

Summary of the issue: Passing duplicate topics into the Consume function leads to uneven partition division between different consumers of the consumer group.

References: The Java client handles this by always discarding duplicates, since it collapses the collection of topics into a set.

Smriti-OSS avatar Jul 14 '22 10:07 Smriti-OSS

@dnwe Could you please review the issue and PR?

Smriti-OSS avatar Jul 14 '22 10:07 Smriti-OSS

Issue could be tested by adding this testcase to balance_strategy_test.go:

{
    name: "2 members, 1 topic, 8 partitions each",
    members: map[string][]string{
        "M1": {"T1", "T1", "T1", "T1", "T1", "T1", "T1", "T1"},
        "M2": {"T1", "T1", "T1", "T1", "T1", "T1", "T1", "T1"},
    },
    topics: map[string][]int32{"T1": {0, 1, 2, 3, 4, 5, 6, 7}},
    expected: BalanceStrategyPlan{
        "M1": map[string][]int32{"T1": {0, 1, 2, 3}},
        "M2": map[string][]int32{"T1": {4, 5, 6, 7}},
    },
},

This fails with:

=== RUN   TestBalanceStrategyRange
=== RUN   TestBalanceStrategyRange/2_members,_1_topic,_8_partitions_each#01
    balance_strategy_test.go:101: Plan does not match expectation
        expected: sarama.BalanceStrategyPlan{"M1":map[string][]int32{"T1":[]int32{0, 1, 2, 3}}, "M2":map[string][]int32{"T1":[]int32{4, 5, 6, 7}}}
        actual: sarama.BalanceStrategyPlan{"M1":map[string][]int32{"T1":[]int32{0, 1, 2, 3, 4, 5, 6, 7}}}
--- FAIL: TestBalanceStrategyRange (0.00s)
    --- FAIL: TestBalanceStrategyRange/2_members,_1_topic,_8_partitions_each#01 (0.00s)

Smriti-OSS avatar Jul 19 '22 15:07 Smriti-OSS

@Smriti-OSS ah thanks — yes so the issue is that the consumer has duplicate topics in the consumer group metadata sent as part of the JoinGroupRequest and used for generating the assignments

dnwe avatar Jul 20 '22 12:07 dnwe

I've pushed up https://github.com/Shopify/sarama/pull/2285 with the sample unittest and to cover de-duplication in the balance_strategy — as I think it's worth doing it there as well as in the consumer group itself

dnwe avatar Jul 20 '22 14:07 dnwe

Thank you for your contribution! However, this pull request has not had any activity in the past 90 days and will be closed in 30 days if no updates occur. If you believe the changes are still valid then please verify your branch has no conflicts with main and rebase if needed. If you are awaiting a (re-)review then please let us know.

github-actions[bot] avatar Oct 10 '23 14:10 github-actions[bot]