kafka icon indicating copy to clipboard operation
kafka copied to clipboard

KAFKA-17338 ConsumerConfig should prevent using partition assignors with CONSUMER group protocol

Open m1a2st opened this issue 1 year ago • 9 comments

Jira: https://issues.apache.org/jira/browse/KAFKA-17338 add a validator for consumer group protocol, when setting partition.assignment.strategy

Committer Checklist (excluded from commit message)

  • [ ] Verify design and implementation
  • [ ] Verify test coverage and CI build status
  • [ ] Verify documentation (including upgrade notes)

m1a2st avatar Aug 16 '24 13:08 m1a2st

@chia7712, Thanks for your comments, I have addressed all the comments, PTAL

m1a2st avatar Aug 22 '24 12:08 m1a2st

@m1a2st please fix the failed tests

chia7712 avatar Sep 01 '24 03:09 chia7712

High level comment, there are other properties in the exact same situation, basically properties that do not make sense with the new protocol, and we want to throw ConfigException if the user provides a value for them (we also want to remove the defaults as @chia7712 pointed out, to remove them from the log and avoid confusion).

Ok with me if we prefer to tackle them separately (I can file the jiras), but the implementation of the check seems the same, so wonder if we should consider it here, and instead of having a checkPartitionAssigmentStrategy (that we'll end replicating for each of these props), we have a single checkDeprecatedPropertyUsed(propId) (or a better name), doing the exact same check that we're considering here for the partition strategy. I expect we would need the same check for all these that are deprecated with the new protocol (still thinking if I'm missing another one):

  • heartbeat.interval.ms
  • session.timeout.ms
  • group.max.session.timeout.ms
  • group.mix.session.timeout.ms

Thoughts?

lianetm avatar Sep 04 '24 13:09 lianetm

I think that I can address this check method in this PR, and remove these config default value when user config the consumer group protocol.

m1a2st avatar Sep 04 '24 15:09 m1a2st

heartbeat.interval.ms session.timeout.ms group.max.session.timeout.ms group.mix.session.timeout.ms

+1 and thanks @m1a2st to address them in this PR :)

chia7712 avatar Sep 04 '24 15:09 chia7712

group.max.session.timeout.ms group.mix.session.timeout.ms

Those configs are used by server rather than client. The new coordinator supports CLASSIC protocol so we don't need to unset those configs I think.

chia7712 avatar Sep 05 '24 13:09 chia7712

@m1a2st—I wanted to make sure that you weren't held up waiting for input from others. After the number of review comments gets to a certain size, it can be hard to tell where we stand. Let me know if you need us to review or respond to any outstanding questions. Thanks!

kirktrue avatar Oct 08 '24 22:10 kirktrue

@kirktrue, Thank you for your comments. I have a question regarding this:

  • Should we treat this as an issue and resolve it?
  • Or, is it not considered an issue, and setting it to zero is the appropriate solution?

m1a2st avatar Oct 09 '24 14:10 m1a2st

@m1a2st could you please rebase code?

chia7712 avatar Oct 17 '24 13:10 chia7712

@m1a2st please check the failed tests

chia7712 avatar Oct 26 '24 22:10 chia7712

@m1a2st—just checking in on this PR. It seems like we're really close. Is there anything you need from @lianetm or myself? Thanks!

kirktrue avatar Nov 20 '24 01:11 kirktrue

Hello @kirktrue, I think if this PR pass the CI, and won't have comments, we should merge this PR, WDYT?

m1a2st avatar Nov 20 '24 10:11 m1a2st

Thanks for @lianetm review, addressed all comments

m1a2st avatar Nov 23 '24 03:11 m1a2st

Hi @m1a2st , could you please check the test failure on testDeleteConsumerGroupOffsets? Thanks!

lianetm avatar Nov 27 '24 21:11 lianetm

Hello @lianetm, I think this fail test is not related with this PR, I will file a Jira to trace this Issue.

Expected an exception of type org.apache.kafka.common.errors.UnknownTopicOrPartitionException; got type org.apache.kafka.common.errors.GroupIdNotFoundException
Expected :true
Actual   :false
<Click to see difference>

org.opentest4j.AssertionFailedError: Expected an exception of type org.apache.kafka.common.errors.UnknownTopicOrPartitionException; got type org.apache.kafka.common.errors.GroupIdNotFoundException ==> expected: <true> but was: <false>
	at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
	at org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
	at org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
	at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
	at org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:214)

Tracing Jira : https://issues.apache.org/jira/browse/KAFKA-18110

m1a2st avatar Nov 28 '24 01:11 m1a2st

I think this fail test is not related with this PR, I will file a Jira to trace this Issue.

Hello @lianetm The main reason I think is SESSION_TIMEOUT_MS_CONFIG not ignore when we using CONSUMER protocol, Im not sure why fail test show AssertionFailedError in my local before, Could you please retrigger CI and check all test pass, Thanks

Tracing Jira : https://issues.apache.org/jira/browse/KAFKA-18110

And I will close this Jira.

m1a2st avatar Nov 28 '24 22:11 m1a2st

@m1a2st This pull request is causing numerous Connect end-to-end tests to fail. I have opened KAFKA-18132 to address and fix the issue.

chia7712 avatar Dec 02 '24 13:12 chia7712

Oh thanks for filing @chia7712. I'm already following the new issue to help if needed. Thanks!

lianetm avatar Dec 02 '24 14:12 lianetm

FYI @m1a2st / @chia7712 , found similar situation with consumer system tests that need to be updated, filed https://issues.apache.org/jira/browse/KAFKA-18176

lianetm avatar Dec 06 '24 20:12 lianetm