cruise-control icon indicating copy to clipboard operation
cruise-control copied to clipboard

Remove ZooKeeper

Open egyedt opened this issue 10 months ago • 2 comments

Summary

  1. Why: Kafka 4.0 will not support ZooKeeper.
  2. What: Remove ZooKeeper from Cruise Control.

Expected Behavior

Cruise Control is able to work without ZooKeeper and no ZooKeeper dependency is needed.

Actual Behavior

Cruise Control is able to work without ZooKeeper but ZooKeeper dependency is needed for some features and tests.

Categorization

  • [ ] documentation
  • [ ] bugfix
  • [ ] new feature
  • [X] refactor
  • [ ] security/CVE
  • [ ] other

This PR resolves #2243.

egyedt avatar Jan 29 '25 10:01 egyedt

Hi @bgrishinko, could you please review this change? Thanks in advance!

egyedt avatar Jan 31 '25 08:01 egyedt

Great effort! Kudos! I hope your PR gets reviewed.

danielgospodinow avatar Apr 10 '25 07:04 danielgospodinow

The 4+ Kafka was released a few months ago. Could you please review this PR to make the migration to 4+ easier? Thanks in advance!

cc. @bgrishinko, @CCisGG, @efeg

egyedt avatar Sep 23 '25 07:09 egyedt

I have added a new commit here to bump kafka to 4.0.0, because this is why we must remove Zookeeper. I am adding the related issue that this PR will also solve: https://github.com/linkedin/cruise-control/issues/2309

pnagy-cldr avatar Sep 25 '25 15:09 pnagy-cldr

This PR also requires updating the supported Java version: https://github.com/linkedin/cruise-control/pull/2308

pnagy-cldr avatar Sep 25 '25 17:09 pnagy-cldr

@kyguy the CI was green, but as @pnagy-cldr mentioned, in this current form (now also the Kafka 4.0 support is added to this PR) the #2308 is also required, that is why the CI is not green now.

We have separate commits at this PR (ZK removal, ZK removal from tests and Kafka 4.0 support) and also the #2308 is needed (remove Java 11 support). Should we cherry-pick #2308 changes into this PR? Or should we separate Kafka 4.0 support into a new PR? (If we will do this, then somehow Kafka 4.0 support must depend on this ZK removal and Java 11 support removal PR)

Furthermore, as you mentioned Admin API is used by default every place, where the usage of ZK Client was also possible. So there will be no change in the default funcionality and there will be no feature loss with this change. The only place where ZK was a bit useful is the backup store to save failed brokers. (But this was also deprecated, since failed.brokers.file.path can be used instead as persistent file storage <- this is the default behaviour.)

egyedt avatar Sep 26 '25 16:09 egyedt

the CI was green, but as @pnagy-cldr mentioned, in this current form (now also the Kafka 4.0 support is added to this PR) the https://github.com/linkedin/cruise-control/pull/2308 is also required, that is why the CI is not green now.

Thanks @egyedt, that makes sense.

We have separate commits at this PR (ZK removal, ZK removal from tests and Kafka 4.0 support) and also the https://github.com/linkedin/cruise-control/pull/2308 is needed (remove Java 11 support). Should we cherry-pick https://github.com/linkedin/cruise-control/pull/2308 changes into this PR? Or should we separate Kafka 4.0 support into a new PR? (If we will do this, then somehow Kafka 4.0 support must depend on this ZK removal and Java 11 support removal PR)

Personally, I would keep the ZK removal changes (including related test updates) in this PR and move the Kafka 4.0 support changes into a separate PR. From what I understand the ZK removal changes here don't dependent on Java 11 support removal, is that correct?

Splitting the work this way could help unblock this PR and make both PRs easier to review!

Furthermore, as you mentioned Admin API is used by default every place, where the usage of ZK Client was also possible. So there will be no change in the default funcionality and there will be no feature loss with this change.

That is awesome, thanks!

kyguy avatar Sep 26 '25 16:09 kyguy

@kyguy the CI was green, but as @pnagy-cldr mentioned, in this current form (now also the Kafka 4.0 support is added to this PR) the https://github.com/linkedin/cruise-control/pull/2308 is also required, that is why the CI is not green now.

Thanks for the change and also many thanks for the open source review for this important PR! I merged the PR 2308.

CCisGG avatar Sep 29 '25 01:09 CCisGG

@kyguy @danielgospodinow Now that Java 17 support has been merged, I rebased this PR and it should be green now. You asked earlier to create a new PR for Bumping kafka version to 4.0.0 instead of having it here. I can do it, but since Kafka 4.0.0 does not contain ZK anymore, It won't even compile. ZK removal is meaningless alone, it is needed for bumping Kafka. So I could either:

  1. Leave it as is. Bumping Kafka, and removing ZK is a separated commit
  2. Create a new PR for Kafka bump, but it will need to depend on ZK removal, otherwise it won't even compile. Until ZK removal is not merged, the new PR will still contain ZK removal commit to make sure it compiles, so the new PR will be the same as this one until ZK removal merge :) What sounds better for you guys?

pnagy-cldr avatar Sep 29 '25 09:09 pnagy-cldr

I agree, @pnagy-cldr. Makes sense.

Just renaming the title to something like Kafka 4.0 Support And ZooKeeper Removal to display the PR more clearly is fine by me.

Let's get @kyguy's opinion too.

danielgospodinow avatar Sep 29 '25 09:09 danielgospodinow

Now that Java 17 support has been merged, I rebased this PR and it should be green now.

Awesome, thanks!

I can do it, but since Kafka 4.0.0 does not contain ZK anymore, It won't even compile. ZK removal is meaningless alone, it is needed for bumping Kafka.

I understand and support the goal of adding Kafka 4.0.0+ support to Cruise Control, I want that as much as anyone! But I still believe it would be better to move the Kafka 4.0.0-specific changes into a separate PR. From what I've observed, the Cruise Control project is light on active maintainers and reviewers. Keeping PRs smaller and focused makes them easier to review and increases the chances of getting them merged.

From what I understand, this PR removes Cruise Control's dependency on Kafka's internal APIs surrounding Zookeeper, i.e. KafkaZkClient, while still retaining support for ZooKeeper-based Kafka clusters. This is a meaningful and important change to the Cruise Control codebase on its own. Therefore, I would recommend keeping the Zookeeper API usage changes of this PR (moving the Kafka 4.0.0 changes to another) and renaming the PR to something like: "Migrate off of internal Kafka Zookeeper APIs" and mention explicitly in the description that the Cruise Control codebase still supports running with Zookeeper-based Kafka clusters.

All that being said, I am not a maintainer, so I'll defer to you (my fellow community members) and the maintainers, just sharing my perspective in hopes of moving this forward!

kyguy avatar Sep 29 '25 14:09 kyguy

@danielgospodinow @CCisGG @kyguy I removed the kafka bump to 4.0.0 commit from this PR. Once this is merged, I will create a new PR based on main with Kafka version bump to 4.0.0.

pnagy-cldr avatar Sep 29 '25 15:09 pnagy-cldr

Thanks you all for the effort of drafting and reviewing this PR!

CCisGG avatar Sep 29 '25 16:09 CCisGG

@kyguy Oops! Sorry about it. I was trying to unblock people.

@egyedt @pnagy-cldr Kyle posted his review after this PR get merged. I think it is still valuable to honor these comments. Would you mind addressing this comment in the follow-up PR of bump kafka version to 4.0.0? Thank you.

CCisGG avatar Sep 29 '25 18:09 CCisGG

@CCisGG Thanks, we will address these in the next PR with @pnagy-cldr.

egyedt avatar Sep 30 '25 07:09 egyedt

Thanks for the fixes/responses @pnagy-cldr @egyedt!

kyguy avatar Sep 30 '25 14:09 kyguy