cruise-control icon indicating copy to clipboard operation
cruise-control copied to clipboard

Upgrade Kafka to 3.8.0

Open akatona84 opened this issue 1 year ago • 3 comments
trafficstars

akatona84 avatar Aug 12 '24 15:08 akatona84

This looks great, hopefully someone can review this given 3.8.0 has been released for a month now.

richardwalsh avatar Aug 27 '24 18:08 richardwalsh

@mhratson Could you please take a look on this useful PR?

With this, CC will be able to be compiled and to be able to running with Kafka 3.8.0.

egyedt avatar Sep 09 '24 09:09 egyedt

can we merge this on main branch @mhratson / @egyedt

saikumarstkm avatar Oct 17 '24 09:10 saikumarstkm

@mhratson can merge it, if he thinks it is OK.

egyedt avatar Oct 29 '24 09:10 egyedt

rebased & resolved conflict

akatona84 avatar Nov 28 '24 12:11 akatona84

Looks good to me.

bgrishinko avatar Jan 13 '25 23:01 bgrishinko

Hi @akatona84 @egyedt @bgrishinko , fyi we may need to revert this PR unfortunately because it break our internal build. The reason is LinkedIn today is still using kafka 3.0.*, which has some incompatibility with this change.

I think the general guidance is when we make API related changes we check the kafka version and decide which to use at runtime based on the kafka version. See #2254 as an example. Sorry for any inconvenience.

CCisGG avatar Mar 20 '25 23:03 CCisGG

@CCisGG , could u point where was the incompatibility happened? Maybe I'll try to make it better. I might have missed that.

akatona84 avatar Mar 21 '25 15:03 akatona84

Hi @akatona84, we are still identifying the incompatibilities. One example is:

https://github.com/linkedin/cruise-control/blob/f472590261a6c164d1cc9eaa8f3f1dfd9784faa7/cruise-control/src/main/java/com/linkedin/kafka/cruisecontrol/config/constants/ExecutorConfig.java#L17

Basically we are using kafka 3.0 internally, and our unit tests is throwing runtime exception:

Caused by:
        java.lang.ClassNotFoundException: org.apache.kafka.server.config.ZkConfigs

I'm assuming the this is a class introduced at kafka 3.0+ version. There might be many other similar incompatibilities. Unfortunately we are still at kafka 3.0.* branch, so we would ask the contributors to make changes compatible to kafka 3.0.

I'm seeing that you are changing many config classes in this PR, and I'm not sure a good way to verify if these changes are compatible to kafka 3.0, but we at least already know ZkConfigs is not compatible.

CCisGG avatar Mar 21 '25 18:03 CCisGG

Hi @CCisGG, @akatona84,

I would prefer a backward compatible solution instead of a full revert of this change ("Upgrade Kafka to 3.8.0" - commit). For the mentioned ZkConfigs related issue, it is a possible workaround to use the old value, if the version is smaller than 3.8.0 (or the Kafka version where it was modified) and use the current ZkConfigs solution in case of 3.8.0<= versions.

If you share all of the incompatibilities, then we can try to create a backward compatible commit!

Thanks in advance!

egyedt avatar Mar 24 '25 10:03 egyedt

@egyedt Thanks for the response. I think fixing forward is probably better than revert this PR at this moment. Let's do it.

I'm seeing many config class changes in this PR are not backwards compatible, e.g.

import org.apache.kafka.coordinator.group.GroupCoordinatorConfig; import org.apache.kafka.network.SocketServerConfigs; import org.apache.kafka.server.config.ReplicationConfigs; import org.apache.kafka.server.config.ServerLogConfigs; import org.apache.kafka.server.config.ServerConfigs; import org.apache.kafka.server.config.ZkConfigs; import org.apache.kafka.storage.internals.log.CleanerConfig; import org.apache.kafka.common.config.SslConfigs;

We need to let CC to choose old config class at runtime on old kafka versions. Could you guys help with it?

CCisGG avatar Mar 24 '25 16:03 CCisGG

Thanks for the response. I think fixing forward is probably better than revert this PR at this moment. Let's do it.

Hi @CCisGG, would it be possible to isolate new API usage changes like these on a separate migrate_to_kafka_4.0 branch similar to how we have done with migrate_to_kafka_x_x branches in the past? Although we could create a fix that would keep Cruise Control compatible with Kafka 3.0 as we have done with other API changes in the past, I am worried a backward-compatible fix to address all the Kafka 3.8 API changes could add a significant amount of complexity to the code base. If it isn't too much overhead to maintain a new migration branch, this may be an easier option.

kyguy avatar Mar 24 '25 19:03 kyguy

@kyguy Good point. We are actually considering that option and will discuss with the team today.

CCisGG avatar Mar 24 '25 19:03 CCisGG

@kyguy @CCisGG I think it would be a good solution to create a new branch that is compatible with a 4.0 version as we would consider the 4.0 migration in the near future and would contribute some changes back. We currently have some similar changes for Kafka 3.9 too in #2247, so please let us know what should we do about it as based on your discussions we'd either need to make it runtime compatible with 3.0 or make the PR against another branch. On a slightly related note, we'd like to add the ability to compile on Java 17 on #2248. Would you please consider this as well in your team discussions? I mean whether CC could potentially support Java 17 builds and if there's anything else to do there.

viktorsomogyi avatar Mar 25 '25 11:03 viktorsomogyi

Hi @viktorsomogyi @kyguy and other folks, to support the majority of users who are using newer kafka versions, we decided to leave main branch to work with kafka 3.8+. Hopefully all the changes you make to support kafka 3.8, 3.9, 4.0 would be compatible naturally with the main branch. For our case, we created the migrate_to_kafka_3_0 branch, which will support kafka 3.0 to 3.7. We will create CC release 3.0.* from that branch.

Hopefully it will make things easier for you guys to add 3.8,3.9,4.0 support, as you don't need to worry about the compatibility issue with older kafka versions.

CCisGG avatar Mar 25 '25 19:03 CCisGG