
Kminion goes into a crash loop on Confluent Cloud Fully Managed

Open reidmeyer opened this issue 3 months ago • 14 comments

Kminion places one partition on each broker so that it can accurately measure latency per broker. If Kminion detects that a cluster rebalance has left the topic with anything other than one partition per broker, it attempts to reassign the partitions. On Confluent Cloud Fully Managed, Kminion cannot get the rights to perform this action, and it then goes into a crash loop. To fix it, I delete the topic and let Kminion recreate it.

It could be a feature for Kminion to delete and recreate the topic in this scenario, rather than requiring manual intervention.

reidmeyer avatar Sep 19 '25 12:09 reidmeyer

Or maybe I'm mistaken and I just need a certain ACL I'm missing.

I'm guessing ALTER on CLUSTER is what's required?
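For reference, granting that ACL with the standard Kafka CLI would look something like the following sketch (the principal name and bootstrap address are placeholders):

```shell
# Grant the kminion principal ALTER on the cluster resource, which the
# partition-reassignment APIs require. "User:kminion" and the bootstrap
# server below are placeholders for your own setup.
kafka-acls.sh --bootstrap-server broker:9092 \
  --add \
  --allow-principal User:kminion \
  --operation Alter \
  --cluster
```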

I'm pretty sure that even with this ACL, Confluent blocks reassignment calls.

reidmeyer avatar Sep 19 '25 12:09 reidmeyer

Hey @reidmeyer, yes, it seems plausible that managed providers block partition reassignments, because they likely have their own preferences for how replicas are placed on the brokers. ALTER on CLUSTER sounds like the right permission to me. I think adding an option to re-create the topic in order to rebalance would be a bit of a hack in kminion, because you'd also need to recreate the consumer group, re-initialize producers, etc.; it's effectively a full kminion restart.

weeco avatar Sep 19 '25 14:09 weeco

It definitely also feels like a bit of a hack to me. For now, deleting the end-to-end topic and letting kminion recreate it has been our quick fix when the cluster is rebalanced. I wonder whether that works because creating topics with replica assignments is allowed, or whether I'm just lucky and the partitions happen to be evenly distributed.

Do you have any other thoughts on how kminion's end-to-end functionality could work in a system that doesn't allow partition reassignment?

I'm of course also talking with Confluent about this.

reidmeyer avatar Sep 19 '25 14:09 reidmeyer

If you don't have control over how the replicas (and leaders) are placed on the cluster, you cannot ensure that you are testing all brokers effectively; you are dependent on luck in how they were placed. Ideally, managed providers would let you disable auto-rebalancing on a per-topic basis for this specific use case, but they may argue that SLA measurement is their responsibility as part of the offering, and I don't see other use cases for disabling (smart) auto-rebalancing.

weeco avatar Sep 19 '25 14:09 weeco

Can you confirm: when kminion creates the end-to-end topic, does it decide where those partitions go, or does it let Kafka decide and then reassign them if necessary?

reidmeyer avatar Sep 24 '25 11:09 reidmeyer

@reidmeyer I just looked it up: currently it lets Kafka decide where to place the replicas, but right after creation it checks whether the replicas should be reassigned.

However, this changes with this PR: https://github.com/redpanda-data/kminion/pull/312 (a significant rewrite of the replica assignment logic). You can give this Docker image a try if you think it will solve your problem. Relevant code piece: https://github.com/redpanda-data/kminion/blob/25b479c6b1549a723a0fa029e70835836e8334f0/e2e/topic.go#L292-L312
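As a rough illustration of the idea behind that PR (a simplified sketch, not kminion's actual code), deciding a replica assignment up front could be a round-robin over the broker IDs, so that each broker leads exactly the configured number of partitions:

```go
package main

import "fmt"

// assignReplicas spreads partition replicas round-robin across brokers so
// that each broker leads exactly partitionsPerBroker partitions. This is a
// simplified sketch of the technique, not kminion's actual implementation.
func assignReplicas(brokerIDs []int32, partitionsPerBroker, replicationFactor int) [][]int32 {
	n := len(brokerIDs)
	assignments := make([][]int32, 0, n*partitionsPerBroker)
	for p := 0; p < n*partitionsPerBroker; p++ {
		replicas := make([]int32, 0, replicationFactor)
		for r := 0; r < replicationFactor; r++ {
			// The first replica in the list is the preferred leader.
			replicas = append(replicas, brokerIDs[(p+r)%n])
		}
		assignments = append(assignments, replicas)
	}
	return assignments
}

func main() {
	// With 3 brokers and 1 partition per broker, each broker leads one
	// partition: [0 1 2], [1 2 0], [2 0 1].
	for p, rs := range assignReplicas([]int32{0, 1, 2}, 1, 3) {
		fmt.Printf("partition %d: replicas %v\n", p, rs)
	}
}
```

Whether a managed cluster honors such an assignment map at creation time is a separate question, as discussed below in this thread.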

weeco avatar Sep 24 '25 11:09 weeco

I'm so impressed with your quick response. So in the new implementation from that PR, kminion decides the partition placement at topic creation, using a replica assignment?

This is definitely attractive, but my guess is it won't solve my problem: in a quick test creating a topic on Confluent Cloud with a replica assignment, it looks like the assignment wasn't respected. My guess is that Confluent ignores the replica assignment request. But this has been very helpful for understanding the situation.
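For anyone wanting to reproduce that quick test, it would look roughly like this with the standard Kafka CLI (broker IDs 0-2 and the topic name are assumptions for illustration):

```shell
# Create a 3-partition topic with an explicit replica assignment:
# partition 0 -> leader 0, partition 1 -> leader 1, partition 2 -> leader 2.
# Then describe the topic to check whether the placement was honored.
kafka-topics.sh --bootstrap-server broker:9092 \
  --create --topic replica-assignment-test \
  --replica-assignment 0:1:2,1:2:0,2:0:1

kafka-topics.sh --bootstrap-server broker:9092 \
  --describe --topic replica-assignment-test
```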

reidmeyer avatar Sep 24 '25 12:09 reidmeyer

So in the new implementation of that PR, it decides the partitions on topic creation, using replica assignment?

To be precise, it decides on the replica assignments (a map of where all replicas of each partition should be placed on the brokers).

weeco avatar Sep 24 '25 12:09 weeco

Thanks, I've just confirmed that sadly it won't solve my problem, since Confluent ignores replica assignment requests. Nonetheless, this is obviously not really a kminion problem. The feature I requested also doesn't make sense, in that recreating the topic doesn't necessarily fix the partition assignment.

One question I still have: when the cluster undergoes a partition reassignment, I've seen kminion go into a crash loop if it's unable to reassign the partitions itself. Is this the desired outcome? Perhaps it would be reasonable to keep kminion running even if it can't balance the partitions across the brokers. Sometimes it crash-loops and sometimes it doesn't (I guess it doesn't crash if the topic is created at startup by kminion, even if the replica assignment isn't as desired).

reidmeyer avatar Sep 24 '25 12:09 reidmeyer

I'm hoping you (or someone) can shed some light on the inner workings of kminion as I work out how to make this fit together.

My end goal is to make kminion end-to-end work with Confluent Cloud, so that the end-to-end topic can measure latency metrics on each broker.

Confluent has instructed me to increase the partitions per broker to 3; they say this will increase the likelihood (unsure whether it's a certainty) that a partition leader will exist on each broker.

  1. If kminion detects that there are not exactly 3 partitions on each broker, I expect it will try to fix that, and still fail if it cannot?
  2. Is there a way to turn off the validation of the topic? Could I make the reconciliationInterval very high?
  3. I suppose if I turn off topic management, that will stop kminion from trying to alter topics? But will it still fail if the topic is not balanced as it should be?
  4. I see failures at strange times. If the topic already exists with a wrong layout, I don't see failures. For example, with 3 partitions per broker configured, kminion won't fail if there is only 1 partition per broker, even after the reconciliation interval has passed. I do see failures when Confluent performs a partition reassignment. Can you clarify, given my situation, when you would expect pod failures?
  5. Have you used Kafka Monitor / Xinfra Monitor? I might check that out as well.

reidmeyer avatar Dec 02 '25 12:12 reidmeyer

It's probably best to create the topic yourself, turn off topic management, and just let kminion use the topic you manually created. Without looking at the code again, I think this should work.
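For reference, the relevant knobs look roughly like this in kminion's YAML config (key names based on kminion's reference config; double-check them against the version you run):

```yaml
minion:
  endToEnd:
    enabled: true
    topicManagement:
      # With topic management disabled, kminion uses the pre-created topic
      # as-is and does not try to create, validate, or reassign it.
      enabled: false
      name: kminion-end-to-end
      replicationFactor: 3
      partitionsPerBroker: 1
```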

weeco avatar Dec 02 '25 13:12 weeco

I think you're right. I'll try that. Thank you for your guidance :)

This of course doesn't solve the problem of keeping 1 partition on each broker, but there's nothing kminion can do about that :)

reidmeyer avatar Dec 09 '25 08:12 reidmeyer

Just to share something that may be unintentional: during the reconciliation check, if the Kafka topic can't be changed, nothing goes wrong; the fix fails silently and kminion continues normally. During startup, on the other hand, if the Kafka topic can't be changed, kminion goes into a restart loop.

A bit strange to have it behave in two different ways, don't you think?

reidmeyer avatar Dec 09 '25 14:12 reidmeyer

I'm not sure I understand the exact scenario. I tend to agree that the behaviour should be consistent, though sometimes I also prefer fail-fast for better user feedback. Are you referring to the case where topic management is enabled or disabled?

weeco avatar Dec 09 '25 14:12 weeco

I might be wrong, but I think this is the behavior I've noticed.

When topic management is enabled and the topic is unbalanced by Cruise Control, kminion will try to reassign partitions. I believe that if it fails to do so (since Confluent blocks this, or fails it silently), the pod doesn't fail and the exporter continues to function.

If I restart the pod, though, it will fail to start, because a bad topic arrangement is detected and fixing it fails.

As you've suggested, though, I'm going to disable topic management.

reidmeyer avatar Dec 11 '25 09:12 reidmeyer