
Invalid partition count increases every time after an intra-broker disk rebalance

Open linehrr opened this issue 6 years ago • 4 comments

Screen Shot 2019-07-18 at 2 50 57 PM

There are no useful logs from either CC or the broker's metric collector. Does anyone know what the possible cause could be?

Update: after we restarted some brokers, all partitions came back as valid, which indicates that some brokers had stopped sending metrics. During the intra-broker rebalance, some brokers failed due to a timeout while fetching logdir metadata and got stuck moving internal log dirs. We did a manual reassignment to fix them, but those brokers seem to have stopped sending metrics afterwards, until we restarted them.
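
For anyone trying to narrow this down, here is a minimal sketch of the check described above: given the last time a metric sample was seen from each broker, list the brokers that have gone silent and the partitions they lead. Everything here is hypothetical (the function names, the 5-minute threshold, and the input dictionaries are not part of Cruise Control), and it assumes the invalid partitions are the ones whose leader stopped reporting. How the timestamps and leader map are collected is left out.

```python
from typing import Dict, List, Set


def find_silent_brokers(
    last_seen_ms: Dict[int, int],
    expected_brokers: Set[int],
    now_ms: int,
    max_age_ms: int = 300_000,  # hypothetical threshold: 5 minutes
) -> List[int]:
    """Return ids of expected brokers with no metric sample newer than max_age_ms."""
    silent = []
    for broker in sorted(expected_brokers):
        seen = last_seen_ms.get(broker)
        # A broker is silent if it never reported or its last sample is too old.
        if seen is None or now_ms - seen > max_age_ms:
            silent.append(broker)
    return silent


def partitions_led_by(silent: List[int], leaders: Dict[str, int]) -> List[str]:
    """Return the partitions whose leader is one of the silent brokers."""
    silent_set = set(silent)
    return sorted(tp for tp, leader in leaders.items() if leader in silent_set)


# Example with made-up data: broker 2 is stale, broker 3 never reported.
now = 1_000_000
last_seen = {1: 990_000, 2: 600_000}
leaders = {"t-0": 1, "t-1": 2, "u-0": 3}

silent = find_silent_brokers(last_seen, {1, 2, 3}, now)
suspect_partitions = partitions_led_by(silent, leaders)
```

If the silent set matches the brokers that got stuck during the log dir move, that would support the theory that they stopped reporting after the manual reassignment.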

linehrr avatar Jul 18 '19 18:07 linehrr

@linehrr Does this issue still happen?

efeg avatar Jun 10 '20 02:06 efeg

We haven’t tested it against the newest Kafka version, but it still happens with the old version, 1.1.

linehrr avatar Jun 10 '20 02:06 linehrr

Dear @linehrr, @efeg, does this mean that there is currently no way to detect that a broker is not sending metrics?

I'm asking because I also see this info in my test cluster:

image

And I was just trying to figure out what the problem could be... In fact, we do not know which partitions are problematic or which topic/broker they belong to.

Best.

jrevillard avatar Feb 26 '21 10:02 jrevillard

In my case, restarting the brokers does not help.

jrevillard avatar Feb 26 '21 11:02 jrevillard