cruise-control
Number of invalid partitions increases after every intra-broker disk rebalance

There are no useful logs from either CC or the broker's metric collector. Does anyone know what the possible cause could be?
Update: after we restarted some brokers, all partitions were reported as valid again, which indicates that some brokers had stopped sending metrics. During intra-broker rebalancing, some brokers failed due to a timeout while fetching log dir metadata and therefore got stuck moving internal log dirs. We fixed them with a manual reassignment, but those brokers seem to have stopped sending metrics afterwards, until we restarted them.
@linehrr Does this issue still happen?
We haven’t tested it against the newest Kafka version, but it still happens on the old version 1.1.
Dear @linehrr, @efeg, does this mean that there is currently no way to detect that a broker is not sending metrics?
I'm asking because I also see this info in my test cluster:

And I was just trying to figure out what the problem could be... In fact, we do not know which partitions are problematic or which topic/broker they belong to.
Best.
As far as I can tell, restarting the brokers does not help.
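
On the question above about detecting brokers that have stopped sending metrics: a rough sketch of one possible check, not something confirmed in this thread. It assumes you can collect `(broker_id, timestamp_ms)` pairs from the metric samples yourself (for example by consuming the metrics reporter topic, `__CruiseControlMetrics` by default as far as I know). The function name, the input shape, and the 5-minute threshold are all hypothetical; how the samples are obtained is left out.

```python
from typing import Dict, Iterable, List, Tuple


def find_silent_brokers(
    expected_brokers: Iterable[int],
    samples: Iterable[Tuple[int, int]],
    now_ms: int,
    max_age_ms: int = 5 * 60 * 1000,  # hypothetical threshold: 5 minutes
) -> List[int]:
    """Return the expected brokers with no recent metric sample.

    `samples` is a hypothetical input: (broker_id, timestamp_ms) pairs
    extracted from whatever metric records you were able to collect.
    """
    # Track the newest sample timestamp seen for each broker.
    last_seen: Dict[int, int] = {}
    for broker_id, ts_ms in samples:
        if ts_ms > last_seen.get(broker_id, -1):
            last_seen[broker_id] = ts_ms

    # A broker is "silent" if it never reported, or its newest
    # sample is older than the allowed age.
    silent = [
        b for b in expected_brokers
        if b not in last_seen or now_ms - last_seen[b] > max_age_ms
    ]
    return sorted(silent)
```

Called with the broker ids you expect in the cluster, this would list the brokers worth inspecting first. It does not by itself tell you which partitions are counted as invalid.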