
Feature Request: Automatic CPU capacity detection via metric reporter

Open kyguy opened this issue 3 years ago • 3 comments

Currently, broker capacities for resources such as CPU must either be provided by users through a capacity.json file when using the BrokerCapacityConfigFileResolver [1] or detected by a custom BrokerCapacityConfigResolver plugin [2]. Having the metric reporter detect and report the CPU capacity of brokers would save users and third-party applications from this burden! The metric reporter could use the following line to get the number of cores available to the broker:

Runtime.getRuntime().availableProcessors()

and then report that metric in the same manner it does for CPU utilization. Cruise Control could then use this reported capacity value when the allow_capacity_estimation flag is set to true to set the CPU capacities of brokers in the cluster model.
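As a rough sketch of the reporter side of this idea (not existing Cruise Control code), something like the following could work. The class name CpuCapacitySketch, the nested CapacityMetric type, and its fields are hypothetical names made up for illustration; the only real API used is the JDK's Runtime.getRuntime().availableProcessors().

```java
public class CpuCapacitySketch {

    /** Hypothetical value type: one CPU capacity sample for one broker. */
    public static final class CapacityMetric {
        public final int brokerId;
        public final int numCores;
        public final long timestampMs;

        public CapacityMetric(int brokerId, int numCores, long timestampMs) {
            this.brokerId = brokerId;
            this.numCores = numCores;
            this.timestampMs = timestampMs;
        }
    }

    /** Reads the number of processors available to this JVM and wraps it in a sample. */
    public static CapacityMetric detect(int brokerId) {
        int cores = Runtime.getRuntime().availableProcessors();
        return new CapacityMetric(brokerId, cores, System.currentTimeMillis());
    }

    public static void main(String[] args) {
        // Broker id 0 is an arbitrary placeholder for this example.
        CapacityMetric metric = detect(0);
        System.out.println("broker " + metric.brokerId + " reports " + metric.numCores + " cores");
    }
}
```

One thing to keep in mind: availableProcessors() returns the processors available to the JVM, so in a containerized deployment the value may reflect the container's CPU limit rather than the host's core count.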

Let me know what you think! If it sounds like a reasonable request, I would be happy to contribute this feature!

[1] https://github.com/linkedin/cruise-control/blob/migrate_to_kafka_2_4/cruise-control/src/main/java/com/linkedin/kafka/cruisecontrol/config/BrokerCapacityConfigFileResolver.java
[2] https://github.com/linkedin/cruise-control/wiki/Pluggable-Components#broker-capacity-config-resolver

kyguy avatar Mar 22 '22 14:03 kyguy

Any thoughts/concerns on this @efeg?

kyguy avatar Apr 05 '22 23:04 kyguy

@kyguy Thanks for the proposal and the offer to contribute! I see the intention is to make it easier for users to resolve the capacity of brokers. To achieve that, I'd recommend using a custom BrokerCapacityConfigResolver, which would help us keep the capacity information self-contained rather than being split into the metrics reporter.

> Cruise Control could then use this reported capacity value when the allow_capacity_estimation flag is set to true to set the CPU capacities of brokers in the cluster model.

allow_capacity_estimation has a different use today -- it checks whether a broker capacity can be estimated from other brokers in the cluster in case its capacity information is missing. To ensure backwards compatibility, I'd suggest maintaining the existing behavior.

efeg avatar Apr 07 '22 22:04 efeg

Thanks for the reply @efeg!

> I see the intention is to make it easier for users to resolve the capacity of brokers. To achieve that, I'd recommend using a custom BrokerCapacityConfigResolver, which would help us keep the capacity information self-contained rather than being split into the metrics reporter.

Without the help of the CC metrics reporter for capacity information, a custom BrokerCapacityConfigResolver would depend on a hardware resource management system for this information. Having the capacity information gathered by the CC metrics reporter helps Cruise Control be more self-sufficient! Since the metric reporter already gathers CPU utilization information, wouldn't it make sense to gather CPU capacity information as well, especially since it gives the CPU utilization values more context when compared across hosts?

> allow_capacity_estimation has a different use today -- it checks whether a broker capacity can be estimated from other brokers in the cluster in case its capacity information is missing. To ensure backwards compatibility, I'd suggest maintaining the existing behavior.

Understood! Maybe a new flag could be created or the reported capacity could be the default value used!
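To make that a bit more concrete, here is one possible reading of the "new flag plus reported default" idea, sketched in plain Java. Everything here is hypothetical: the class CapacityFallbackSketch, the method resolveCpuCores, and the useReportedCapacity flag are invented for illustration and are not part of Cruise Control. The assumption is that an explicitly configured capacity always wins, and the reported value is only consulted when the new flag is on, so allow_capacity_estimation keeps its current meaning.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalInt;

public class CapacityFallbackSketch {

    /**
     * Resolves the CPU core count for a broker.
     * Order: explicitly configured value first, then (only if the hypothetical
     * flag is enabled) the value reported by the broker, otherwise empty.
     */
    public static OptionalInt resolveCpuCores(int brokerId,
                                              Map<Integer, Integer> configured,
                                              Map<Integer, Integer> reported,
                                              boolean useReportedCapacity) {
        Integer fromConfig = configured.get(brokerId);
        if (fromConfig != null) {
            return OptionalInt.of(fromConfig);
        }
        if (useReportedCapacity) {
            Integer fromReporter = reported.get(brokerId);
            if (fromReporter != null) {
                return OptionalInt.of(fromReporter);
            }
        }
        // Nothing known for this broker; the caller decides what to do next.
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        Map<Integer, Integer> configured = new HashMap<>();
        configured.put(0, 8);            // broker 0 has an explicit entry
        Map<Integer, Integer> reported = new HashMap<>();
        reported.put(0, 16);             // ignored: configuration wins
        reported.put(1, 4);              // used only when the flag is on

        System.out.println(resolveCpuCores(0, configured, reported, true));
        System.out.println(resolveCpuCores(1, configured, reported, true));
        System.out.println(resolveCpuCores(1, configured, reported, false));
    }
}
```

With the flag off, the result for a broker without configured capacity is empty, which leaves the existing behavior untouched.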

[1] https://github.com/kyguy/cruise-control/commit/a5385cbbefcd5602c6c8f142a02f6a40c89d603d

kyguy avatar Apr 09 '22 21:04 kyguy