
Kafka: Can we specify a single broker to represent all the brokers in a cluster?

Open lalit-satapathy opened this issue 2 years ago • 3 comments

Problem:

The issue concerns how brokers are specified to the kafka metricsets in a multi-broker cluster setup. Currently, all the leader brokers must be listed in order to fetch the complete set of metrics; in particular, the partition metricset requires every leader broker to be specified. For clusters with a large (10+) number of brokers, listing them all is an extra burden on users and does not scale.
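To illustrate the burden, a minimal kafka module config today has to enumerate every leader broker explicitly (hostnames below are hypothetical):

```yaml
- module: kafka
  metricsets: ["partition"]
  period: 10s
  hosts:
    - "broker-1:9092"
    - "broker-2:9092"
    - "broker-3:9092"
    # ... one entry per leader broker in the cluster
```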

Implementation details:

Kafka brokers in a cluster share metadata, so a single broker should be able to report all the available brokers automatically. The current implementation already fetches the metadata for all the brokers; this metadata is used to validate the broker hosts provided by the user. Could the same metadata be used to auto-populate the full broker list when only a single broker is provided? This may require significant rework, since the kafka metricset seems to be designed to work with a single broker. Plan this enhancement, along with other kafka enhancements, for when the kafka module is re-architected in the future.

CC: @rameshelastic @ruflin @jsoriano

lalit-satapathy avatar Dec 15 '22 06:12 lalit-satapathy

This is by design. The idea is that every monitored host runs an instance of Metricbeat. In clusters with several brokers on different hosts, if each host ran a Metricbeat instance and each instance of the module collected information about the whole cluster, the information would be duplicated as many times as there were Metricbeat instances running. So whenever possible, each instance collects information only about its local broker.

This is generalized across Metricbeat modules: in principle, an instance of a module collects information about its local node only. Apart from avoiding duplicated information, there is also a reason of scale. In very big clusters it is usually better to spread the collection load across more nodes than to centralize it all in a single instance.

This discussion also came up at least in the Elasticsearch module and in Kubernetes autodiscover. Those features also collect information about their local nodes only, even though most of the cluster's information is available through the same APIs and could be monitored from a central place. What we did there to support centralized collection was to add a scope option that defaults to node but can also be set to cluster. Find the related docs for Elasticsearch here: https://www.elastic.co/guide/en/beats/metricbeat/8.5/metricbeat-module-elasticsearch.html#_module_specific_configuration_notes_6 And for Kubernetes autodiscover: https://www.elastic.co/guide/en/beats/metricbeat/8.5/configuration-autodiscover.html#_kubernetes
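For reference, the Elasticsearch module's scope option looks like this in a module config (host is a placeholder):

```yaml
- module: elasticsearch
  metricsets: ["node", "node_stats"]
  hosts: ["http://localhost:9200"]
  scope: cluster   # collect information for the whole cluster from this one endpoint
```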

To support centralized collection in the kafka module, I would suggest doing the same: add a scope option that defaults to node (or broker, if we want to follow kafka terminology) but can be set to cluster for centralized collection. This would avoid breaking changes while continuing to support both use cases.
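A sketch of what the suggested configuration could look like (note: the scope option does not exist in the kafka module today; this is only the proposed shape, with a hypothetical hostname):

```yaml
- module: kafka
  metricsets: ["partition"]
  period: 10s
  hosts: ["broker-1:9092"]   # a single seed broker
  scope: cluster             # proposed: discover and query all brokers via cluster metadata
```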

jsoriano avatar Dec 15 '22 09:12 jsoriano

Hi! We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:. Thank you for your contribution!

botelastic[bot] avatar Dec 15 '23 10:12 botelastic[bot]

I've submitted https://github.com/elastic/integrations/pull/9260 to highlight this "limitation" for users.

lucabelluccini avatar Mar 04 '24 09:03 lucabelluccini