Add support for clustering Logstash instances

Open suyograo opened this issue 9 years ago • 29 comments

Today, each Logstash instance is a full pipeline -- input, filter, and output stages. In large-scale Logstash deployments, users run multiple instances of Logstash to scale event processing horizontally. This requires managing individual configuration files by hand, or using custom or third-party configuration automation tools such as Puppet or Chef.

We plan to introduce a concept of a Logstash cluster, where instances can be controlled as a whole (on a cluster level), instead of being separate parts. This would entail the following features:

  1. Provide an option to centrally store a Logstash config, which is shared across all the instances in the cluster. This would be the single source of truth for all the instances.
  2. Provide APIs to control the cluster and change its configuration dynamically. See #2612
  3. Provide APIs to monitor instances at the cluster level. See #2611

Logstash can still be started in a single-instance, non-clustered mode; file based configuration will continue to work.

Clustering instances will also provide the necessary groundwork for potential long-term enhancements like automatic load balancing, failover, running multiple pipelines and so on.

suyograo avatar Feb 17 '15 21:02 suyograo

For item 1 above, maybe design it so that everything talks through a "ConfigurationStore" abstraction which is implemented via plugins? That way it would be extensible, supporting different implementations of where the "gold copy" of the configuration is actually persisted and how changes are propagated to and from it. There could be different implementations (e.g. ES itself, ZooKeeper, etc.).
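A minimal sketch of what such a pluggable abstraction could look like. Every name here (`ConfigurationStore`, `register_store`, `create_store`, `InMemoryStore`) is hypothetical and only illustrates the idea; none of them is an existing Logstash API, and the in-memory backend stands in for a real one such as ES or ZooKeeper.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Type

# Hypothetical names throughout; this is an illustration, not a Logstash API.
Listener = Callable[[str], None]


class ConfigurationStore(ABC):
    """Abstraction over where the 'gold copy' of the config is persisted."""

    @abstractmethod
    def load(self) -> str:
        """Return the current configuration text."""

    @abstractmethod
    def save(self, config: str) -> None:
        """Persist a new configuration and notify subscribers."""

    @abstractmethod
    def subscribe(self, listener: Listener) -> None:
        """Register a callback invoked with each newly saved configuration."""


# Registry mapping a backend name to its implementation class.
_REGISTRY: Dict[str, Type[ConfigurationStore]] = {}


def register_store(name: str):
    """Class decorator that makes a backend available under `name`."""
    def decorator(cls: Type[ConfigurationStore]) -> Type[ConfigurationStore]:
        _REGISTRY[name] = cls
        return cls
    return decorator


def create_store(name: str) -> ConfigurationStore:
    """Instantiate the backend registered under `name`."""
    return _REGISTRY[name]()


@register_store("memory")
class InMemoryStore(ConfigurationStore):
    """Stand-in backend; a real plugin would talk to ES, ZooKeeper, etc."""

    def __init__(self) -> None:
        self._config = ""
        self._listeners: List[Listener] = []

    def load(self) -> str:
        return self._config

    def save(self, config: str) -> None:
        self._config = config
        for listener in self._listeners:
            listener(config)

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)
```

With this shape, an instance would call `create_store("memory")`, subscribe a reload callback, and never need to know which backend holds the configuration.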

bitsofinfo avatar Mar 12 '15 19:03 bitsofinfo

@suyograo should this ticket also mention the plan for encryption support, or should that be done separately? Some background: https://logstash.jira.com/browse/LOGSTASH-428 and https://logstash.jira.com/browse/LOGSTASH-918

wiibaa avatar Mar 19 '15 12:03 wiibaa

What about items that are cached in an LS instance now, like the data for the elapsed{} filter?

webmstr avatar Jul 07 '15 17:07 webmstr

@bitsofinfo that's exactly our thinking... we'll make it pluggable, so it's easy to add an alternate implementation of a config store. The first implementation will use ES as the config store.

suyograo avatar Aug 04 '15 22:08 suyograo

What's the recommended workaround now, especially if you need several Logstash instances to monitor one folder (with several files)? As sincedb files are not shared among the instances, it's a pain in the ass to edit each .conf manually using excludes. Plus, I believe sharing sincedb files (one file for all instances) is not recommended ATM, as there is no concept of exclusive read/write access. Hints?

splashx avatar Aug 13 '15 07:08 splashx

I would have a single machine which mounted and sent the files via logstash-forwarder or beaver, to a load-balanced group of logstash machines.

blysik avatar Aug 13 '15 18:08 blysik

Hey @suyograo, thanks for working on clustering support for Logstash. The current documentation seems to imply that it's already available:

> Alternately, increase the Elasticsearch cluster’s rate of data consumption by adding more Logstash indexing instances.

https://www.elastic.co/guide/en/logstash/current/deploying-and-scaling.html

Does this refer to the same issue? Thanks!

gh-amistry avatar Aug 14 '15 22:08 gh-amistry

@gh-amistry: The documentation is not meant to imply the availability of clustered Logstash instances. The preceding paragraphs in that text describe a setup where multiple Logstash instances can pull messages from a message queue. That is already available, but each Logstash instance in such a setup is independent and doesn't share any state or configuration with other instances.

magnusbaeck avatar Aug 15 '15 08:08 magnusbaeck

@blysik that's a good workaround, until beaver/logstash-forwarder becomes the bottleneck - you'll have to stop the process, launch a second instance, create excludes and split the load. VERY not friendly.

splashx avatar Aug 15 '15 15:08 splashx

Thanks @magnusbaeck for the clarification. Our goal is to have Logstash instances pulling from different topics in Kafka (topics may have different input formats), then have the outputs go to the same Elasticsearch cluster. Will this issue address this type of Logstash scalability?

gh-amistry avatar Aug 17 '15 18:08 gh-amistry

> Our goal is to have Logstash instances pulling from different topics in Kafka (topics may have different input formats), then have the outputs go to the same Elasticsearch cluster. Will this issue address this type of Logstash scalability?

This is possible already. I don't see how Logstash clustering support would help, really.

magnusbaeck avatar Aug 18 '15 03:08 magnusbaeck

The metric filter could become a problem in a cluster since it only counts within its own context. I can see three solutions to the problem, though there are probably many more that I'm overlooking.

  1. The cluster stores and replicates all metric values within the cluster.
  2. The cluster makes sure that if a metric filter is used in a pipeline it will only start one instance of it.
  3. Warn users that the metric filter does not work in a cluster.

elvarb avatar Oct 02 '15 22:10 elvarb

> The metric filter could become a problem in a cluster since it only counts within its own context

I do not anticipate this being a problem. The current designs of logstash cluster work will not have this problem because filter state (metrics and multiline filters, for example) is not shared among nodes.

jordansissel avatar Oct 02 '15 22:10 jordansissel

If we have two logstash instances processing http logs for a single http application, we will have two different metric results for the response codes. Or am I misunderstanding this?

elvarb avatar Oct 02 '15 23:10 elvarb

Are there plans to resolve this issue with logstash 2.x? What does the roadmap look like?

salyh avatar Oct 08 '15 08:10 salyh

Is there any progress on this issue in Logstash 5.x?

gokhancamas avatar Oct 18 '16 10:10 gokhancamas

@gokhancamas This feature is unlikely to be added to Logstash before 6.0

untergeek avatar Oct 18 '16 15:10 untergeek

I am trying to understand how the metrics filter will work in a Logstash cluster. Our application runs in multiple pods, and we are trying to decide whether we can run multiple Logstash instances (as part of a cluster) for it, or whether we have to use just one Logstash instance for the metrics filter to work properly on log data from all app instances. If we use one Logstash instance, scaling and availability become an issue.

rammulay avatar May 09 '17 19:05 rammulay

I would output the metrics from each logstash instance to statsd to combine them.
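A rough sketch of the idea, assuming each instance keeps its own counts and some collector sums them afterwards. The `response` field name and both function names are made up for illustration, and a plain `Counter` stands in for statsd here.

```python
from collections import Counter
from typing import Iterable


def count_status_codes(events: Iterable[dict]) -> Counter:
    """What a single instance sees: counts for its own events only."""
    # "response" is a hypothetical field name for the HTTP status code.
    return Counter(event["response"] for event in events)


def combine(per_instance: Iterable[Counter]) -> Counter:
    """What the collector does: sum the partial counts from every instance."""
    total: Counter = Counter()
    for counts in per_instance:
        total.update(counts)
    return total


# Two instances each process a share of the same application's logs.
instance_a = count_status_codes([{"response": 200}, {"response": 500}])
instance_b = count_status_codes([{"response": 200}, {"response": 200}])

# Neither instance has the full picture; only the combined view does.
combined = combine([instance_a, instance_b])
```

Each instance reports a partial count for code 200, and only the summed result reflects all four events.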

elvarb avatar May 09 '17 19:05 elvarb

@rammulay it is unclear if the metrics filter is even the right solution for measuring things going through Logstash. The metrics filter may not be necessary anymore now that we have stats APIs in Logstash.

jordansissel avatar May 09 '17 20:05 jordansissel

@elvarb thanks for your suggestion. I think it is safe to say that the metrics filter will not work across multiple logstash instances. @jordansissel are you referring to the node-stats-api? I am not sure how that is going to help me with aggregating and alerting based on application logs.

rammulay avatar May 09 '17 20:05 rammulay

@rammulay ahh, that's a good question. We had an offline discussion a few days ago about the future of the metrics filter (or rather, of the use case: aggregating/alerting on log data), and we had some consensus that the right place to do this is Elasticsearch aggregations, at least for a while. We have some ideas that may enable stream aggregations that work across Logstash instances, but nothing is designed yet.

jordansissel avatar May 09 '17 20:05 jordansissel

@jordansissel you mean something like ElastAlert? The team that maintains our Elasticsearch instance asked us not to use this for performance reasons (the instance is shared), hence we are looking at moving this upstream into Logstash. We maintain our own Logstash server(s).

rammulay avatar May 10 '17 00:05 rammulay

@rammulay another workaround would be to have one dedicated Logstash input on one server that all the other Logstash instances send their metrics to. That single Logstash input would then act as a statsd server and combine the metrics.

Sadly though the metrics filter itself does not support sum https://www.elastic.co/guide/en/logstash/current/plugins-filters-metrics.html

So you would have to resort to writing your own plugin or using the ruby filter.

elvarb avatar May 17 '17 10:05 elvarb

Is this feature going to address the use case talked about here?

https://discuss.elastic.co/t/multiple-logstash-docker-containers-sharing-an-s3-input/36077

I'm hoping that this feature is going to create some sort of shared sincedb functionality (perhaps stored in ES) that would let us spread our S3 based inputs out horizontally. Or am I misinterpreting the focus of the issue here?

timothy-spencer avatar Aug 31 '17 23:08 timothy-spencer

Any update on this? At this point, it seems impossible to have more than one Logstash node running in parallel with the same JDBC input query against a given DB cluster. It seems an HA multi-node setup would send duplicate data downstream.

demisx avatar Mar 07 '19 23:03 demisx

It might be kind of neat to be able to use a clustered logstash as a means of doing pipeline distribution/scheduling, where the unit of work to distribute is a whole pipeline. We could also have the logstash cluster determine the optimal set of resources to dedicate to a given pipeline. Or maybe the user could provide that, similar to kubernetes resource constraints? (It would be sweet if logstash could know the resource quantity available for a given server and deny/refuse to schedule pipelines that would exceed that quantity).

As a separate idea, this might also result in some form of load balancing for network inputs and a distributed offset for things like database (JDBC) or datastore (S3) inputs.

I think the latter may be a bit simpler to implement than the former. Network load balancing would assume that either Logstash or a client knows how to distribute traffic, whereas a distributed sincedb (offset) could theoretically be stored in Elasticsearch, as @timothy-spencer suggested. Something like this might be able to get close to atomic-ish writes in Elasticsearch for said sincedb?
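To make the shared-offset idea a bit more concrete, here is a sketch of a compare-and-set loop over a versioned offset. It is only an in-memory stand-in: `OffsetStore`, `VersionConflict`, and `claim_batch` are hypothetical names, no Elasticsearch calls are made, and a real store would have to provide the conditional write itself.

```python
import threading
from typing import Tuple


class VersionConflict(Exception):
    """Raised when a write is based on a stale version of the offset."""


class OffsetStore:
    """In-memory stand-in for a shared store with versioned writes."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._offset = 0
        self._version = 0

    def read(self) -> Tuple[int, int]:
        """Return (offset, version) as currently stored."""
        with self._lock:
            return self._offset, self._version

    def write(self, new_offset: int, expected_version: int) -> int:
        """Store new_offset only if nobody has written since expected_version."""
        with self._lock:
            if expected_version != self._version:
                raise VersionConflict(
                    f"expected version {expected_version}, found {self._version}"
                )
            self._offset = new_offset
            self._version += 1
            return self._version


def claim_batch(store: OffsetStore, batch_size: int) -> Tuple[int, int]:
    """Claim the next [start, end) range, retrying if another instance wins."""
    while True:
        offset, version = store.read()
        try:
            store.write(offset + batch_size, version)
            return offset, offset + batch_size
        except VersionConflict:
            continue  # someone else advanced the offset; re-read and retry
```

Because a write succeeds only against the version it read, two instances that race for the same range cannot both claim it; the loser re-reads and takes the next range instead.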

It might be neat to model some of the scheduling/distribution ideas around some pre-existing systems in this realm like:

rwaweber avatar Jun 23 '19 04:06 rwaweber

Any new info on this request? Thanks!

jputman08 avatar Sep 18 '20 17:09 jputman08

Any new info on this request? Thanks!

insist93 avatar Aug 11 '22 02:08 insist93