Ability to Tag Percentage of Events Similar to Drop Filter's Percentage Function

Open MakoWish opened this issue 2 years ago • 0 comments

I would like to fork a small percentage (~5%) of all production data into our development Elastic cluster, but I am having a hard time finding a way to do that. A clone/fork pattern in pipelines.yml could accomplish this, but it would first require cloning every event entering the pipeline and then using the drop {} filter plugin on the DEV fork to discard most of them, which would take quite a toll on server resources.
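For reference, the clone/fork approach described above might look roughly like the following pipelines.yml sketch. The pipeline IDs and the virtual addresses ("prd", "dev") are hypothetical names chosen for illustration, and the input is left as a placeholder. Every event is duplicated to both downstream pipelines, and the DEV side then discards about 95% of what it receives:

```yaml
# Sketch only: pipeline IDs and addresses are assumed names, not an existing setup.
- pipeline.id: intake
  config.string: |
    input {
      # existing production inputs go here
    }
    output {
      # Every event is sent to BOTH downstream pipelines
      pipeline { send_to => ["prd", "dev"] }
    }

- pipeline.id: prd
  config.string: |
    input { pipeline { address => "prd" } }
    output {
      elasticsearch { hosts => ["elasticprd.contoso.com:9200"] }
    }

- pipeline.id: dev
  config.string: |
    input { pipeline { address => "dev" } }
    filter {
      # Drop roughly 95% of the duplicated events, keeping about 5%
      drop { percentage => 95 }
    }
    output {
      elasticsearch { hosts => ["elasticdev.contoso.com:9200"] }
    }
```

The cost is that the full event stream is copied into the dev pipeline before most of it is thrown away.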

I was hoping to find an existing way to simply tag a percentage of events so I could use conditionals to output that small fraction to DEV. Something along the lines of:

input {
  ...
}

filter {
  mutate {
    tag_percentage => {
      percentage => 5
      value => "fork_to_dev"
    }
  }
}

output {
  # Send all data to PRD
  elasticsearch {
    hosts => ["elasticprd.contoso.com:9200"]
  }

  # Send our small percentage of tagged events to DEV
  if "fork_to_dev" in [tags] {
    elasticsearch {
      hosts => ["elasticdev.contoso.com:9200"]
    }
  }
}

This would completely eliminate the need to clone any events, and the only additional resources consumed would essentially just be the output to the development cluster. Unfortunately, the only "percentage" function I am aware of is with the drop {} filter, but that does not help our situation.
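Until something like that exists, a rough approximation may be possible with the ruby filter. This is a minimal sketch, assuming the ruby filter plugin is installed and that random sampling (about 5% on average, not an exact count) is acceptable:

```
filter {
  ruby {
    # rand returns a float in [0, 1), so this tags about 5% of events on average
    code => 'event.tag("fork_to_dev") if rand < 0.05'
  }
}
```

The existing `if "fork_to_dev" in [tags]` conditional in the output section would then work unchanged. It does run a small piece of Ruby for every event, so a native option would still be the cleaner solution.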

Would love to see something like this implemented, and would also love to hear any feedback/suggestions.

Thank you, Eric

MakoWish, May 20 '22 17:05