logstash-filter-verifier icon indicating copy to clipboard operation
logstash-filter-verifier copied to clipboard

metadata fields

Open youngmip opened this issue 8 years ago • 12 comments

Hi,

I have a quick question! Can I test metadata fields as well? In our use case metadata fields are used to determine an index name.

Many thanks! Youngmi.

youngmip avatar Apr 15 '16 10:04 youngmip

I'm afraid this isn't possible at the moment since Logstash's json_lines codec can't be configured to include the metadata fields in the serialized JSON string. I've filed https://github.com/logstash-plugins/logstash-codec-json_lines/issues/23 to get this fixed and was nearly done with the pull request when I hit a bump in the road (https://github.com/elastic/logstash/issues/5130) that I need to sort out first.

magnusbaeck avatar Apr 17 '16 19:04 magnusbaeck

Today I asked my self how hard it would be to convert the rubydebug output back to valid json. On a first try, it looks like this would not be that hard. @magnusbaeck please have a look at https://github.com/breml/rubydebug One remaining problem would be to find the end of an event. With the json_lines codec this is very easy, as every event is exactly one line. This would not be the case for the rubydebug codec. On the other hand, due to the "nice" formatting, every event ends with a line only containing a }.

breml avatar Jan 19 '17 20:01 breml

@magnusbaeck What are your thoughts about this issue? Do you intend to rebase https://github.com/logstash-plugins/logstash-codec-json_lines/pull/24 or do you think we should implement something into LFV to process the rubydebug output?

breml avatar May 08 '17 13:05 breml

@breml I worked on the rebase tonight but it fell through since the specs are currently broken on master (see https://travis-ci.org/logstash-plugins/logstash-codec-json_lines/builds/226973335). Once that has been addressed it shouldn't take long to get the PR into shape. I'd much prefer that route over writing (and maintaining) a rubydebug parser.

magnusbaeck avatar May 15 '17 19:05 magnusbaeck

@magnusbaeck Today I stumbled again over this issue. We store the decision, into which Elasticsearch index a particular event should be stored in a field in the @metadata array. Now in the test cases we are not able to test, if this decision is made correctly, because the @metadata is not available. Two years ago we hoped, that a change in the logstash-codec-json_lines would be merged. Until now, this is not the case. What do you think about this?

breml avatar Mar 21 '19 15:03 breml

Argh, this old problem. I'd forgotten that the ball was in my court. Let me have the weekend to look into a rebase.

magnusbaeck avatar Mar 21 '19 20:03 magnusbaeck

The PR has been updated but there are existing problems that prevents the Travis build from being green so I'm guessing nobody's going to be jumping onto the review. I've asked about the failure in the Slack channel.

magnusbaeck avatar Mar 24 '19 21:03 magnusbaeck

@magnusbaeck Ping me, if there is something I can help with.

breml avatar Mar 27 '19 06:03 breml

@magnusbaeck Today I tested a different approach to workaround this problem and the solution is surprisingly easy. Since February 2017 (https://github.com/logstash-plugins/logstash-filter-mutate/commit/a1137c9fc4ca41069235b4f6e2f752563c39bb01#diff-091009b2789afad0d76a6df256ba6ff3) the mutate filter provides support for copying fields. So today I tried the following filter in my Logstash config:

filter {
  mutate {
    copy => {
      "[@metadata]" => "[__metadata]"
    }
  }
}

This worked just fine, all the nested fields within [@metadata] are copied to [__metadata], which then is provided by the json_lines codec.

As a convenience function logstash-filter-verifier could provide a flag to add this filter at the end of the filter config under test, if someone likes to get the @metadata fields. With this the user does not need to modify his filter config prior to the testing with logstash-filter-verifier.

What do you think?

breml avatar Apr 03 '19 13:04 breml

We then could also remove the special field __lfv_testcase which gets added by logstash-filter-verifier (I think this only the case if used with sockets).

breml avatar Apr 03 '19 13:04 breml

That's an excellent workaround. Even if the PR we've been waiting for gets merged it'll take a while before it ends up in the Logstash releases people use (unless they specifically upgrade that plugin) so this workaround is better from a compatibility point of view. When doing this we should rename __metadata back to @metadata before passing the map to the chosen diff tool.

magnusbaeck avatar Apr 07 '19 12:04 magnusbaeck

Glad you like the workaround. For now, I have extended the above mentioned snippet to also remove the __lfv prefixed data from __metadata:

filter {
  mutate {
    copy => {
      "[@metadata]" => "[__metadata]"
    }
  }
  ruby {
    code => 'metadata = event.get("__metadata")
             metadata.delete_if {|key, value| key.start_with?("__lfv") }
             event.set("__metadata", metadata)'
  }
}

As long as only __lfv_testdata is used, this field could also be removed with a simple mutate filter.

breml avatar Apr 08 '19 11:04 breml