logstash-filter-mutate
Add new "copy" operation
The mutate filter provides support for performing a host of different operations on Logstash event fields, yet it is not easy to copy all fields of a sub-structure to the root level. For instance, 3rd-party systems sometimes produce events such as the following:
```json
{
  "bla": "bla",
  "meh": "meh",
  "payload": {
    "foo": "1224",
    "bar": "woohoo",
    "baz": 0,
    "timestamp": 1449356706000
  }
}
```
And what you really want is to have the payload fields at the root level and potentially discard all root level fields, like this:
```json
{
  "foo": "1224",
  "bar": "woohoo",
  "baz": 0,
  "timestamp": 1449356706000
}
```
In order to support this, one could use a ruby filter, but it'd be nice if the mutate filter could support this "copy" operation out-of-the-box, too. I've picked copy but potential alternate names could be extract, reify, move, promote.
The configuration of this new feature would look like this:
```
filter {
  mutate {
    copy => {
      "field" => "payload"
      "empty_root" => true
    }
  }
}
```
where:

- `field` would denote the event field (must be a Hash) whose content shall be copied to the root level (can also be a sprintf-style field)
- if `empty_root` is true, all root-level fields would also be deleted in the process (defaults to false)
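To make the intended semantics concrete, here is a minimal plain-Ruby sketch of the proposed behavior, with a Hash standing in for the Logstash event (illustrative only, not the plugin implementation):

```ruby
# Plain-Ruby sketch of the proposed "copy" semantics; the Hash stands in
# for the Logstash event. Not the actual plugin code.
event = {
  "bla" => "bla",
  "meh" => "meh",
  "payload" => {
    "foo" => "1224",
    "bar" => "woohoo",
    "baz" => 0,
    "timestamp" => 1449356706000
  }
}

field = "payload"   # the sub-structure whose content is promoted
empty_root = true   # whether to discard all pre-existing root fields

sub = event[field]
if sub.is_a?(Hash)
  event.clear if empty_root         # "empty_root" => true wipes the root first
  sub.each { |k, v| event[k] = v }  # promote each sub-field to the root
end
# event is now: {"foo"=>"1224", "bar"=>"woohoo", "baz"=>0, "timestamp"=>1449356706000}
```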
@consulthys I would advise changing the name of this action to be more explicit, because this is not what a user would expect from a simple copy action at first sight; something like move_nested_field would be clearer. What do you think?
Also, I understand your use case is to move a nested field to the root, but would you consider allowing a target config to move a field somewhere else in the event structure?
by the way @ph I found #84 lying around, anything missing to get it merged ?
@wiibaa thanks for your input. I've also stumbled upon #84 just after publishing my PR, not sure why I missed it since the requirements are pretty similar... and the action name, too. I have no problems renaming it, as I said potential other names I came up with were extract, reify, move, promote, but move seems to be the one matching the requirement most closely, indeed.
I agree, it could be a good idea to add a target setting (defaulting to the root level if unspecified), which would make the action more generic and potentially serve other use cases. Then instead of empty_root we would have empty_target.
The new iteration would then be:
```
filter {
  mutate {
    move => {
      "field" => "moved_field"
      "target" => "target_field"   # optional, defaults to root
      "empty_target" => true       # optional, defaults to false
    }
  }
}
```
This configuration would allow moving all sub-fields of moved_field to target_field (or to the root level if target_field is unspecified) and also erasing all pre-existing fields in target_field (or not, if empty_target is false).
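As a rough illustration of these generalized semantics (again with a plain Hash standing in for the event; move_field is a hypothetical helper name, not the actual filter code):

```ruby
# Hypothetical helper sketching the generalized "move" semantics on a
# plain Hash; not the actual mutate filter implementation.
def move_field(event, field, target: nil, empty_target: false)
  sub = event.delete(field)
  return event unless sub.is_a?(Hash)
  if target
    # move under another field, optionally emptying it first
    event[target] = {} if empty_target || !event[target].is_a?(Hash)
    sub.each { |k, v| event[target][k] = v }
  else
    # move to the root, optionally discarding all pre-existing fields
    event.clear if empty_target
    sub.each { |k, v| event[k] = v }
  end
  event
end

e1 = { "keep" => 1, "moved_field" => { "a" => 2 } }
move_field(e1, "moved_field")
# e1 => { "keep" => 1, "a" => 2 }

e2 = { "moved_field" => { "a" => 2 }, "target_field" => { "old" => 3 } }
move_field(e2, "moved_field", target: "target_field", empty_target: true)
# e2 => { "target_field" => { "a" => 2 } }
```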
=> https://github.com/logstash-plugins/logstash-filter-mutate/pull/91/commits/70b89d4a856ee8bd8edd3915330d6b01b6fe1ea6
@wiibaa thanks for ping, If I ever go back to EU, I need to buy you some drinks 🍻
@wiibaa @consulthys I've looked at the intent of this change, and I am wondering if it should be a completely new plugin?
I am saying that because the currently available methods on the mutate plugin change an existing event but we always keep the original; move/extract/reify is more like creating a new event based on an existing one, similar to the split filter.
As a side note, we had an informal discussion in the team about splitting that plugin, for code reasons, separation of concerns, and because defining multiple actions in a single mutate filter won't offer any guarantee of ordering.
Thoughts?
@ph thanks for your insights, and I perfectly understand your concern. If you look at the original thread, I was actually wondering whether this new action should be encapsulated in a completely new plugin or whether an existing one like mutate would be a good host for it. Since the intent of this action is definitely to keep the original event, yet to morph it in some sense, as in "remove some noisy data" and/or reorganize the fields, I decided to patch the mutate filter. I'm perfectly OK, however, with going another route if you prefer.
An example that comes to mind (among many others) would be the documents produced by beats. Most of them have a @timestamp, a beat, a counter, a type field and the real meat of the document resides inside another business-specific field that each beat names differently (what I named payload earlier). Sometimes you only care about what's inside of that business-specific field and you'd like to get rid of everything else (due to space concerns, indexing performance, what have you).
That's what this action is all about, so we're definitely looking at a change to the current event and not so much at creating/spawning a new one.
I didn't know that you guys were considering splitting this plugin at some point, and if that's the case, then I agree it would make sense to create a new plugin for this directly, of course. Let me know how you would like me to proceed.
@consulthys @wiibaa I am OK with keeping it in this plugin, and when we refactor the mutate filter we can extract each piece of logic into its own plugin. I am OK with move, WDYT?
> I didn't know that you guys were considering to split this plugin at some point.
It's a tough balance to strike: new plugins, new code, new features :)
Thanks @ph ! So the PR #91 should be good to go, right?
@consulthys I will do a code review, but I think we agree on the intent. Thanks for your work :)
@ph you're probably busy migrating to SFO right now, but did you get a chance to check this out?
Hello, I have this structure:

```json
{
  "_index" : "temperature",
  "_type" : "logs",
  "_id" : "AVuPujP44HajvU84-3Mb",
  "_score" : 1.0,
  "_source" : {
    "TempValue" : 0.37377094412381984,
    "ID" : 27,
    "timestamp" : "2017-04-21T08:56"
  }
}
```

but I would like to have:

```json
{
  "_index" : "temperature",
  "_type" : "logs",
  "_id" : "AVuPujP44HajvU84-3Mb",
  "_score" : 1.0,
  "TempValue" : 0.37377094412381984,
  "ID" : 27,
  "timestamp" : "2017-04-21T08:56"
}
```

Can you help me?
Hi, any news about this?
`filter { json { source => "message" } }` is not a solution for copying nested json to root?
> `filter { json { source => "message" } }` is not a solution for copying nested json to root?

If "message" is not a JSON string but a JSON object, it cannot work.
For example, if the message from filebeat to logstash is:

```json
{
  "@timestamp": "2018-10-26T03:56:10.989Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "6.2.4"
  },
  "source": "/home/centos/software/sys-log/test-use-filebeat/filebeat-6.2.4-linux-x86_64/test.txt",
  "offset": 5121,
  "message": {
    "pdl": "aa",
    "fileName": "kuaikan-promote-service.log",
    "additions": {
      "message": "TouTiao Send Report for OneDayRetention "
    },
    "logPosition": "com.xm.promote.job.TouTIaoOneDayRetentionJob.lambda$run$0(TouTIaoOneDayRetentionJob.java:68)",
    "env": "prod",
    "serviceName": "promrvice",
    "datetime": "2018-10-25 11:30:04:004",
    "logLevel": "ERROR",
    "threadInfo": "task-scheduler-2:63"
  },
  "fields": {
    "app": "spider",
    "type": "error",
    "group": "search"
  },
  "prospector": {
    "type": "log"
  },
  "beat": {
    "name": "ch-01",
    "hostname": "ch-01",
    "version": "6.2.4"
  }
}
```
not like this:

```json
{
  "@timestamp": "2018-10-26T03:59:38.724Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "6.2.4"
  },
  "message": "{\"pdl\":\"kuaikan\",\"env\":\"prod\",\"serviceName\":\"promote-service\",\"fileName\":\"kuaikan-promote-service.log\",\"datetime\":\"2018-10-25 11:30:04:004\",\"logLevel\":\"ERROR\",\"additions\":{\"message\":\"TouTiao Send Report for OneDayRetention muid=null,error=java.lang.RuntimeException: java.lang.IllegalArgumentException: Illegal character in query at index 75: http://ad.toutiao.com/track/activate/?event_type=6\u0026muid=null\u0026os=0\u0026callback={},\"},\"threadInfo\":\"task-scheduler-2:63\",\"logPosition\":\"com.xm.promote.job.TouTIaoOneDayRetentionJob.lambda$run$0(TouTIaoOneDayRetentionJob.java:68)\"}",
  "prospector": {
    "type": "log"
  },
  "fields": {
    "type": "error",
    "group": "search",
    "app": "spider"
  },
  "beat": {
    "version": "6.2.4",
    "name": "b-kk-stag-search-01",
    "hostname": "b-kk-stag-search-01"
  },
  "source": "/home/centos/software/sys-log/test-use-filebeat/filebeat-6.2.4-linux-x86_64/test.txt",
  "offset": 5121
}
```
I need this refactor, but it seems it's still not in 6.4.
Is there a Logstash v6.4 ruby filter workaround?
These do not work (to move detail.* under the root element):

```
# Triggers: "Ruby exception occurred: Direct event field references (i.e.
# event['field']) have been disabled in favor of using event get and set
# methods (e.g. event.get('field')). Please consult the Logstash 5.0
# breaking changes documentation for more details."
ruby {
  code => "
    event['detail'].each {|k, v|
      event[k] = v
    }
    event.remove('detail')
  "
}
```

```
# Drops the event (output receives nothing)
ruby {
  code => "
    event.get('detail').each {|k, v|
      event.set(k, v)
    }
    event.remove('detail')
  "
}
```

```
# No change to the event (event.to_hash returns a copy, so mutating the
# returned hash has no effect on the event itself)
ruby {
  code => "
    event.to_hash.delete_if {|k, v| k != 'detail'}
    event.to_hash.update(event.get('detail').to_hash)
    event.to_hash.delete_if {|k, v| k == 'detail'}
  "
}
```
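The second variant above is close to working; what is usually missing is a nil guard. If detail is absent, event.get('detail') returns nil and calling .each on it raises an exception, which the ruby filter reports (and which can effectively lose the event depending on the rest of the pipeline). A guarded sketch, assuming the nested field is named detail and holds a Hash:

```
filter {
  ruby {
    code => "
      detail = event.get('detail')
      if detail.is_a?(Hash)
        detail.each { |k, v| event.set(k, v) }  # promote sub-fields to root
        event.remove('detail')                  # drop the now-redundant parent
      end
    "
  }
}
```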
Needed! What's blocking this PR from being merged?
You can work around this by serialising to JSON, and then parsing it back into fields on the root:

```
json_encode {
  source => "json"
  target => "json_string"
}
json {
  source => "json_string"
}
```
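A fuller sketch of this round-trip, with hypothetical field names (payload as the nested object to flatten) and cleanup of the temporaries; note that json_encode comes from the separate logstash-filter-json_encode plugin, and the json filter writes parsed keys to the event root when no target is set:

```
filter {
  json_encode {
    source => "payload"        # hypothetical: the nested object to flatten
    target => "payload_json"   # temporary string holding the serialized JSON
  }
  json {
    source => "payload_json"   # no target => parsed keys land at the root
  }
  mutate {
    remove_field => ["payload", "payload_json"]  # drop the leftovers
  }
}
```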