tlog
another playback issue with Elasticsearch
Hello,
we are using the latest Fedora tlog package (tlog-6-1.fc30.x86_64) and we get the following error while trying to get tlog-play working with Elasticsearch 5.6.16:
A message field is missing
Failed reading the source at message #0
Here is the JSON entry that we are receiving from ES using a plain curl call:
[ #]: curl -XPOST "https://tlog:[email protected]:9200/_search?q=rec:8cac5b0993fc4bf4b6dbd00fd73c87c3-7e34-16be6ed6&pretty"
{
  "took" : 28,
  "timed_out" : false,
  "_shards" : {
    "total" : 58,
    "successful" : 58,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 3.205453,
    "hits" : [
      {
        "_index" : "graylog_17",
        "_type" : "message",
        "_id" : "38287241-9e5e-11e9-8fd1-661ef139ad96",
        "_score" : 3.205453,
        "_source" : {
          "collector_node_id" : "fra-test-bproxy-01",
          "gl2_remote_ip" : "172.16.1.4",
          "session" : 12025,
          "gl2_remote_port" : 34230,
          "in_bin" : [ ],
          "source" : "fra-test-bproxy-01.inatec.local",
          "gl2_source_input" : "5d10c887db412567534abad4",
          "rec" : "8cac5b0993fc4bf4b6dbd00fd73c87c3-7e34-16be6ed6",
          "pos" : 0,
          "host" : "fra-test-bproxy-01.inatec.local",
          "gl2_source_node" : "191406ac-0f8a-490a-9689-cbc492f013e0",
          "term" : "xterm-256color",
          "id" : 1,
          "out_bin" : [ ],
          "timestamp" : "2019-07-04 13:18:31.000",
          "ver" : "2.2",
          "gl2_source_collector" : "81bfd4cb-eb44-4fe8-bc01-8e373427b46b",
          "timing" : "=117x31+3>108+1061>1+299>1+140>1+200>1+103>1+367>1+792>113+1359>6",
          "streams" : [
            "5d1a1aeadb4125675354d46d"
          ],
          "SourceName" : "tlog-rec",
          "message" : "{\"ver\":\"2.2\",\"host\":\"fra-test-bproxy-01.inatec.local\",\"rec\":\"8cac5b0993fc4bf4b6dbd00fd73c87c3-7e34-16be6ed6\",\"user\":\"root\",\"term\":\"xterm-256color\",\"session\":12025,\"id\":1,\"pos\":0,\"timing\":\"=117x31+3>108+1061>1+299>1+140>1+200>1+103>1+367>1+792>113+1359>6\",\"in_txt\":\"\",\"in_bin\":[],\"out_txt\":\"\\u001b[38;5;11mroot\\u001b[38;5;15m@\\u001b[38;5;196mfra-test-bproxy-01\\u001b[38;5;15m:\\u001b[38;5;6m[\\u001b[38;5;76m~\\u001b[38;5;6m]:\\u001b[38;5;15m echo A\\r\\nA\\r\\n\\u001b[38;5;11mroot\\u001b[38;5;15m@\\u001b[38;5;196mfra-test-bproxy-01\\u001b[38;5;15m:\\u001b[38;5;6m[\\u001b[38;5;76m~\\u001b[38;5;6m]:\\u001b[38;5;15m exit\\r\\n\",\"out_bin\":[]}",
          "EventReceivedTime" : "2019-07-04 15:18:31",
          "out_txt" : "[38;5;11mroot\u001B[38;5;15m@\u001B[38;5;196mfra-test-bproxy-01\u001B[38;5;15m:\u001B[38;5;6m[\u001B[38;5;76m~\u001B[38;5;6m]:\u001B[38;5;15m echo A\r\nA\r\n\u001B[38;5;11mroot\u001B[38;5;15m@\u001B[38;5;196mfra-test-bproxy-01\u001B[38;5;15m:\u001B[38;5;6m[\u001B[38;5;76m~\u001B[38;5;6m]:\u001B[38;5;15m exit",
          "user" : "root"
        }
      }
    ]
  }
}
As you can see, the message field is present and contains a JSON message (serialized as a string). Any help is highly appreciated.
P.S. We have tried to compile the latest tlog version on Debian stretch, and tlog-play fails with an "Out of Memory" error while playing from Elasticsearch. It works perfectly on the same host when playing from a file or the journal.
I can see that although the original "message" has all the fields, the parsed data under "_source" is missing the "in_txt" field for some reason. That field is required by tlog-play. I wonder if something, maybe Graylog, or particular Elasticsearch ingestion settings, is dropping fields with empty string values.
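The gap can be seen by diffing the keys of the embedded "message" JSON against the parsed "_source" document. A minimal sketch, with abbreviated stand-ins for the full hit shown in the curl output above:

```python
import json

# Abbreviated stand-in for the "_source" object above; the real hit
# contains more keys. Note there is no "in_txt" key here, matching
# what Elasticsearch returned.
source = {
    "ver": "2.2",
    "pos": 0,
    "out_txt": "echo A",
    "in_bin": [],
    "out_bin": [],
    # The original record, serialized as a string by the collector:
    "message": json.dumps({
        "ver": "2.2",
        "pos": 0,
        "in_txt": "",      # present in the original message
        "out_txt": "echo A",
        "in_bin": [],
        "out_bin": [],
    }),
}

# Parse the embedded original record and see which of its fields
# never made it into the indexed document.
original = json.loads(source["message"])
dropped = set(original) - set(source)
print(dropped)  # fields present in the original but absent from _source
```

Run against the stand-in above, the diff comes out as `{'in_txt'}`, which matches the field tlog-play complains about.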
We need to improve those error messages and the "Out of Memory" is troublesome news. Could you post separate issues for those two problems, please? Otherwise they'll be forgotten.
Thank you for pointing out the missing in_txt field. I will check with Graylog why it fails to extract this field from the message (where the field is provided).
I have added issue for the "Out of Memory" problem.
Ok. I can confirm that if I manually add the in_txt field to the message, playback works as expected. Graylog is not inserting this field because it is empty ("in_txt":""). So it would be great if tlog could handle this situation, either by adding a placeholder value when there is no text, or by assuming the field is present and empty when it cannot be found in the ES output.
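For illustration, the fallback being asked for could look something like this on the reader side. This is a hypothetical helper, not tlog's actual code; the field names are the ones from the recording format above:

```python
# Hypothetical normalizer: fill in fields that Graylog/Elasticsearch
# may drop when their value is an empty string or empty array.
# Defaults mirror what tlog-rec writes for a packet with no input.
DEFAULTS = {"in_txt": "", "out_txt": "", "in_bin": [], "out_bin": []}

def normalize_hit(source: dict) -> dict:
    """Return a copy of an ES _source dict with missing fields defaulted."""
    fixed = dict(source)
    for field, default in DEFAULTS.items():
        fixed.setdefault(field, default)
    return fixed

hit = {"ver": "2.2", "pos": 0, "out_txt": "A"}
fixed = normalize_hit(hit)
print(fixed["in_txt"])   # "" -- missing field defaulted
print(fixed["out_txt"])  # "A" -- present fields are left untouched
```

`setdefault` only fills in absent keys, so hits that do carry the fields pass through unchanged.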
Can you persuade Graylog to add it regardless?
I have checked different possibilities, but it looks like it is a general problem. Graylog uses a dynamic template to store the various log fields in the Elasticsearch index:
https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-templates.html
"Dynamic field mappings are only added when a field contains a concrete value — not null or an empty array. This means that if the null_value option is used in a dynamic_template, it will only be applied after the first document with a concrete value for the field has been indexed."
This means that an empty field cannot be stored this way...
Sorry, I'm a bit rusty on Elasticsearch, but can you work around this by creating an Elasticsearch mapping before starting logging? We have one in doc/mapping.json.
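A sketch of applying that mapping up front, assuming an Elasticsearch 5.x cluster and an index named "tlog" (host, credentials, and index name are placeholders; depending on the structure of doc/mapping.json you may need to wrap its contents in a "mappings" object for the create-index API):

```shell
# Create the index with tlog's shipped mapping before any documents
# arrive, so field types are fixed instead of dynamically inferred.
# doc/mapping.json comes from the tlog source tree.
curl -XPUT "https://localhost:9200/tlog" \
     -H 'Content-Type: application/json' \
     -d @doc/mapping.json
```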
I have tried to create a separate index for tlog data with a custom mapping, but it looks like it's a general Graylog problem. If I insert data directly into Elasticsearch, everything is OK, but Graylog does not even try to create empty-string fields. I have opened a feature request on the graylog2-server GitHub project, but I do not think they can resolve it quickly.
I see. Thank you for trying this out. Looks like tlog might need to handle the missing fields when reading from Elasticsearch.
@inatec-dh this should be fixed in the latest release, can you confirm?
I want to use tlog in ELK as well, as part of our SIEM. It would be really great if there were a way to parse tlog recordings in ELK.