Automatic format for json log files

Open tvld opened this issue 6 years ago • 22 comments

First time installing lnav, and I'm a bit shocked it can't show a very simple log format:

{"time":"2019-04-11 15:13:17", "level":"alert", "msg":"Hello"} 
{"time":"2019-04-11 14:33:58", "level":"error", "msg":"Connection failed"}

I think it's an omission that you cannot browse this file in a nice way without needing to learn a complex format file with regex stuff and other tricks. Or am I missing something, and is there some button to simply browse JSON files?

tvld avatar Apr 15 '19 08:04 tvld

Is there a service that outputs this format of JSON logs? If so, it can be added as a default format.

with regex stuff

The format file for a JSON log does not require any regexes.

It's not handled automatically because lnav would have to make a lot of guesses as to the meaning of fields. And, at the end of the day, a format file can be created without too much trouble and shared widely. So, there's not too much incentive to write complicated code to try and guess what's what.

I would say this is more of a request to make it easier to create format files.

tstack avatar Apr 15 '19 14:04 tstack

@tstack I think default JSON formatting would be useful if the log is detected in JSON format and no formatter is defined - for example just printing every field value with a predefined separator, starting with time and level if found. Eg.

2019-04-11 15:13:17|alert|Hello|Field 2 value|Field n value
2019-04-11 14:33:58|error|Connection failed|Field 2 value|Field n value

I think this would also give new users a starting point for defining their own formats. Starting from scratch with JSON logs can be involved, as there are not a lot of examples.
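To make the suggested fallback concrete, here is a minimal Python sketch of that rendering. It is not lnav code and not how lnav behaves; `render_line` is a hypothetical helper, and the field names `time` and `level` are assumptions taken from the sample log in this issue.

```python
import json

def render_line(raw: str, sep: str = "|") -> str:
    """Render one JSON log line as separator-joined values, time and level first."""
    record = json.loads(raw)
    if not isinstance(record, dict):
        raise ValueError("expected one JSON object per line")
    # Assumed field names; real logs may use e.g. "timestamp" or "@timestamp".
    leading = [key for key in ("time", "level") if key in record]
    # Every other field follows in the order it appears in the record.
    rest = [key for key in record if key not in leading]
    return sep.join(str(record[key]) for key in leading + rest)

sample = [
    '{"time":"2019-04-11 15:13:17", "level":"alert", "msg":"Hello"}',
    '{"msg":"Connection failed", "level":"error", "time":"2019-04-11 14:33:58"}',
]
rendered = [render_line(line) for line in sample]
for line in rendered:
    print(line)
```

The second sample line has its keys in a different order to show that time and level are still moved to the front.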

@tvld Save the JSON below as a format file, install it with lnav -i path/to/file.json, and then make sure your log file name ends in .json or change the file-pattern field accordingly.

{
  "json_log": {
    "title": "JSON log",
    "description": "Generic JSON log",
    "json": true,
    "file-pattern": ".*\\.json$",
    "level-field": "level",
    "timestamp-field": "time",
    "value": {
      "msg": {
        "kind": "string"
      }
    },  
    "line-format": [
      {
        "field": "time"
      },
      "|",
      {
        "field": "level"
      },
      "|",
      {
        "field": "msg"
      }
    ]
  }
}

s1monj avatar Apr 15 '19 15:04 s1monj

@s1monj Super... thank you :-) I have it working now with :

{
  "json_log": {
    "title": "JSON log",
    "description": "Generic JSON log",
    "json": true,
    "file-pattern": ".*\\.json$",
    "level-field": "level",
    "level" : {
      "critical" : "error",
      "error" : "warning",
      "warning" : "alert",
      "info": "info"
    },

    "timestamp-field": "timestamp",
    "value": {
      "msg": {
        "kind": "string"
      }
    },
    "line-format": [
      {
        "field": "timestamp"
      },
      " ",
      {
        "field": "level",
        "min-width": 6
      },
      {
        "field": "msg"
      }
    ]
  }
}

That said, I might not have been able to do this without your help... so some json example in the doc and a jumpstart for standard json files would have been great )))

tvld avatar Apr 15 '19 16:04 tvld

I think this is a useful feature. I'm just looking into macOS unified logging. It can stream out in JSON format, so a default reader for that would be great. As a small detail, it timestamps in universal time format, e.g. I get both of the following depending on whether it is user or system context: 12:01:02 34.567890+0000 or 13:01:02 34.567890+0100. Does, or CAN, lnav handle that? (Rather than flagging lots of time sequence errors.)

cw1nte avatar May 24 '19 13:05 cw1nte

Follow up on https://twitter.com/will_sargent/status/1147666231429779456

Here's a gist with the log and the logstash.json format https://gist.github.com/wsargent/9bfc52badcaa1170488bb66e45c8d778

wsargent avatar Jul 07 '19 20:07 wsargent

@wsargent It looks like the log messages are not on a single line, is that right? Currently, lnav requires each JSON log message to be on a single line.

So, instead of this:

{
  "id" : "Fa7lbgUZkrE6O0Qbm7EAAA",
  "sequence" : 1,
  "@timestamp" : "2019-07-06T21:22:43.879+00:00",
  "@version" : "1",
  "message" : "I like to do stuff",
  "logger_name" : "example.Main$Runner",
  "thread_name" : "pool-1-thread-1",
  "level" : "INFO"
}
{
  "id" : "Fa7lbgUZkrQ6O0Qbm7EAAA",
  "sequence" : 2,
  "@timestamp" : "2019-07-06T21:22:43.882+00:00",
  "@version" : "1",
  "message" : "I am a warning",
  "logger_name" : "example.Main$Runner",
  "thread_name" : "pool-1-thread-1",
  "level" : "WARN"
}

Need this:

{"id":"Fa7lbgUZkrE6O0Qbm7EAAA","sequence":1,"@timestamp":"2019-07-06T21:22:43.879+00:00","@version":"1","message":"I like to do stuff","logger_name":"example.Main$Runner","thread_name":"pool-1-thread-1","level":"INFO"}
{"id":"Fa7lbgUZkrQ6O0Qbm7EAAA","sequence":2,"@timestamp":"2019-07-06T21:22:43.882+00:00","@version":"1","message":"I am a warning","logger_name":"example.Main$Runner","thread_name":"pool-1-thread-1","level":"WARN"}
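If the logger cannot be reconfigured, one workaround is to convert the file before opening it. The following is a rough Python sketch of such a pre-processing step, done outside lnav; `to_json_lines` is a hypothetical helper name, and it assumes the input is a sequence of whole JSON values separated only by whitespace.

```python
import json

def to_json_lines(text: str) -> list[str]:
    """Split concatenated (possibly pretty-printed) JSON values into one compact line each."""
    decoder = json.JSONDecoder()
    lines = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # Decode one value starting at pos and get the index just past it.
        value, pos = decoder.raw_decode(text, pos)
        lines.append(json.dumps(value, separators=(",", ":"), ensure_ascii=False))
    return lines

pretty = """{
  "sequence" : 1,
  "level" : "INFO"
}
{
  "sequence" : 2,
  "level" : "WARN"
}
"""
compact = to_json_lines(pretty)
for line in compact:
    print(line)
```

Malformed input raises `json.JSONDecodeError` at the first value that cannot be parsed, so a truncated record at the end of a file being written will surface as an error rather than being silently dropped.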

tstack avatar Jul 07 '19 21:07 tstack

That’s kind of a problem given JSON can have variable formatting.

wsargent avatar Jul 07 '19 22:07 wsargent

That’s kind of a problem given JSON can have variable formatting.

I thought most/all JSON logs were line-oriented since not all JSON parsers can handle multiple JSON values. The most compatible format would allow the log processor to read a line to get the JSON value and then feed it to a parser.

Are you able to reconfigure logstash to output the JSON on a single line?

tstack avatar Jul 08 '19 04:07 tstack

I thought most/all JSON logs were line-oriented since not all JSON parsers can handle multiple JSON values. The most compatible format would allow the log processor to read a line to get the JSON value and then feed it to a parser.

Yes, this is NDJSON or jsonlines. The issue is:

  • Pretty printed JSON is a known logging format (logstash has both a json codec and a jsonlines codec, for example)
  • There's no error showing why this isn't "JSON" even when the "json": true field is set
  • There's no way to directly specify a format.
  • People (especially developers) are going to pretty print json logs and expect them to work the same.

wsargent avatar Jul 08 '19 20:07 wsargent

  • Pretty printed JSON is a known logging format (logstash has both a json codec and a jsonlines codec, for example)

It may be known, but jsonlines is going to be more compact and compatible with the most consumers. So, it seems more likely to be used. Given that, supporting pretty-printed JSON is a low priority compared to other stuff and given my limited time. I won't reject a pull-request for it, but I'm not going to spend time on it. (To be clear, I'm not talking about the original request in this bug, which I do think is a priority.)

  • There's no error showing why this isn't "JSON" even when the "json": true field is set

Displaying why any log format could not recognize a file would be a good thing in general.
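Until something like that exists, a file can be checked by hand. This is a small Python sketch, independent of lnav, that reports the first line that is not a self-contained JSON object; `first_bad_line` is a hypothetical name, and the one-object-per-line rule is the requirement described earlier in this thread.

```python
import json

def first_bad_line(lines):
    """Return (line_number, reason) for the first line that is not a JSON object, else None."""
    for number, raw in enumerate(lines, start=1):
        if not raw.strip():
            continue  # ignore blank lines
        try:
            value = json.loads(raw)
        except json.JSONDecodeError as exc:
            return number, f"{exc.msg} at column {exc.colno}"
        if not isinstance(value, dict):
            return number, "line is valid JSON but not an object"
    return None

# A pretty-printed record fails on the line that holds only the opening brace.
pretty_printed = ['{', '  "level" : "INFO"', '}']
print(first_bad_line(pretty_printed))
```

For a real file, the same function can be called with an open file handle, since iterating over it yields one line at a time.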

  • There's no way to directly specify a format.

This is not the issue. lnav works on a line-by-line basis. Enhancing it to deal with JSON that crosses line-boundaries is the challenging part.

  • People (especially developers) are going to pretty print json logs and expect them to work the same.

lnav can already format the JSON into something more readable than pretty-printed JSON. So, there is no reason to have the log files themselves be pretty-printed.

tstack avatar Jul 12 '19 15:07 tstack

Jumping into this conversation: I think it would be useful to support the GELF format [1]. It's a JSON log format used by Graylog and others. I think it should be easy to add it to the supported default formats.

[1] : https://docs.graylog.org/en/3.2/pages/gelf.html

abate avatar Apr 09 '20 23:04 abate

CloudTrail (AWS) is another example of JSON-formatted logs; it would be really good to have this.

https://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-event-reference.html

B0073D avatar Jun 21 '22 01:06 B0073D