`seqcli ingest` extract embedded JSON values from plain text logs

Open nblumhardt opened this issue 3 years ago • 1 comments

Discussed in https://github.com/datalust/seq-tickets/discussions/1536

Originally posted by mkvonarx, March 31, 2022

Hi

We're using Serilog for our logging. Unfortunately, we cannot use the CLEF JSON format in production. Instead we use a more human-readable format with # to delimit the values and a {Properties} at the end to also output all remaining Serilog properties. Our format string looks almost like this: "outputTemplate": "{Timestamp:yyyy-MM-dd HH:mm:ss.fff} # {ThreadId,3} # {Level:u3} # {SourceContext,-30} # {Message:lj} # {Properties} # {Exception}${NewLine}"

We're trying to ingest these logs into Seq with "seqcli ingest -x" and it mostly works fine. The only thing we cannot ingest nicely is the {Properties} part. This is JSON embedded in the human-readable text, and seqcli ingest does not have a JSON parser according to https://docs.datalust.co/docs/command-line-client#extraction-patterns

Is there any way to ingest the {Properties} with seqcli ingest as anything other than a string?

Having a fifth built-in property, e.g. @p, that directly understands JSON such as {Properties} (alongside @t etc.) would probably be the nicest solution. Or maybe something like `seqcli ingest -x "... {:json} ..."` could also work.
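
To make the request concrete, here is a minimal Python sketch of the desired outcome. It is not seqcli code. The sample line, its values, and the names `LINE`, `FIELDS` and `parse_line` are invented for illustration, and it assumes that `#` never occurs inside a field and that {Properties} is rendered as valid JSON.

```python
import json

# Hypothetical line in the "#"-delimited format described above; all values are invented.
LINE = (
    '2022-03-31 10:15:02.123 #   7 # INF # MyApp.Services.OrderService     '
    '# Order 42 shipped # {"OrderId": 42, "Customer": "alice"} # '
)

# Field names follow the order of the placeholders in the output template.
FIELDS = ["timestamp", "thread_id", "level", "source_context",
          "message", "properties", "exception"]


def parse_line(line):
    """Split one log line into named fields and decode the properties as JSON.

    Assumes "#" never appears inside a field, which real messages may violate.
    """
    parts = [part.strip() for part in line.split("#")]
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    event = dict(zip(FIELDS, parts))
    # This is the step the question is about: a structured value, not a string.
    event["properties"] = json.loads(event["properties"])
    return event


event = parse_line(LINE)
print(event["properties"]["OrderId"])
```

With the properties decoded this way, each key becomes an individual value rather than one opaque string, which is what we would like to see on the event in Seq.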

nblumhardt avatar Apr 01 '22 03:04 nblumhardt

https://github.com/datalust/superpower/blob/dev/sample/JsonParser/Program.cs contains a token-based JSON parser using the framework also used for extraction patterns. The text-based string and number parsers there handle the thorniest aspects of properly parsing JSON, though - creating a text-based JSON matcher that builds on them should be easy enough.

The matcher would probably need to disallow leading whitespace, to avoid partial matches on non-JSON content.
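
As a rough illustration of that rule, here is a minimal sketch in Python rather than Superpower. It swaps in the standard library's `json.JSONDecoder.raw_decode` for the text parsers mentioned above, and `match_json` is a hypothetical name, not part of seqcli or Superpower.

```python
import json

_DECODER = json.JSONDecoder()


def match_json(text, pos=0):
    """Try to match a JSON value starting exactly at text[pos].

    Returns (value, end_index) on success, or None when no JSON value
    starts at that position.
    """
    # Refuse to skip leading whitespace: the value must begin right here.
    if pos >= len(text) or text[pos].isspace():
        return None
    try:
        # raw_decode parses one value and reports where it ended,
        # leaving any trailing text unconsumed.
        return _DECODER.raw_decode(text, pos)
    except json.JSONDecodeError:
        return None


print(match_json('{"a": 1} # tail'))
print(match_json(' {"a": 1}'))
print(match_json('not json'))
```

The first call matches the object and stops before ` # tail`, so the rest of an extraction pattern could continue from the returned index. The other two calls return `None`.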

nblumhardt avatar Apr 01 '22 08:04 nblumhardt