log4j2-logstash-layout icon indicating copy to clipboard operation
log4j2-logstash-layout copied to clipboard

Streamable interface for SAX-style message rendering

Open mkedwards opened this issue 6 years ago • 10 comments

A step towards the message streaming interface proposed in #17.

mkedwards avatar Oct 10 '18 03:10 mkedwards

I believe this PR is resolved by other means in #17, hence I am closing it. Feel free to poke me if you think otherwise.

vy avatar Nov 23 '18 20:11 vy

Not entirely. It's still useful to have a SAX-style interface by which the Layout can stream JSON events to the Appender.

mkedwards avatar Nov 23 '18 20:11 mkedwards

@mkedwards, ok, then I am re-opening it.

vy avatar Nov 23 '18 20:11 vy

I've rebased in the nexiahome:v0.15-streamable branch, which has all the changes that we're running with locally. Would you like me to pick each out as a separate PR?

mkedwards avatar Nov 23 '18 20:11 mkedwards

I found these two particularly useful:

The rest looks to be nexia-specific to me. Am I missing something else?

vy avatar Nov 23 '18 20:11 vy

The Streamable interface is intended to be generally useful, especially if it's adopted by log4j2 upstream. See https://github.com/nexiahome/formatted_data_message/blob/logstash-streamable/src/main/java/org/apache/logging/log4j/message/lazy/FormattedDataMessage.java#L552 for how we implement streamTo(); I think it's very much in the spirit of the gc-free model.

mkedwards avatar Nov 23 '18 20:11 mkedwards

But now that you have the JsonGenerator streaming mechanism in the LogstashLayout resolver, I think you want https://github.com/vy/log4j2-logstash-layout/commit/a87707fa5e264b59bc7f05b6e1ffa3739d86390b instead of the implementation in this PR.

mkedwards avatar Nov 23 '18 20:11 mkedwards

That, plus the two commits you identified, are indeed all the substance of that branch.

mkedwards avatar Nov 23 '18 20:11 mkedwards

And what I meant was that it's still useful to have a SAX-style interface by which the Message can stream JSON events to the Layout. 😝

mkedwards avatar Nov 23 '18 20:11 mkedwards

Again, context in case it's helpful: we're using this together with some DynamicThresholdFilter machinery that lets us selectively log at debug/trace level for devices/sessions of particular interest. Some of the "tags" we add to log messages are expensive to render, so we want to lazify those as much as possible. Lambda captures are (relatively) cheap, as are the immutable Maps created by Map.ofEntries(). So a trace log might look like this:

          logger.trace(
              MARKER_logMessageFromDevice,
              () ->
                  new FormattedDataMessage(
                      "logMessageFromDevice@15749",
                      "Received message from %(connection_id) (%(device_id)):\n%(message_string)",
                      "DEVICE_MESSAGE",
                      Map.ofEntries(lazy("message_string", message::toString))));

If we can determine from the log-site-specific marker (using MarkerFilter) or the MDC content (using DynamicThresholdFilter) that we don't want to log this line, then the lambda never gets executed at all, and neither the FormattedDataMessage nor any of the arguments to its constructor get created. If it passes the marker and MDC checks, then the lambda gets executed, constructing the FormattedDataMessage; but the message::toString still doesn't get called yet. The FormattedDataMessage implementation is free to use other configuration/context information, as late as the call to getFormattedMessage() (or streamTo()), to decide whether to actually render the value of the message_string tag. (Note also the availability of keys from the MDC, like connection_id and device_id, for interpolation into the human-readable version of the log message.)

This is of course considerably more verbose at the log site than I'd like it to be. We're just in the process of introducing Markers to our code base; the Marker is effectively a lookup key for the log level, message ID, and message type. So I expect to provide syntactic sugar that looks like:

          log(
              ERROR_channelRead_noConnInMap,
              "Received message for missing connection %(bogus_connection_id):\n%(message_string)",
              entry("bogus_connection_id", connectionId),
              lazy("message_string", message::toString));

Here log() will be a static method imported from FormattedDataMessage that takes a (globally registered) Marker, a template string, and a varargs list of Map.Entry<String, ? extends Object> instances (see https://github.com/nexiahome/formatted_data_message/blob/logstash-streamable/src/main/java/org/apache/logging/log4j/message/lazy/LazyMap.java#L28), and wraps it up in a degree of laziness similar to the verbose example above.

mkedwards avatar Nov 23 '18 22:11 mkedwards