flogger icon indicating copy to clipboard operation
flogger copied to clipboard

Refactoring SimpleMessageFormatter

Open cslee00 opened this issue 4 years ago • 4 comments

Hey folks,

Starting to contemplate untangling the various responsibilities in com.google.common.flogger.backend.SimpleMessageFormatter, will fork/branch here (no code created there yet): https://github.com/cslee00/flogger

From an initial review SimpleMessageFormatter has the below responsibilities; expecting that additional responsibilities will be identified as refactoring progresses.

  1. Static utility methods to assist with message formatting;
  2. Message "building" responsibilities inherited from MessageBuilder
  3. Message argument handling responsibilities from ParameterVisitor interface implementation
  4. Creating/formatting contextual information (tags & additional keys)
  5. Dispatch created message to SimpleLogHandler interface
  6. Orchestration of all the above

The tight coupling of these responsibilities prohibits composing solutions that use parts of these capabilities; for example, formatting the message template (w/ arguments) as text but passing all the other data (throwable, tags, etc) along to a back-end to handle.

Goal is to break apart this monolithic class into a few pieces with specific responsibilities such that they can be composed/orchestrated in different manners.

Thoughts / considerations welcome!

cslee00 avatar Mar 17 '20 17:03 cslee00

Sounds like a plan. Note that SimpleMessageFormatter was never intended to be more than it is. It was always known not to be a great general purpose solution (hence the name) so rather than refactor it, I'm more inclined to just make a better replacement for it. It's a shame all the backends decided to use it (I assume for convenience) rather than considering what the best option for them is.

I'd rather have semi-duplicated code in different backends to handle backend formatting expectations than try and create the perfect all singing & dancing formatter for everyone.

Originally formatting was a "frontend" concern and that was a mistake, so I worked hard to move it out to the backend (since a backend might not even do formatting at all and just ship the data out in a protocol buffer or similar). So whatever improvements we come up with need to not assume that a backend must format things (e.g. by adding format specific code to common backend APIs).

A few well placed static helpers might be fine to share/support long term, but each backend could just live with it's own completely separate formatter as far as I'm concerned.

hagbard avatar Mar 24 '20 12:03 hagbard

Thanks for the context. Agree that formatting is a back-end concern and re-use (if any) should be pick-from-available-pieces on a best-effort basis (to make those pieces reusable).

Possible to provide context around the MessageParser constructs ([Default]PrintfMessageParser & DefaultBraceStyleMessageParser)? Presuming these were created for a variety of reasons: a) to strategize the message format as part of migration, as existing code (from other loggers) may have used different formats; b) performance/memory considerations over alternatives such as java.util.Formatter; c) improved error-handling/diagnostic semantics (as this is diagnostic code). That's non-trivial code for Flogger back-ends to duplicate/reproduce where they have a need to format the message.

Curious as to how Google back-ends handle formatted messages (e.g. logger.atInfo().log("My name is %s", name)) in the context of structured logging - is the message formatted and then part of the structure (e.g. JSON w/ discrete 'message' field), are the arguments formatted separately, or ...?

Chris.

On Tue, 24 Mar 2020 at 05:39, David Beaumont [email protected] wrote:

Sounds like a plan. Note that SimpleMessageFormatter was never intended to be more than it is. It was always known not to be a great general purpose solution (hence the name) so rather than refactor it, I'm more inclined to just make a better replacement for it. It's a shame all the backends decided to use it (I assume for convenience) rather than considering what the best option for them is.

I'd rather have semi-duplicated code in different backends to handle backend formatting expectations than try and create the perfect all singing & dancing formatter for everyone.

Originally formatting was a "frontend" concern and that was a mistake, so I worked hard to move it out to the backend (since a backend might not even do formatting at all and just ship the data out in a protocol buffer or similar). So whatever improvements we come up with need to not assume that a backend must format things (e.g. by adding format specific code to common backend APIs).

A few well placed static helpers might be fine to share/support long term, but each backend could just live with it's own completely separate formatter as far as I'm concerned.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/google/flogger/issues/141#issuecomment-603214652, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABCCXPSA5VXYPOWG4EKRZ2LRJCSYBANCNFSM4LNVWQPQ .

cslee00 avatar Mar 24 '20 13:03 cslee00

The parsers were written to handle existing formatting in Google.

Printf format was written with the idea that other people would extend it, but now it seems that that's probably a bad idea. Even if some of the legacy Java format placeholders could have better replacements (e.g. date/time stuff) I'm now convinced that the issues caused by being not 100% compatible with String.format() are not worth it.

I'll leave that there though if someone ever did want to extend it for their own fluent loggers.

The "Brace Style" parser is simplistic and not longer used in Google (it was for awkward cases where a helper class was using {n} style placeholders with a wrapper). I should raise an issue to remove that really since it isn't full MessageFormat syntax and I don't want to maintain it forever.

The parser is far faster than java.util.Formatter for most cases and provides a template which could (in theory) be cached for reuse. However in practice it's not been needed.

The parser API with its callbacks is something I'm expecting to be shared (i.e. take printf message + args and append to some Appendable). I'd want callbacks to be available to support structured logging too. That's about the extent of what I think is worth codifying into an API for this.

Logging tags is painful right now and the code that's there is a hack. Ideally there'd be a Log4J style formatter that can pull out specific tag values and put them in different places. E.g. having a format specifier like

"${timestamp:iso} [${level}]:${tag:status>' '}${message}${tags<' [context='>']'}"

To produce:

2020-03-24:16:52:23.3456 [INFO]:STATUS_TAG This is the formatted message [context=...]
  • Common timestamp formats known
  • Prefix '<' and postfix '>' strings for optional data
  • Clever enough to know that "${tags}" is the remaining tags after any named tags are formatted

Obviously format is very TBD.

Can't really comment on how it's done inside Google I'm afraid.

hagbard avatar Mar 24 '20 15:03 hagbard

And unless it wasn't clear, reusing the Log4J2 stuff would also be fine by me if there are no "show stoppers".

https://logging.apache.org/log4j/2.x/manual/layouts.html

Especially the bits about PatternLayout and MDC stuff.

hagbard avatar Mar 24 '20 16:03 hagbard