hse Add LOG_DOMAIN to logging infrastructure

Domain logging is an idea I ripped from GLib. In GLib, you have a preprocessor macro that you can define called G_LOG_DOMAIN. If you define it, all logs that you emit will be prepended with your custom static string. By default, I think GLib uses the short program name.

In our case, the define is LOG_DOMAIN and you can use it like so:

#define LOG_DOMAIN "HSE/my-domain"

#include <hse_util/logging.h>

int func(void) { log_info("hello world"); }

This log will appear as:

[HSE/my-domain] func: hello world

This should help when grepping for related logs across files and functions.

Signed-off-by: Tristan Partin [email protected]

Aug 12 '22 00:08 tristan957

Not clear to me what this buys us??? The short program name is already output with each hse log messge as well as "[HSE]", so it seems we're adding just another copy of the program name by default? And if someone wants to change that message all log messages will have that custom identifier? So as far as I can tell it only lets you distinguish log messages from one run to another as long as you keep rebuilding with a different name from run to run?

What am I not understanding? Why not make the domain "HSE" by default to avoid duplication of info on each log line? I suppose if you redirect log output maybe you don't get the prog name, timestamp, and all that other stuff the syslog gives us?

Aug 12 '22 13:08 beckerg

Alex made a point the other day that we could maybe just drop all the custom logging code and just purely rely on syslog. I would be pretty happy with that honestly. Mucking with the logging code is no fun.

As far as the making the default log domain HSE, that could work. But then you lose the ability to grep for HSE related logs.

Going with Alex's idea of only using syslog, what if we did the following:

openlog("HSE", ...)
Keep this idea of the log domain just defaulted to an empty string or something like that.
journalctl makes it easy to filter based on PID, so we can use the first argument to openlog() as HSE's ident.

As for what this PR buys us, it is the ability to see related logs. If I define LOG_DOMAIN to be "kvset", then I can see all "kvset" related logs, whereas I don't have that ability right now. We could even adopt a pattern of hierarchical log domains. cn, cn/node, cn/kvset, etc. Then you can grep for a pattern of cn/.* and see all logs related to cn easily. Currently, I don't know how you would grep only for cn-related logs.

Aug 12 '22 15:08 tristan957

If you define LOG_DOMAIN=kvset, then don't all our log messages contain kvset?

Aug 12 '22 15:08 beckerg

I usually just egrep for partial function names that contain the log messages I'm interested in.. This is yet another reason all functions in a module should start with the same uniquely identifying prefix...

Aug 12 '22 15:08 beckerg

If you define LOG_DOMAIN=kvset, then don't all our log messages contain kvset?

Only the ones in that source file.

Aug 12 '22 15:08 tristan957

Ah, now I get it... Which is why I always use unique module prefix which gets your this DOMAIN functionality for free!

Aug 12 '22 16:08 beckerg

Alex made a point the other day that we could maybe just drop all the custom logging code and just purely rely on syslog

Correction: I said I'd like to reduce the layers between log_info() and syslog.

Things that HSE's logging layer provides that I think are useful:

pretty printing merr_t values
ev counters at each log site

I'm ambivalent on the value of LOG_DOMAIN. I've never felt the need to see all log messages from one file.

Aug 12 '22 21:08 alexttx

LOG_DOMAIN can span across files. And we don't currently log the file name either, so you can't even grep by that anyways. Nabeel's whole point in the other PR was that he wanted an easy way to view all logs related to node split. Right now that is impossible, so he has to hack something together. This would be a step in the direction of making this possible. the other step requires better code organization on our part, which has been a thorn in my side for a while!

Aug 12 '22 21:08 tristan957

Here are the docs from GLib on the subject, which I am arguing we should follow too:

Log domains may be used to broadly split up the origins of log messages. Typically, there are one or a few log domains per application or library. G_LOG_DOMAIN should be used to define the default log domain for the current compilation unit — it is typically defined at the top of a source file, or in the preprocessor flags for a group of source files.

Log domains must be unique, and it is recommended that they are the application or library name, optionally followed by a hyphen and a sub-domain name. For example, bloatpad or bloatpad-io.

After reading this, I think the mark should go, and all LOG_DOMAINs should be of the form HSE/xxx/yyy/....

Aug 12 '22 21:08 tristan957

hse hse copied to clipboard

Add LOG_DOMAIN to logging infrastructure

hse
hse copied to clipboard