log
log copied to clipboard
Structured Logging
One of the major points of discussion during the log evaluation was structured logging, specifically the log
crate's relation to slog
. In the review we ended up concluding that there's no reason to not add structured logging to the log
crate and it'd be awesome to unify the slog/log ecosystems!
This issue is intended to serve as a tracking issue for concerns and ideas related to this. For example the specific APIs the log
crate would have, how this would relate to slog
, migration stories, etc.
Really? Awesome. Is that planned to be added before log 1.0.0
release?
The thought I had in the meeting was something like this:
info!("..."; MeaningfulType(information), PossiblyAnother(something_else));
where the currently selected logging implementation receives anything that matches its associated type, and any other types do nothing.
Key-value style structured logging in this paradigm may look something like this, where the kv
macro is provided by the structured logging library to construct the appropriate type:
info!("..."; kv!("a" => a, "b" => b));
EDIT: or obviously it could provide its own macros that replace the ones in log:
info!("..."; "a" => a, "b" => b);
@dpc that's the current intention, yeah, preparing this for the 1.0 release. We'd probably do it in 0.3 if we could, but it's unlikely to be possible without breaking changes! We're of course very eager to hear about everything you've done with slog
and the best way to make progress on this. Want to make sure you're kept very closely in the loop!
I will be happy to share my perspective and experiences during the discussion. Personally, I would be very happy to see as many features from slog
ported to log
so slog
could just reuse log
, or even be rendered completely unnecessary. I've also pinged slog
users in case they are interested - they probably know better than me what do they look for in structured logging.
The whole feature is implemented in slog
, and of course, I tried to make it as good as I could have imagined, so my initial point of view is: just take a look at https://docs.rs/slog/2.0.5/slog/trait.KV.html and https://docs.rs/slog/2.0.5/slog/trait.Serializer.html , and potentially other slog
pieces, and use it as a starting point for discussion. The KV
+ Serializer
are loosely mimicking serde
, just simplified (eg. no sequences support).
Edit: Also, reading the documentation of https://docs.rs/slog/2.0.5/slog/macro.log.html , might give a good overview of features supported. Eg. it's possibe to use just someval
in place of "k" => val
if someval
implements KV
so that somevar
can supply it's own key(s) (useful for logging structs)); or "k" => %v
if v
implements Display
etc.
When we do add structured logging, there's one question I'm wondering about: what do we encourage users to use?
Binary crates choosing either to go full-formatting (like current log
) or full-data is not a problem, but if we continue to support both, what should libraries choose?
I mean having a library output debug information in structured logging will give the end reader more flexibility, but I'm guessing it would also add a bit more overhead as well. If all a binary crate wants is a console stream of readable log messages, is it best to construct the additional structured log records and then just use a default conversion to a string?
I'm also curious what the best practices for this in slog
currently are, if that would be reasonable to expand on? I mean, so far my actual usage of log
has been a leveled println!()
replacement with the extra bonus that I get debug messages from my dependencies as well. Not that this has to be answered here - I should probably just go read the documentation - but it might be useful context for adding an implementation to the log
crate too!
Good question. I'm aware that not everyone buys into the whole structured logging concept. Especially in slog
where more "unusual" features are bundled together, people don't initially see the whole value proposition. Most people are used to "there is one global logger and I give lines of texts" logging and ask "why all this hassle?". I myself became "a believer" once I was working on a heavy-duty production service written in Go, where all the data was logged to json, collected to elastic search cluster where business analytics and monitoring was performed on it.
It would be useful to check experiences of any other programming language community. I'm not aware of any languages that have first-class, core-encouraged, structured logging support, but maybe there are some.
IMO, if structural logging is to be supported, then it should be encouraged, since as far as I can see it is clearly better. Even if you just log a stream on a console, doesn't screenshot on https://github.com/slog-rs/slog look better than a plain wall of text? Without structured logging, you can only distinguish between basics: logging level, timestamp and message, etc. With structured logging, there is much more flexibility. And libraries can't decide for all applications using them, what kind of logging should they need so they should default to structured, as it can cheaply and gracefully be downgraded to plain line of text.
The performance different is negligible. In slog
all structured logging key-value are just one additional pointer to KV
object data. If you do info!("received number: {}", n)
then there will be one "formatting to string" done for you. If you do info!("received"; "number" => n);
and then write it to stream, there will also be one pass of "formatting to string" done for you. As far as I can tell, any logging performance considerations should be mostly: avoid copying and allocation, not doing anything until needed, preparing as much as possible during compilation (static
etc.) and asynchronous logging (making sure passing logging data to another thread is cheap).
Thanks for going over that!
I think initially I was a bit worried that adding complexity to log
would scare new library authors away from adding logging into their libraries, but I think it won't be too bad.
The console logging does look nice! I mean, I would think it'd be nicer with data "inline" with the message, but then that would sacrifice all the rest of the flexibility.
To me,
weak encryption algorithm 'xor' requested by `82.9.9.9:42381` (host: localhost:8080, version 0.5)`
looks better than
weak encryption requested, algo: xor, port: 42381, peer_addr: 82.9.9.9, port: 8080, host: localhost, version: 0.5
... but I imagine having ecosystem-wide structured logging would be excellent for larger applications. The coloring available when data is separated like that isn't too bad either!
I think my ideal logging scheme would be something like info!("received number: {number}", number => n)
, with data both separated out into query-able structured records, and placed inline with the formatting - but that'd probably be a lot of work and complexity in teaching for little overall gain.
In any case, thanks for that!
@dtolnay: I propose a hopefully leaner/less magical implementation than making KV part of a macro as now in the slog
crate. What do you think about the logging function taking a single value-argument of a Serialize
type? The deeper structure or representation of a record is then left to the programmer. It would be quite easy to declare a wrapper type for emitting the logging record as some sort of KV-dictionary. The advantage is that any logging record has a meaningful, possibly varying type. And of course the less magic thing, new developers just starting to use logging would not have to learn the intricacies of another set of macros.
I just wanted to highlight one of the features I like best about slog
which is simply the ability to add structure to the resulting log message as different parts of the application get involved.
For example I have a Hyper Service
. Each Request
is given a unique logger instance early on in the call
function (a simple clone of a root logger, which is cheap in slog
).
Eventually the application accesses the logger and logs something about the Request
it has been working on, such as:
// .. current logger is acquired
info!(logger, "stored widget")
What is interesting however is that prior to this the logger has been extended by some other code (that I may or may not know about):
// .. current logger is acquired
let logger = logger.new(o!("request_id" => "123abc"));
// .. updated logger is made available for others as current logger
The output (assuming JSON formatter) is then roughly:
{"ts":"2017-06-19T02:25:08.521265800Z","level":"INFO","msg":"stored widget","request_id":"123abc"}
If I also added some external code to do authentication it might end up as:
{"ts":"2017-06-19T02:25:08.521265800Z","level":"INFO","msg":"stored widget","request_id":"123abc","user_id":100}
I understand that passing a logger
instance all over the place may seem clunky to some folks but if we can figure a way to maintain this functionality there are a lot of nice benefits that can be gained.
@bradleybeddoes I believe the part you're describing is called "contextual logging". I'm not sure if we should open another issue or consider it a part of this issue.
As long as it's possible to construct a Record
via the public API (which we're going to support), a contextual logging library can be built on top of log without too much trouble I don't think.
I probably don't understand what does "structured logging" mean but seems like people assume that logging automatically means terminal or streaming to a file and then reading that file, so you need it to look nice for that use-case.
My personal interest(I guess I'm selfish) is to keep the cost of the logger as low as possible because in my case(a lot of services on an embedded device, all sending logs over the network), people send thousands of logs per second(actually, this is the amount of logs coming from the device, not from every application separately) and I can't afford adding overhead. Here is the project I'm thinking about here(a bit inactive, sry...): https://github.com/lilianmoraru/dlt-rs
More info about DLT, here: https://at.projects.genivi.org/wiki/display/PROJ/About+DLT-Transport
It would be useful to check experiences of any other programming language community. I'm not aware of any languages that have first-class, core-encouraged, structured logging support, but maybe there are some.
In .NET we have serilog which uses a standard message templating syntax for structured logging.
@nblumhardt and I spent some time a while ago exploring a similar design for Rust. The idea is to log both a textual log message template along with the individual properties and additional context that make up a message. It's up to the particular sink to decide how it consumes the template and the properties, maybe by substituting properties in, composing a new object with a property containing the message, whatever. Libraries don't care where their logs go.
I'll admit I haven't spent a lot of time digging into the state of structured logging in Rust or the API log
exposes besides it's macros for a long time so will go and catch up.
One of the things we explored with emit was keeping very close to the API of the log
crate - just adding a little bit of structured-logging-sauce on top :-)
I.e. here's a typical log
event:
info!("Hello, {}!", env::var("USERNAME").unwrap());
And here's the structured extension along the lines of emit
:
info!("Hello, {}!", user: env::var("USERNAME").unwrap());
The two events would produce exactly the same textual output, but by naming the user
parameter, this would also be available as structured data attached to the event, if the log destination supported it.
Keeping a smooth continuum from completely unstructured, to fully-structured, log events might be preferable to splitting the API into two unrelated subsets, if it could be achieved in a non-breaking way :-)
How is the performance of such structured logging proposals?
"Message Templates" seems like it will involve string parsing/manipulations. Eg. It looks to me that emit
is dynamically allocating HashMap
every time it logs anything: https://github.com/emit-rs/emit/blob/master/src/lib.rs#L144 . Might be OK for languages that are generally not very performant and overallocate all over the place, but probably in Rust niche, it can be a problem.
The way structured logging is done in slog
is very performant, with ability to very quickly offload any formatting and output to different thread(s), with minimal work done by the current thread.
@dpc You're absolutely right that's an important point. And we expect a core Rust crate to cater to those use cases where performance is paramount. But how you capture properties is more of an implementation detail, whether it's a hash map or some kind of linked list or something else. I think we both agree that the latency of logging from a caller's perspective should be absolutely minimal, and the reality in either case is that a sink has the freedom to consume the structured log however it needs for its eventual destination.
The main thing I'd like to maintain is not having to pass a logger into the log calls, or have to carry a logger around in libraries for the sake of structured logging. I think it's still important to support that case, but in my experience it's been most helpful to grab the ambient logger and enrich it with some additional state as necessary rather than chain a logger throughout your application.
I think it would be awesome if we could get something like emit
's syntax working for this, glad you mentioned that nblumhardt.
Perhaps with a proc-macro-hack, or maybe even without, could the log!() macro be made to store a temporary for each argument, then put the arg in both a stack-allocated array of &[(&'static str, &Display)]
and pass it onto format_args!()?
A vec or hashmap could then be allocated by the logging backend if it needed to send the arguments to another thread, or any other permanent storage.
It's pretty straightforward to turn something like {foo: some_variable, bar: "some_literal"}
into &[("foo", &some_variable), ("bar", &"some_literal")]
in a normal macro-rules macro.
I would very much like to keep the "normal" logging path allocation free, and that shouldn't be too much harder to make happen with some structured logging.
chain a logger throughout your application
This is a part of "contextual logging", and is orthogonal to structured logging provided by slog. One can use slog-scope
and not worry about Logger
object if so inclined.
I was mostly wondering about:
log("User {username} logged in from {ip_address}", username, ipAddress)
from https://messagetemplates.org/. It is a nice idea, but I wonder how hard it will be to generate static information at compile time, using stable-Rust macros.
&[("foo", &some_variable), ("bar", &"some_literal")]
What will be the type of it? I was doing something like that in slog v1: &[&'static str, &Value]
where Value
was a trait. The trait was so that users can implement it for other types. But then I switched in slog v2 to just &KeyValuePairs
where KeyValuePairs
is a trait, since it had many benefits: it is possible to implement KeyValuePairs
for any type (arrays, tuple, heterogeneous list, structs, etc.). Key can be generated at runtime, supporting lazy values etc. The common case is of course impl KeyValuePairs for [(&'static str, T)]
. Since KeyValuePairs
provides potentially multiple key-value pairs, it can control order of pairs, or impl KeyValuePairs
for combination of contexts etc.
Now that I looked at https://messagetemplates.org/, I kind of wish I did it as so in slog. Though I still don't know how to write a macro for it.
@dpc
For basic logging, this is already somewhat supported with format_args!()
. Any custom macro would just have to send arguments to that correctly, and then also be able to use them to create an array.
This works in today's log:
info!("a {whatisit} message! {ending}", whatisit="keyword", ending=":)");
http://play.integer32.com/?gist=09caba59aeed61eb13ec7c3261b7c97c&version=stable
Hmmm... I liked the fact that one does not have to repeat itself in log("User {username} logged in from {ip_address}", username, ipAddress)
but maybe it is even better, since it's not position dependent.
Also, how to support "additional" keyvalues? Eg.
info!("a {whatisit} message! {ending}", whatisit="keyword", ending=":)", request_id=req_Id);
The request_id
is assigned to the key-value pairs, but is not actually used in textual message itself. Sometimes the number of additional data might be quite big, and not everything is neccessary in the text information - some might be mostly useful for robotos.
Somehow the macro would have to make distinction between the values used in the text message, and addition ones, as format_args!()
complains in any argument was not used (rightly so). Could be eg. additional separator.
info!("a {whatisit} message! {ending}", whatisit="keyword", ending=":)"; request_id=req_Id);
Also, what are the preferences for the delimiter between key and value? Slog uses =>
but it is kind of heavy. Would:
info!("a {whatisit} message! {ending}", whatisit: "keyword", ending: ":)"; request_id=req_Id);
be fine with everyone? Or are there any alternative preferences/ideas?
As far as I can tell, neither emit
or slog
seem to support arbitrary types as structured values (see emit's Value enum, slog's Serializer trait). I found myself wanting to log e.g. a UserContext struct but could not do so. I'd be very happy to see support for any value with a serde::Serializable implementation, or some similar trait. Of course as a workaround you can convert any struct to JSON and store that.
Regarding the logging macro syntax bikeshedding, I quite like keeping format_args!
as a subset, so using =
between key and value for string formatting. I slightly prefer ;
over ,
between string formatting and structured arguments. I quite like :
between key and value for the structured arguments, to match the struct literal syntax, but =>
and =
are also OK.
To summarise, I like:
info!("{} {msg}", "Hello", msg="world!"; k: v);
A lot of people asked for structured types in slog. Sadly it has many challenges. I didn't feel up to the task and just left it out. I think it's a complex subject, and maybe we could use a different github issue so we keep this one focused on the core structured logging.
At first I didn't really like the syntax of separating replaceable properties from other properties, because I see the message as just a view over the properties. But thinking about it, rejecting logs at compile time with messages that aren't properly replaceable sounds useful.
What's the status here?
Anything that can be done to move this forward?
The next step would be to design the API. In particular, we need to figure out what the interface to the structured data is. We want the logging codepath to be allocation-free, so it'll be some object-safe trait. It might end up looking a bit like rustc-serialize's Encodable
? We also need to figure out what the macro syntax would look like.
slog
also supports adding environment to the current logger, even if the logging statement is unaware of it. This way, you can e.g. log stuff from your packet parsing routine, but have a source IP address annotated because the surrounding code added it to the environment. https://github.com/slog-rs/slog/issues/124 is an issue about slog
seemingly needing allocations to do that (I lost track of that issue), but it would be nice if the log
crate could do the same without any allocation overhead.
fn decode(data: &[u8]) -> Result<Packet, Error> {
if data[1] != 0 {
warn!("padding not zeroed");
}
unimplemented!();
}
fn on_receive(from: IpAddr, data: &[u8]) {
with_kv!(from: from, {
let packet = decode(data);
// [...]
});
}
This way you can add annotation to log statements that aren't even in your library. (I don't remember the exact syntax slog
used, this example was just made up.)
It would be great to have the following optional notation
info!("..."; a, b);
This could men the same as
info!("..."; "a" => a, "b" => b);
See also https://github.com/rust-lang/rfcs/pull/2173 for an RFC about "A quick dbg macro"
BTW. Structured types have been added to slog behind a feature flag. https://docs.rs/slog/2.1.1/slog/trait.Serializer.html#method.emit_serde . I'm not sure if that's the best way to do it, but since people continuously asked about it, we agreed on some design and here it is.