log Tracking issue for kv_unstable (structured logging)

trafficstars

Tracking issue for the implementation of #296. See also #149

[x] Initial implementation (#324)
[ ] Integration with frameworks and loggers:
- [ ] slog
  - [ ] Zero-allocation conversion between slog::Value and log::kv::Value. We should be able to do zero-allocation conversion from log to slog, but may need to allocate from slog to log.
- [ ] tracing (https://github.com/tokio-rs/tracing/issues/341)
  - [x] Zero-allocation conversion between tracing::Value and log::kv::Value in the event! macro (#339)
  - [x] Capture std::error::Errors as Values
- [ ] fern
- [ ] async-log
- [ ] std-logger (https://github.com/Thomasdezeeuw/std-logger/pull/14)
  - [x] Some way to tell whether a Source has any key-value pairs (#329)
- [ ] femme
- [ ] env_logger (https://github.com/sebasmagri/env_logger/pull/137)
- [x] antlog (an experimental wrapper to prove out requirements)
  - [x] A way to get specific values (#335)
  - [x] A way to coerce a value to a common type (#379)
- Please suggest others to track!
[x] Explore macro support (#343)
- [x] log!("msg {}", interpolated, { key: value }) (#353) - We ended up having to revert this in log, but are experimenting externally
- [x] log_kv_macro
- [x] log_macros
[x] Integration with serialization frameworks:
- [x] serde
  - [x] Conversion between serde::Serialize and std::fmt::Debug (https://github.com/rust-lang/rfcs/pull/2696). See serde_fmt.
- [x] sval (#342)
[ ] Micro benchmarks
[ ] Documentation and examples
[ ] Stabilization review
[ ] Open questions
- [ ] Should Source::count return a Result<usize> instead of a plain usize?

Apr 29 '19 04:04 KodrAus

@KodrAus thanks for all the work you've put into this (and linking to this issue from a forum post).

I was curious: is there a release planned that includes the beta version of kv? I actually wanted to use the beta for async-log, but given there wasn't any release yet it wasn't possible :sweat_smile:

Jul 02 '19 12:07 yoshuawuyts

Hi @yoshuawuyts! :wave:

Right now any release of log we publish will include this API under the kv_unstable feature gate, but we're not ready to commit to backwards compatibility just yet. So if you depend on log from git then you can access the API:

[dependencies.log]
git = "https://github.com/rust-lang-nursery/log.git"
features = ["kv_unstable"]

I think what we need are a few draft PRs in some of these log frameworks that tease out any breaking changes we should make, and then will be able to publish a release that replaces the kv_unstable feature gate with a kv one.

Jul 04 '19 02:07 KodrAus

Right now any release of log we publish will include this API under the kv_unstable feature gate,

Oh oops, maybe I was being a bit vague. The last release on crates.io was done in October 2018, which means currently no releases ship with kv_unstable on crates.io. Which in turn (I believe) means we can't ship any versions to crates.io.

In order to help test this feature, it would be nice if there was a release on crates.io with the ["kv_unstable"] feature available. For experimental technology such as async-log I think we can get away with publishing a version that directly depends on this feature, and keep updating it as it's polished.

So my question is: could a version be of log be published to crates.io that includes the experimental flag?

P.S. Thanks for taking time out of your day to reply to my questions!

Jul 04 '19 09:07 yoshuawuyts

In order to help test this feature, it would be nice if there was a release on crates.io with the ["kv_unstable"] feature available. For experimental technology such as async-log I think we can get away with publishing a version that directly depends on this feature, and keep updating it as it's polished.

Ah this sounds fair enough. I think we can get pretty far without needing to break anything, but it's probably going to happen at some point. But right now it is harder for folks to get started with these new APIs, so once #339 lands I'll put together a release so you can depend on it from crates.io and browse the docs on docs.rs.

Jul 06 '19 01:07 KodrAus

@KodrAus in the top-level post there's an entry about "Explore macro support". Do you have any plans to experiment with this?

We've integrated key-value logging in https://github.com/lrlna/femme/pull/1; all that's missing now is a way to create key-value pairs and send them to a logger. Even a small code example would be helpful, but ideally there'd be a small log crate that could work with this.

Also: is it perhaps useful to open a separate issue to track/discuss logging macros?

edit 2019-07-21: Ah, I think I figured out how to log a kv-pair!

let record = log::Record::builder().key_values(kv_pairs).build();
log::logger().log(&record);

Jul 21 '19 14:07 yoshuawuyts

Ah yes that Record::key_values method is the way you can attach key-value pairs to a record in the absence of macro support. It needs to be a &dyn Source.

Opening a separate issue to design macro support sounds like a good idea! We’ve got a couple of constraints but also plenty of opportunity to improve things.

Jul 21 '19 21:07 KodrAus

Macro issue created: https://github.com/rust-lang-nursery/log/issues/343

Jul 22 '19 10:07 yoshuawuyts

Created a crate for shared types in key-value logs (to mark pairs of logs) -- https://github.com/yoshuawuyts/log-types. Not sure yet how useful this is, but I think it might be a fun reference of how to create source/key/value structs, and probably a useful experiment (:

Sep 01 '19 09:09 yoshuawuyts

Having some standards around well-known key-value pairs is something I've also been exploring in a little wrapper. It seems like it'll be useful for a more composable logging ecosystem.

Sep 01 '19 22:09 KodrAus

@KodrAus Yeah for sure! -- I wonder which others to add; I only really thought of spans::{start,end} and http::{request,response}. Though I guess timestamps, host, ip might be common too?

Oh also, I saw femme is not listed yet; could we perhaps add it to the tracking issue?

Sep 01 '19 23:09 yoshuawuyts

I'm hesitant to add to the list of todos because I really want this feature, but it might be worth adding fern as another crate to integrate with. I don't use it (I use slog), but it seems like it has quite a lot of usage (600,000+ all time downloads on crates.io).

Sep 17 '19 13:09 atroche

Should Source::count return a Result<usize> instead of a plain usize? What if we move the Source::count to a separate trait:

trait SourceCount: Source {
    fn count(&self) -> usize;
}

Rationale is it looks more reasonable to just make it a type-system question whether a certain type has a notion of a count of it's kvs or not.

And if that's not the desired use case for making count return Result<usize> - then what is? What do you see as possible error values? Why not just Option<usize>? Some lazy-evaluated kv sources might not have count known upfront, i.e. something like processes list (ps ax) might be collected on demand.

Oct 16 '19 22:10 MOZGIII

What if we move the Source::count to a separate trait

@MOZGIII unfortunately a separate trait wouldn't really work because we'd lose the implementation when erasing as dyn Source.

What do you see as possible error values?

The desire for Result would just be to be able to pass through any errors from the default implementation that calls the fallible visit_pair method. But Option seems worthwhile!

Some lazy-evaluated kv sources might not have count known upfront

Hmm, I don't think the current implementation really supports this kind of scenario because pairs can be visited and fetched at any point. I also don't think it's really desirable to support either, because pairs may be inspected and visited multiple times any expensive work or externally changing information could be confusing to end consumers.

Oct 17 '19 00:10 KodrAus

Given all that, I'm having hard time imagining a case when the count would be None (or Err). Do you have an example in mind? How is the count used? Is it like iterator's size_hint?

Oct 17 '19 09:10 MOZGIII

Hmm, I don't think the current implementation really supports this kind of scenario because pairs can be visited and fetched at any point. I also don't think it's really desirable to support either, because pairs may be inspected and visited multiple times any expensive work or externally changing information could be confusing to end consumers.

From how the code is set up, it looks fine: the Source can be used at any time to collect values via Source::visit. There is an implementation note on visit fn A source should yield the same key-value pairs to a subsequent visitor unless that visitor itself fails. that advocates against sources being dynamic. After looking around the documentation more, I found the uses for Source, and it makes a lot of sense for it to be required to not be dynamic. :+1:

Oct 17 '19 10:10 MOZGIII

I've spent a little time looking at how we could support attempting to coerce structured values to concrete types. This is something that would make logging frameworks that want to enrich log::Records with their own well-known types nicer. There's a PR in #379

Once we have that and #378, we can add support for std::error::Error to structured values too. It might look something like:

impl<'v> Value<'v> {
    pub fn from_error<E>(err: &'v E) -> Self
    where
        E: Error + 'static;

    pub fn get_error(&self) -> Option<&(dyn Error + 'static)>;
}

Jan 31 '20 05:01 KodrAus

I see there's a 'kv_unstable' feature flag in the latest version of the log crate but it's unclear how to interface with in from an application. Is this not something we can do today?

Jun 28 '20 22:06 softprops

I found this commit which seems like it reverted an interface users could leverage is https://docs.rs/kv-log-macro the crate users should be guided towards?

Jun 28 '20 22:06 softprops

@softprops yeah, we're still in the process of figuring out the macros. Currently kv-log-macro is one of the most convenient ways to use the kv feature, but the goal is to eventually have this functionality in log's macros itself.

Jun 30 '20 07:06 yoshuawuyts

Have we figured out other ways to preserve states? The current log!("", { key: value }) will be quite duplicated when the same { key: value } needs to be reused in different scopes.

Jun 30 '20 14:06 pickfire

@yoshuawuyts and I discussed the path to stabilization for the kv support and came to the conclusion that there's really not much blocking calling the kv module stable and continuing to work on it without breakage. I think we've explored enough of the design space to cover macro support and integration with tracing properly.

What about macros?

We don't currently have macro support in log itself, but since we're bound to std::fmt the macros we can provide in here in a backwards compatible way can't be as capable as what we produce externally. We will still provide macro support in log itself, but that doesn't need to block removing the kv_unstable feature gate.

Right now, if you want macro support you can check out kv_log_macro. It's an almost drop-in replacement for the existing macros.

I'm also experimenting with something completely greenfield for macros that I hope to talk about more when it's fleshed out.

So what's left?

I'd like to get #396 merged, which lets us capture values using a trait bound like impl Display + 'static but still determine at compile-time whether the actual value is one of a number of primitives like i32. Until 1.46.0 is released in August it'll have a small runtime hit though. I think that's the last breaking behavioural change we need to make then can do a thorough API review.

What's after stabilization?

@yoshuawuyts brought up the possibility of adding logging/tracing support into std, which I think is worth exploring. I've been framing it as a structured/contextual enhancement to std::fmt but going down that route might be the best possible way to make the existing log macros natively structured.

Jun 30 '20 23:06 KodrAus

What's the story for the kv visitor trait and serde. It seems today the visit api gives you a access to a Key and Value. You can get the &str value of a Key but values are a bit opaque. I know that they implement Display and Debug but i'd rather depend on their serde::Serialize impl when serializing them out.

Jul 06 '20 05:07 softprops

Hey @softprops 👋

It would be pretty unfortunate for structured values to only support the unstructured Debug or Display outputs. I wrote a bit about serde in https://github.com/rust-lang/log/issues/388#issuecomment-629572564.

The gist of it is we currently support serde in a roundabout way using a different structured framework, but should definitely add first-class serde support using erased-serde (we could fall back to the alternative framework when alloc isn’t available). There’s nothing blocking this and it can be implemented at any time.

Jul 06 '20 13:07 KodrAus

@KodrAus that's awesome! Is anyone actively working on this? I'd love to help but am not caught up on all the current and prior discussion.

Jul 06 '20 13:07 softprops

@softprops As far as I know it’s not something being actively worked on yet. If you’d like to take a look that would be great! Otherwise it was next on my list after #396, since that changes the shape of the value internals a bit.

To add serde support we’ll effectively add the following public API:

impl<‘v> Value<‘v> {
    pub fn from_serde(value: &’v (impl serde::Serialize + ‘static)) -> Self;
}

impl<‘v> serde::Serialize for Value<‘v> {}

The things we’d need to do internally would be:

Add optional dependencies on erased-serde and serde-fmt
Add a kv_unstable_serde feature that pulls in the std feature, sval/serde, serde-fmt and serde
Add a new kv::value::internal::serde module, probably following a similar structure to internal::sval
When both kv_unstable_sval and kv_unstable_serde are available, we can use sval‘s serde bridge to implement the Visitor::sval method in the serde module and the Visitor::serde method in the sval module
Use serde-fmt to implement the Visitor::debug and Visitor::display methods in the serde module
Add a new Serde variant to internal::Inner
Add a new serde method to internal::Visitor
Add kv_unstable_serde to our GitHub Actions on its own and combined with kv_unstable_sval

But if anybody waned to dig in I’d be happy to help! The best place to reach out would probably be on Zulip.

Jul 12 '20 21:07 KodrAus

This sounds great. I'll be sure to drop a note here if I start anything.

Jul 12 '20 21:07 softprops

I had some issues getting 0.4.11 structured logging working. I've written my own logging macros because I have some specific requirements, and I'm also writing my own Logger to receive all this data because the outgoing logging channels I have to interface to are custom. I chose to put it all through log crate because I also want to capture unstructured logging. Some observations/questions:

If the Visitor only has one method, it would be better to just replace it with a FnMut, making Logger code simpler
However right now to get a binary value out, the Visitor has to call various different to_* methods in turn on the Value to find what type it contains, so wouldn't it be better to turn that around and have several different methods on the Visitor which would be called according to the type of the value? e.g. visit_pair_u64, visit_pair_str, etc. The types handled by OpenTelemetry for example are limited to i64, bool, f64 and str, plus arrays and tables. So ignoring the nested values for the moment, and adding u64, that requires just five visitor methods to cover it.
It might be good to point people to the fact that Value implements Display, because for text-log output, I started out by querying the different to_* methods and formatting for each one. However this didn't work because I couldn't find a way to get a value created with Value::from_display out using the to_* methods. So maybe it would be best to have two visit calls, one for text-style output, which just takes a FnMut and gives it a &str for each value (which would point to either the original value or else an internal temporary format buffer), and another for getting binary data out, which has a Visitor with multiple entry-points (for i64/u64/f64/bool/str/etc). I haven't implemented binary output yet, but it seems like right now I'll have to try various to_* methods in turn, and then finally try a format!("{}", val). This doesn't seem very efficient. (The log crate is advertised as "lightweight".)

I was trying to understand the current code. There is a lot of dynamic dispatch going on, and a lot of type machinery. Is all this necessary? If it is, then fine. However did you consider something like a HCons-style approach? i.e. construct a nested type which would result in all the heterogenous keys/values being stored sequentially in stack memory with minimal padding (e.g. KvInt(("key", 123, KvStr(("key2", &strvar, KvEnd))))), and let the compiler build a single visit function for it all that accepts a Visitor and visits them all (where the Visitor has an entry-point for each fundamental supported type). That would make good use of static knowledge from the callsite to optimise at least the first level of processing. (Nested arrays or tables would need some kind of dynamic approach.) I can provide some prototype code if this isn't clear.

Oct 08 '20 20:10 uazu

Actually, thinking about it some more, would it be possible to make it work with a closure like this:

logger.log(&Record::builder()
           .whatever(...)
           .key_values(|visitor| {
               Value::from(&myvar1).visit(visitor, "key1");
               Value::from(&myvar2).visit(visitor, "key2");
               /// etc
           })
           .build());

That way, assuming all the Value-related calls are marked #[inline] it should all get inlined to produce flat code that just executes a series of calls on the visitor instance. This assumes that the Visitor has separate entry points for major types (u64, i64, f64, bool, str). The closure will build a hidden structure on the stack which contains the references to all the variables being logged. So this should be really efficient and lightweight. I could prototype this if it is of any interest.

Oct 09 '20 10:10 uazu

Hey @uazu! :wave:

Sorry for taking so long to circle back and thanks for putting together your thoughts 😃

If the Visitor only has one method, it would be better to just replace it with a FnMut, making Logger code simpler

The reason Visitor is a trait instead of just a FnMut is that it gives us a place to add provided methods like get and count, so Visitor::visit_pair is like Iterator::next. You only implement that one method but get all the other combinators on top. For your last example, you can already do this:

logger.log(&Record::builder()
           .whatever(...)
           .key_values(&[
               ("key1", Value::from(&myvar1)),
               ("key2", Value::from(&myvar2)),
               /// etc
           ])
           .build());

However right now to get a binary value out, the Visitor has to call various different to_* methods in turn on the Value to find what type it contains, so wouldn't it be better to turn that around and have several different methods on the Visitor which would be called according to the type of the value? e.g. visit_pair_u64, visit_pair_str, etc. The types handled by OpenTelemetry for example are limited to i64, bool, f64 and str, plus arrays and tables. So ignoring the nested values for the moment, and adding u64, that requires just five visitor methods to cover it.

The goal of the Value::to_* methods is for more ad-hoc inspection. If you're writing a log sink that wants to do something particular with a timestamp or trace identifier for instance you can check whether the key-value pairs you're given contains a value called timestamp, and then check whether that timestamp is a string or a chrono::DateTime<Utc>. They're really inconvenient for trying to serialize structured values directly. In order to visit all pairs and serialize them you use either the sval::Value or serde::Serialize implementations which have the API you're describing.

The kicker for the visitor API is the plus arrays and tables part. Once you have to support complex datastructures the API becomes a lot more complex and the open design space a lot bigger than just simple primitives. Not supporting arrays and maps isn't really an option because you certainly want to log them. So we offer two options in log for properly serializing values: there's sval for no-std serialization and serde for wider compatibility. Both can handle any shaped datastructures and require fairly large APIs to do so.

There is a lot of dynamic dispatch going on, and a lot of type machinery. Is all this necessary? If it is, then fine. However did you consider something like a HCons-style approach?

I'd be interested to see a HCons-style approach! My experience with H-lists in Rust has been that they're somewhat going against the grain of the language and so require a lot more type machinery, various hacks, and tend to produce large and incomprehensible error messages. One of the goals of log as well is to generate as little code in callsites as possible, ideally just a single function call. But that doesn't mean exploring different approaches isn't worthwhile! We settled on trait objects fairly early and never deeply considered trying to go the other way into fully generic code.

Dec 21 '20 01:12 KodrAus

@KodrAus Thanks for replying.

In my opinion, since 'log' is supposed to be a lightweight logging crate, the key-value logging mechanism should be lightweight too. It doesn't need to support everything -- a superset of JSON and OpenTelemetry style values would be enough. It should be streamlined for dumping the data to a stream, since that's the most common operation.

In case it's of interest: My original reason to look into this was to add minimal key-value logging support to my Stakker crate. In the end I implemented my own minimal logging interface directly in Stakker, with the logging macros in a separate crate: stakker_log. This was done because none of the crates out there did quite what I needed. Also because I needed to transparently pass through a span ID (which is effectively an actor number) with the log records, to support span-based logging. Also, since Stakker is single-threaded, it's more efficient to log within the same thread, rather than to something which then has to use a Mutex or whatever. The intention is that the user-provided logging handler could either dump out directly, or forward logging to whatever logging crate the user prefers. This does also handle nested values (arrays and maps), although in a very minimal visitor style: just a closure at the point of logging which dumps the data as calls to a visitor closure which outputs it.

Dec 21 '20 20:12 uazu

log log copied to clipboard

Tracking issue for kv_unstable (structured logging)

What about macros?

So what's left?

What's after stabilization?

log
log copied to clipboard