log
log copied to clipboard
Tracking issue for kv_unstable (structured logging)
Tracking issue for the implementation of #296. See also #149
- [x] Initial implementation (#324)
- [ ] Integration with frameworks and loggers:
- [ ]
slog- [ ] Zero-allocation conversion between
slog::Valueandlog::kv::Value. We should be able to do zero-allocation conversion fromlogtoslog, but may need to allocate fromslogtolog.
- [ ] Zero-allocation conversion between
- [ ]
tracing(https://github.com/tokio-rs/tracing/issues/341)- [x] Zero-allocation conversion between
tracing::Valueandlog::kv::Valuein theevent!macro (#339) - [x] Capture
std::error::Errors asValues
- [x] Zero-allocation conversion between
- [ ]
fern - [ ]
async-log - [ ]
std-logger(https://github.com/Thomasdezeeuw/std-logger/pull/14)- [x] Some way to tell whether a
Sourcehas any key-value pairs (#329)
- [x] Some way to tell whether a
- [ ]
femme - [ ]
env_logger(https://github.com/sebasmagri/env_logger/pull/137) - [x]
antlog(an experimental wrapper to prove out requirements)- [x] A way to get specific values (#335)
- [x] A way to coerce a value to a common type (#379)
- Please suggest others to track!
- [ ]
- [x] Explore macro support (#343)
- [x]
log!("msg {}", interpolated, { key: value })(#353) - We ended up having to revert this inlog, but are experimenting externally - [x]
log_kv_macro - [x]
log_macros
- [x]
- [x] Integration with serialization frameworks:
- [x]
serde- [x] Conversion between
serde::Serializeandstd::fmt::Debug(https://github.com/rust-lang/rfcs/pull/2696). Seeserde_fmt.
- [x] Conversion between
- [x]
sval(#342)
- [x]
- [ ] Micro benchmarks
- [ ] Documentation and examples
- [ ] Stabilization review
- [ ] Open questions
- [ ] Should
Source::countreturn aResult<usize>instead of a plainusize?
- [ ] Should
@KodrAus thanks for all the work you've put into this (and linking to this issue from a forum post).
I was curious: is there a release planned that includes the beta version of kv? I actually wanted to use the beta for async-log, but given there wasn't any release yet it wasn't possible :sweat_smile:
Hi @yoshuawuyts! :wave:
Right now any release of log we publish will include this API under the kv_unstable feature gate, but we're not ready to commit to backwards compatibility just yet. So if you depend on log from git then you can access the API:
[dependencies.log]
git = "https://github.com/rust-lang-nursery/log.git"
features = ["kv_unstable"]
I think what we need are a few draft PRs in some of these log frameworks that tease out any breaking changes we should make, and then will be able to publish a release that replaces the kv_unstable feature gate with a kv one.
Right now any release of log we publish will include this API under the kv_unstable feature gate,
Oh oops, maybe I was being a bit vague. The last release on crates.io was done in October 2018, which means currently no releases ship with kv_unstable on crates.io. Which in turn (I believe) means we can't ship any versions to crates.io.
In order to help test this feature, it would be nice if there was a release on crates.io with the ["kv_unstable"] feature available. For experimental technology such as async-log I think we can get away with publishing a version that directly depends on this feature, and keep updating it as it's polished.
So my question is: could a version be of log be published to crates.io that includes the experimental flag?
P.S. Thanks for taking time out of your day to reply to my questions!
In order to help test this feature, it would be nice if there was a release on crates.io with the ["kv_unstable"] feature available. For experimental technology such as async-log I think we can get away with publishing a version that directly depends on this feature, and keep updating it as it's polished.
Ah this sounds fair enough. I think we can get pretty far without needing to break anything, but it's probably going to happen at some point. But right now it is harder for folks to get started with these new APIs, so once #339 lands I'll put together a release so you can depend on it from crates.io and browse the docs on docs.rs.
@KodrAus in the top-level post there's an entry about "Explore macro support". Do you have any plans to experiment with this?
We've integrated key-value logging in https://github.com/lrlna/femme/pull/1; all that's missing now is a way to create key-value pairs and send them to a logger. Even a small code example would be helpful, but ideally there'd be a small log crate that could work with this.
Also: is it perhaps useful to open a separate issue to track/discuss logging macros?
edit 2019-07-21: Ah, I think I figured out how to log a kv-pair!
let record = log::Record::builder().key_values(kv_pairs).build();
log::logger().log(&record);
Ah yes that Record::key_values method is the way you can attach key-value pairs to a record in the absence of macro support. It needs to be a &dyn Source.
Opening a separate issue to design macro support sounds like a good idea! We’ve got a couple of constraints but also plenty of opportunity to improve things.
Macro issue created: https://github.com/rust-lang-nursery/log/issues/343
Created a crate for shared types in key-value logs (to mark pairs of logs) -- https://github.com/yoshuawuyts/log-types. Not sure yet how useful this is, but I think it might be a fun reference of how to create source/key/value structs, and probably a useful experiment (:
Having some standards around well-known key-value pairs is something I've also been exploring in a little wrapper. It seems like it'll be useful for a more composable logging ecosystem.
@KodrAus Yeah for sure! -- I wonder which others to add; I only really thought of spans::{start,end} and http::{request,response}. Though I guess timestamps, host, ip might be common too?
Oh also, I saw femme is not listed yet; could we perhaps add it to the tracking issue?
I'm hesitant to add to the list of todos because I really want this feature, but it might be worth adding fern as another crate to integrate with. I don't use it (I use slog), but it seems like it has quite a lot of usage (600,000+ all time downloads on crates.io).
Should
Source::countreturn aResult<usize>instead of a plainusize? What if we move theSource::countto a separate trait:
trait SourceCount: Source {
fn count(&self) -> usize;
}
Rationale is it looks more reasonable to just make it a type-system question whether a certain type has a notion of a count of it's kvs or not.
And if that's not the desired use case for making count return Result<usize> - then what is? What do you see as possible error values? Why not just Option<usize>? Some lazy-evaluated kv sources might not have count known upfront, i.e. something like processes list (ps ax) might be collected on demand.
What if we move the Source::count to a separate trait
@MOZGIII unfortunately a separate trait wouldn't really work because we'd lose the implementation when erasing as dyn Source.
What do you see as possible error values?
The desire for Result would just be to be able to pass through any errors from the default implementation that calls the fallible visit_pair method. But Option seems worthwhile!
Some lazy-evaluated kv sources might not have count known upfront
Hmm, I don't think the current implementation really supports this kind of scenario because pairs can be visited and fetched at any point. I also don't think it's really desirable to support either, because pairs may be inspected and visited multiple times any expensive work or externally changing information could be confusing to end consumers.
Given all that, I'm having hard time imagining a case when the count would be None (or Err). Do you have an example in mind?
How is the count used? Is it like iterator's size_hint?
Hmm, I don't think the current implementation really supports this kind of scenario because pairs can be visited and fetched at any point. I also don't think it's really desirable to support either, because pairs may be inspected and visited multiple times any expensive work or externally changing information could be confusing to end consumers.
From how the code is set up, it looks fine: the Source can be used at any time to collect values via Source::visit. There is an implementation note on visit fn A source should yield the same key-value pairs to a subsequent visitor unless that visitor itself fails. that advocates against sources being dynamic. After looking around the documentation more, I found the uses for Source, and it makes a lot of sense for it to be required to not be dynamic. :+1:
I've spent a little time looking at how we could support attempting to coerce structured values to concrete types. This is something that would make logging frameworks that want to enrich log::Records with their own well-known types nicer. There's a PR in #379
Once we have that and #378, we can add support for std::error::Error to structured values too. It might look something like:
impl<'v> Value<'v> {
pub fn from_error<E>(err: &'v E) -> Self
where
E: Error + 'static;
pub fn get_error(&self) -> Option<&(dyn Error + 'static)>;
}
I see there's a 'kv_unstable' feature flag in the latest version of the log crate but it's unclear how to interface with in from an application. Is this not something we can do today?
I found this commit which seems like it reverted an interface users could leverage is https://docs.rs/kv-log-macro the crate users should be guided towards?
@softprops yeah, we're still in the process of figuring out the macros. Currently kv-log-macro is one of the most convenient ways to use the kv feature, but the goal is to eventually have this functionality in log's macros itself.
Have we figured out other ways to preserve states? The current log!("", { key: value }) will be quite duplicated when the same { key: value } needs to be reused in different scopes.
@yoshuawuyts and I discussed the path to stabilization for the kv support and came to the conclusion that there's really not much blocking calling the kv module stable and continuing to work on it without breakage. I think we've explored enough of the design space to cover macro support and integration with tracing properly.
What about macros?
We don't currently have macro support in log itself, but since we're bound to std::fmt the macros we can provide in here in a backwards compatible way can't be as capable as what we produce externally. We will still provide macro support in log itself, but that doesn't need to block removing the kv_unstable feature gate.
Right now, if you want macro support you can check out kv_log_macro. It's an almost drop-in replacement for the existing macros.
I'm also experimenting with something completely greenfield for macros that I hope to talk about more when it's fleshed out.
So what's left?
I'd like to get #396 merged, which lets us capture values using a trait bound like impl Display + 'static but still determine at compile-time whether the actual value is one of a number of primitives like i32. Until 1.46.0 is released in August it'll have a small runtime hit though. I think that's the last breaking behavioural change we need to make then can do a thorough API review.
What's after stabilization?
@yoshuawuyts brought up the possibility of adding logging/tracing support into std, which I think is worth exploring. I've been framing it as a structured/contextual enhancement to std::fmt but going down that route might be the best possible way to make the existing log macros natively structured.
What's the story for the kv visitor trait and serde. It seems today the visit api gives you a access to a Key and Value. You can get the &str value of a Key but values are a bit opaque. I know that they implement Display and Debug but i'd rather depend on their serde::Serialize impl when serializing them out.
Hey @softprops 👋
It would be pretty unfortunate for structured values to only support the unstructured Debug or Display outputs. I wrote a bit about serde in https://github.com/rust-lang/log/issues/388#issuecomment-629572564.
The gist of it is we currently support serde in a roundabout way using a different structured framework, but should definitely add first-class serde support using erased-serde (we could fall back to the alternative framework when alloc isn’t available). There’s nothing blocking this and it can be implemented at any time.
@KodrAus that's awesome! Is anyone actively working on this? I'd love to help but am not caught up on all the current and prior discussion.
@softprops As far as I know it’s not something being actively worked on yet. If you’d like to take a look that would be great! Otherwise it was next on my list after #396, since that changes the shape of the value internals a bit.
To add serde support we’ll effectively add the following public API:
impl<‘v> Value<‘v> {
pub fn from_serde(value: &’v (impl serde::Serialize + ‘static)) -> Self;
}
impl<‘v> serde::Serialize for Value<‘v> {}
The things we’d need to do internally would be:
- Add optional dependencies on
erased-serdeandserde-fmt - Add a
kv_unstable_serdefeature that pulls in thestdfeature,sval/serde,serde-fmtandserde - Add a new
kv::value::internal::serdemodule, probably following a similar structure tointernal::sval - When both
kv_unstable_svalandkv_unstable_serdeare available, we can usesval‘sserdebridge to implement theVisitor::svalmethod in theserdemodule and theVisitor::serdemethod in thesvalmodule - Use
serde-fmtto implement theVisitor::debugandVisitor::displaymethods in theserdemodule - Add a new
Serdevariant tointernal::Inner - Add a new
serdemethod tointernal::Visitor - Add
kv_unstable_serdeto our GitHub Actions on its own and combined withkv_unstable_sval
But if anybody waned to dig in I’d be happy to help! The best place to reach out would probably be on Zulip.
This sounds great. I'll be sure to drop a note here if I start anything.
I had some issues getting 0.4.11 structured logging working. I've written my own logging macros because I have some specific requirements, and I'm also writing my own Logger to receive all this data because the outgoing logging channels I have to interface to are custom. I chose to put it all through log crate because I also want to capture unstructured logging. Some observations/questions:
- If the Visitor only has one method, it would be better to just replace it with a FnMut, making Logger code simpler
- However right now to get a binary value out, the Visitor has to call various different
to_*methods in turn on the Value to find what type it contains, so wouldn't it be better to turn that around and have several different methods on the Visitor which would be called according to the type of the value? e.g. visit_pair_u64, visit_pair_str, etc. The types handled by OpenTelemetry for example are limited toi64,bool,f64andstr, plus arrays and tables. So ignoring the nested values for the moment, and addingu64, that requires just five visitor methods to cover it. - It might be good to point people to the fact that
ValueimplementsDisplay, because for text-log output, I started out by querying the differentto_*methods and formatting for each one. However this didn't work because I couldn't find a way to get a value created withValue::from_displayout using theto_*methods. So maybe it would be best to have twovisitcalls, one for text-style output, which just takes a FnMut and gives it a&strfor each value (which would point to either the original value or else an internal temporary format buffer), and another for getting binary data out, which has a Visitor with multiple entry-points (for i64/u64/f64/bool/str/etc). I haven't implemented binary output yet, but it seems like right now I'll have to try variousto_*methods in turn, and then finally try aformat!("{}", val). This doesn't seem very efficient. (The log crate is advertised as "lightweight".)
I was trying to understand the current code. There is a lot of dynamic dispatch going on, and a lot of type machinery. Is all this necessary? If it is, then fine. However did you consider something like a HCons-style approach? i.e. construct a nested type which would result in all the heterogenous keys/values being stored sequentially in stack memory with minimal padding (e.g. KvInt(("key", 123, KvStr(("key2", &strvar, KvEnd))))), and let the compiler build a single visit function for it all that accepts a Visitor and visits them all (where the Visitor has an entry-point for each fundamental supported type). That would make good use of static knowledge from the callsite to optimise at least the first level of processing. (Nested arrays or tables would need some kind of dynamic approach.) I can provide some prototype code if this isn't clear.
Actually, thinking about it some more, would it be possible to make it work with a closure like this:
logger.log(&Record::builder()
.whatever(...)
.key_values(|visitor| {
Value::from(&myvar1).visit(visitor, "key1");
Value::from(&myvar2).visit(visitor, "key2");
/// etc
})
.build());
That way, assuming all the Value-related calls are marked #[inline] it should all get inlined to produce flat code that just executes a series of calls on the visitor instance. This assumes that the Visitor has separate entry points for major types (u64, i64, f64, bool, str). The closure will build a hidden structure on the stack which contains the references to all the variables being logged. So this should be really efficient and lightweight. I could prototype this if it is of any interest.
Hey @uazu! :wave:
Sorry for taking so long to circle back and thanks for putting together your thoughts 😃
If the Visitor only has one method, it would be better to just replace it with a FnMut, making Logger code simpler
The reason Visitor is a trait instead of just a FnMut is that it gives us a place to add provided methods like get and count, so Visitor::visit_pair is like Iterator::next. You only implement that one method but get all the other combinators on top. For your last example, you can already do this:
logger.log(&Record::builder()
.whatever(...)
.key_values(&[
("key1", Value::from(&myvar1)),
("key2", Value::from(&myvar2)),
/// etc
])
.build());
However right now to get a binary value out, the Visitor has to call various different to_* methods in turn on the Value to find what type it contains, so wouldn't it be better to turn that around and have several different methods on the Visitor which would be called according to the type of the value? e.g. visit_pair_u64, visit_pair_str, etc. The types handled by OpenTelemetry for example are limited to i64, bool, f64 and str, plus arrays and tables. So ignoring the nested values for the moment, and adding u64, that requires just five visitor methods to cover it.
The goal of the Value::to_* methods is for more ad-hoc inspection. If you're writing a log sink that wants to do something particular with a timestamp or trace identifier for instance you can check whether the key-value pairs you're given contains a value called timestamp, and then check whether that timestamp is a string or a chrono::DateTime<Utc>. They're really inconvenient for trying to serialize structured values directly. In order to visit all pairs and serialize them you use either the sval::Value or serde::Serialize implementations which have the API you're describing.
The kicker for the visitor API is the plus arrays and tables part. Once you have to support complex datastructures the API becomes a lot more complex and the open design space a lot bigger than just simple primitives. Not supporting arrays and maps isn't really an option because you certainly want to log them. So we offer two options in log for properly serializing values: there's sval for no-std serialization and serde for wider compatibility. Both can handle any shaped datastructures and require fairly large APIs to do so.
There is a lot of dynamic dispatch going on, and a lot of type machinery. Is all this necessary? If it is, then fine. However did you consider something like a HCons-style approach?
I'd be interested to see a HCons-style approach! My experience with H-lists in Rust has been that they're somewhat going against the grain of the language and so require a lot more type machinery, various hacks, and tend to produce large and incomprehensible error messages. One of the goals of log as well is to generate as little code in callsites as possible, ideally just a single function call. But that doesn't mean exploring different approaches isn't worthwhile! We settled on trait objects fairly early and never deeply considered trying to go the other way into fully generic code.
@KodrAus Thanks for replying.
In my opinion, since 'log' is supposed to be a lightweight logging crate, the key-value logging mechanism should be lightweight too. It doesn't need to support everything -- a superset of JSON and OpenTelemetry style values would be enough. It should be streamlined for dumping the data to a stream, since that's the most common operation.
In case it's of interest: My original reason to look into this was to add minimal key-value logging support to my Stakker crate. In the end I implemented my own minimal logging interface directly in Stakker, with the logging macros in a separate crate: stakker_log. This was done because none of the crates out there did quite what I needed. Also because I needed to transparently pass through a span ID (which is effectively an actor number) with the log records, to support span-based logging. Also, since Stakker is single-threaded, it's more efficient to log within the same thread, rather than to something which then has to use a Mutex or whatever. The intention is that the user-provided logging handler could either dump out directly, or forward logging to whatever logging crate the user prefers. This does also handle nested values (arrays and maps), although in a very minimal visitor style: just a closure at the point of logging which dumps the data as calls to a visitor closure which outputs it.