uuid
uuid copied to clipboard
Add `valuable` support for `Uuid`s
Motivation
The Tokio team is releasing a new crate, valuable
, which is intended to be used for passing in custom types as fields in logs. Since Uuid
s are commonly used in applications that also use tracing
like Kanidm, we should implement the Valuable
trait behind a feature gate.
Solution
Derive the Valuable
trait.
#[cfg_attr(feature = "valuable", derive(Valuable))]
struct Uuid(Bytes);
Alternatives
- Continue to record
Uuid
s as strings. - Create a wrapper type around the
Uuid
and rely on usingas_bytes
in the trait implementation. This would be incredibly annoying though, because it would mean applications would have to use:
let uuid = Uuid::new_v4();
tracing::info!(my_id = MyWrapper(uuid).as_value(), "the uuid");
Hopefully tracing
will add a sigil to reduce verbosity with as_value()
similar to how ?
and %
are used for Debug
and Display
formatting, but avoiding a wrapper type would be ideal.
Is it blocking? No
Anything else?
The valuable
crate is still in the very early stages, and could possibly have breaking changes. However, I think the very minimal API that this crate would depend on (just deriving the trait) is unlikely to change.
I've been expecting this for a while, so we've already got a bit of a plan in place 🙂 We've got zerocopy
as an example of bringing in unstable dependencies in the stable library. So we can do the same for uuid
.
As for representation, I think we'd probably be most useful implementing Valuable
manually with a text-based format (https://github.com/tokio-rs/valuable/issues/43#issuecomment-854111197), rather than deriving it and getting a list of raw bytes. The reason being users in that logging context can recognize a stringified UUID as being a UUID, but 16 bytes doesn't give any clues. What I'm not sure of is whether we can/should implement as_value
differently to visit
so that you get bytes from one and a string from the other:
impl valuable::Valuable for Uuid {
// If you ask for a value you get our byte representation
fn as_value(&self) -> valuable::Value {
valuable::Valuable::as_value(&self.0)
}
// If you visit you get a hyphenated string
fn visit(&self, v: &mut dyn valuable::Visit) {
v.visit_value(valuable::Value::Displayable(self.hyphenated()))
}
}
My gut feel on that though is "no that's a bad idea".
I think it would be better to visit Uuid
s as sequences of bytes, since the string representation is already possible in tracing
because any Debug
type can be passed in as dyn Debug
. However, there's no way to get the raw bytes. Visiting the bytes would allow crates like tracing-forest
to easily read in Uuid
s and pass them back to the application if desired without any extra overhead of formatting and parsing.
In terms of 16 bytes not representing the semantic meaning of a Uuid
: values passed to subscribers have an associated &'static str
name. If a subscriber is expecting a Uuid
, then the crate can specify which name to use. Otherwise, it can get passed to the dyn Debug
visitor.
So for the tracing-forest
use-case what you'd like to do is receive a value
, determine that it's a Uuid
by its associated name, and then use something like Uuid::from_slice
on the result of visit_primitive_slice
to capture that value until it's ready to be re-emitted? Relying on stringly-typed APIs to detect that you're looking at a UUID and converting it one way or another seems a little dubious to me, but it could be ok if that's the direction valuable
recommends.
I think part of the problem here may be that valuable
implies tracing
as a primary use-case, which in turn implies a preference for human-readable representations of your data, since you want it to be understood on the other end. If we were just considering valuable
support independently of the diagnostics usecase then using the raw 16-byte representation would be perfectly suitable. If tracing-forest
would need to match on Uuid
's associated name either way, then my preference would still be for using a text-based representation for UUIDs. That maximizes interoperability with tooling at the other end without forcing stringly-typed matches on all users, and the parser is not expensive. If you already need to potentially allocate maps and strings to cache values then parsing UUIDs isn't likely to have much extra impact.
I agree that stringly-typed APIs are kinda sus. However, the official blog post specifically uses an example to inspect an HTTP header type with very specific fields. Furthermore, valuable
allows for structs to have named fields, so we could make the Value
representation have the bytes as a named field for further checking. In terms of ergonomics, valuable
is intended to offer you a zero cost view into a type, and stringifying defeats this purpose: it should be up to the Subscriber
both a) if it even wants to format the Uuid
and b) which format to use.
Another detail to consider is that providing the bytes for a Valuable
implementation forces nothing new on tracing
users. It's just as easy to pass in the Uuid
as a dyn Debug
, which already does exactly what you're proposing and is stable.
Hmm, when would you expect an end-user of tracing
to capture a Uuid
using its Valuable
implementation over its Debug
? I was under the impression that tracing
may choose to replace their entire value API with valuable
in the future so if that’s the case then users won’t really have a choice to make between a human-readable Debug
and machine-readable Valuable
. Today I also wouldn’t expect anybody to make that choice anyway if the default compiles.
I think what’s missing from this discussion that’s going to keep us going in circles at the moment is more input on potential usecases for a valuable
implementation. It’s a brand new API so we don’t have access to a lot of that, but I’m just a bit reluctant to commit to a direction without a bigger picture to know what we gain or lose by opting for one representation over another.
Using tracing:
An idea I had for tracing-forest
was to allow for different contexts, so that disjoint concurrent tasks can map to entirely different contexts with their own log trees, without having to enumerate each contexts at compile time. This could use Uuid
s within the Subscriber
to map a log to its context, without ever emitting the ids of each context to the logs. If there were many logs, this would require many unseen and unnecessary formats.
Outside of tracing
To address your "going in circles" point: I think valuable
fills a very similar role to serde
, and will show up in similar places. Like valuable
, serde
also has an abstract introspection layer. This layer relies on a Serializer
to visit and translate it into a tangible format. On the other hand, valuable
is just the first half of this, an abstract introspection layer. Thus, the entire responsibility of the Serializer
is expected of the user of valuable::Value
s. Because of this, I claim that valuable
is not meant to serve as the end-of-the-line processing unit; it's an intermediate tool that is followed by more processing, just like how Serializer
s do more work on the data once it's recorded. This applies to tracing as well.
Uuid
's Serialize
implementation even makes the distinction that it is up to the Serializer
how the Uuid
should be serialized. The current Serialize
implementation looks like this:
if serializer.is_human_readable() {
serializer
.serialize_str(&self.to_hyphenated().encode_lower(&mut [0; 36]))
} else {
serializer.serialize_bytes(self.as_bytes())
}
Making the Valuable
trait return a string instead of the raw bytes would be analogous to implementing Serialize
like this (I didn't verify this compiles, but the idea is here):
// analogous to what's exposed by the `Valuable` impl
let value = self.to_hyphenated().encode_lower(&mut [0; 36]);
// analogous to what the downstream `valuable` user has to write
if serializer.is_human_readable() {
serializer
.serialize_str(value)
} else {
serializer.serialize_bytes(Uuid::parse_str(value).unwrap().as_bytes())
}
This example seems dumb, because all these decisions are already made known to us (we can call Serializer::is_human_readable
). However, valuable
basically gives us this introspection before any decisions are made, meaning that we should be as unassuming as possible, and therefore do the least amount of work possible.
Also, since this would be an unstable feature, wouldn't it still be okay to change the implementation down the line before it stabilizes if we decide we want to? If this is the case that I think we should try it this way and we can get feedback on it first.
Yeh, I can see the value of the role valuable
fills, and I think it's a great one. The issue we've got is that its introspection tools are great for compound values like maps, lists, and enums, but it doesn't give us much for primitive values like UUIDs, IP addresses, and such to describe their content in a way that can be recognized by a person and inspected programmatically. The Value::Path
variant seems to suggest a path where these types end up called out specially. I wouldn't want to suggest there should be a Value::Uuid
variant though. As much as I'd like to treat valuable
as an independent tool for what it is, I don't think we can really take a direction that penalizes the primary diagnostics use-case that tracing
suggests.
I was thinking we could just copy the approach used for IP addresses in UUID, but I see we haven't got them yet in valuable
. I think this is something we'll need some input from the valuable
maintainers on, because we might be one of the first libraries looking closely at how to implement it. I'll go and open an issue over in valuable
and cc you.
I've opened https://github.com/tokio-rs/valuable/issues/87 to get some input from the valuable
maintainers.
Coming back around to this, it looks like we don't have a clear best path forward, so as much as I'd like to keep the default implementation of valuable
useful for UUIDs in diagnostics I think the most direct representation of the data (that is, as bytes) is a reasonable starting point.
We can follow the same approach as zerocopy
and add an unstable valuable
feature to uuid
with an implementation.