serde
Using de/serialize_with inside of an Option, Map, Vec
These would be equivalent to Jackson's @JsonSerialize(keyUsing=...) etc. Now that we have stateful deserialization, this can be implemented in a pretty straightforward way.
I believe we'd want to support map keys and values as well as seq and option values.
Ideally I would like to find an approach that composes better than keyUsing.
#[derive(Deserialize)]
struct S {
#[serde(deserialize_with = "my_key")]
key: K,
#[serde(deserialize_with = "my_value")]
value: V,
#[serde(???)]
opt_map: Option<Map<K, V>>,
}
#576 has an approach based on a helper for generating ordinary deserialize_with functions, rather than using a slate of new attributes.
This could be neat:
#[derive(Deserialize)]
struct S {
#[serde(deserialize_with = "my_key")]
key: K,
#[serde(deserialize_with = "my_value")]
value: V,
#[serde(deserialize_with = "my_opt_map")]
opt_map: Option<Map<K, V>>,
}
fn my_map<'de, D>(deserializer: D) -> Result<Map<K, V>, D::Error>
where D: Deserializer<'de>
{
deserialize_map_with!(my_key, my_value)(deserializer)
}
fn my_opt_map<'de, D>(deserializer: D) -> Result<Option<Map<K, V>>, D::Error>
where D: Deserializer<'de>
{
deserialize_option_with!(my_map)(deserializer)
}
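For illustration, the helper-based composition above can be modeled without macros as ordinary higher-order functions. This is only a sketch under loud assumptions: `Value`, `Res`, `option_with`, and `map_with` are hypothetical names, the toy `Value` enum stands in for a real serde `Deserializer`, and none of this is what #576 actually implements.

```rust
use std::collections::BTreeMap;

// Toy input format standing in for a serde Deserializer (assumption).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Str(String),
    Map(Vec<(Value, Value)>),
}

type Res<T> = Result<T, String>;

// Lifts a function for `T` into a function for `Option<T>`.
fn option_with<T>(inner: impl Fn(&Value) -> Res<T>) -> impl Fn(&Value) -> Res<Option<T>> {
    move |v: &Value| match v {
        Value::Null => Ok(None),
        other => inner(other).map(Some),
    }
}

// Lifts a key function and a value function into a function for a map.
fn map_with<K: Ord, V>(
    key: impl Fn(&Value) -> Res<K>,
    value: impl Fn(&Value) -> Res<V>,
) -> impl Fn(&Value) -> Res<BTreeMap<K, V>> {
    move |v: &Value| match v {
        Value::Map(entries) => {
            let mut out = BTreeMap::new();
            for (k, v) in entries {
                out.insert(key(k)?, value(v)?);
            }
            Ok(out)
        }
        other => Err(format!("expected a map, found {:?}", other)),
    }
}

// Custom key handling: keys arrive as strings and are parsed as numbers.
fn my_key(v: &Value) -> Res<u32> {
    match v {
        Value::Str(s) => s.parse::<u32>().map_err(|e| format!("bad key {:?}: {}", s, e)),
        other => Err(format!("expected a string key, found {:?}", other)),
    }
}

// Custom value handling: values are upper-cased on the way in.
fn my_value(v: &Value) -> Res<String> {
    match v {
        Value::Str(s) => Ok(s.to_uppercase()),
        other => Err(format!("expected a string value, found {:?}", other)),
    }
}

// The analogue of `my_opt_map`: the pieces nest in the same shape as the type.
fn demo() -> Res<Option<BTreeMap<u32, String>>> {
    let input = Value::Map(vec![(Value::Str("7".into()), Value::Str("seven".into()))]);
    option_with(map_with(my_key, my_value))(&input)
}
```

The point of the sketch is that each wrapper only needs to know about its own layer, so `Option<Map<K, V>>` needs no dedicated function beyond the nesting itself.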
Another possible composable approach:
#[derive(Deserialize)]
struct S {
#[serde(deserialize_with = "my_k")]
k: K,
#[serde(deserialize_with = "option!(my_k)")]
opt: Option<K>,
#[serde(deserialize_with = "option!(map!(my_k, my_v))")]
opt_map: Option<Map<K, V>>,
#[serde(deserialize_with = "map!(_, my_v)")]
map: Map<u8, V>,
}
This needs to support all the wrapper types too: Rc, Arc, Cell, RefCell, Mutex, RwLock.
Would a syntax like this also want to support custom key/values within a custom map deserializer?
Like
#[derive(Deserialize)]
struct S {
#[serde(deserialize_with = "my_k")]
k: K,
#[serde(deserialize_with = "option!(my_k)")]
opt: Option<K>,
#[serde(deserialize_with = "my_map")]
map: Map<u8, u8>,
#[serde(deserialize_with = "option!(map!(my_k, my_v))")]
opt_map: Option<Map<K, V>>,
#[serde(deserialize_with = "map!(_, my_v)")]
map: Map<u8, V>,
#[serde(deserialize_with = "map::<my_map>!(my_k, my_v)")]
map: Map<K, V>,
}
or would this not be feasible at all with the current Deserializer architecture?
Yes, probably by means of a trait that is implemented for all types that support map!, a different one that is implemented for types that support option!, etc.
hm, ok. Would that mean the my_map deserializer would have to return a type which implements some trait then?
I mean deserialize_with is often used with types from other libraries, where implementing a trait on the type isn't possible - just trying to hash out the problem here.
For a syntax alternative, what would you think of something like "inner" attribute, like this?
#[derive(Deserialize)]
struct S {
#[serde(deserialize_with = "my_k")]
k: K,
#[serde(inner(K, deserialize_with = "my_k"))]
opt: Option<K>,
#[serde(deserialize_with = "my_map")]
map: Map<u8, u8>,
#[serde(inner(K, deserialize_with = "my_k"))]
#[serde(inner(V, deserialize_with = "my_v"))]
opt_map: Option<Map<K, V>>,
#[serde(inner(V, deserialize_with = "my_v"))]
map: Map<u8, V>,
#[serde(deserialize_with = "my_map")]
#[serde(inner(K, deserialize_with = "my_k"))]
#[serde(inner(V, deserialize_with = "my_v"))]
map: Map<K, V>,
}
If this kind of syntax would be allowed in attributes, and if it relatively matched what we can feasibly make happen, it would also provide an intuitive way to include #914:
#[derive(Deserialize)]
struct S {
#[serde(inner(Cow<'a, str>, borrow))]
cow: Vec<Cow<'a, str>>,
}
Edit: again, thinking entirely "ideal situation" here, but could we have this attribute support literally all field attributes by having the Deserialize impl create a newtype for each #[inner] clause with the inner attributes?
If I understand the situation correctly, having it create a newtype would let any inner attributes apply to the newtype's single field, and the changes in deserialization could be fully conveyed statically with no cost.
Thoughts?
Or... any obvious contradictions which I completely overlooked which would invalidate this?
Sorry, I think I was on completely the wrong track there. My apologies for not researching how all of this actually works and thinking about it before commenting!
I think I can agree now that a trait is probably the best way to do this.
I'm pretty new to Rust and Serde, but since you asked (in e.g. #999 and #1005) for people's thoughts on a design, here's how I would like this to work as a user:
mod remote { // the remote crate
struct Foo { // the remote struct I'm using
...
}
}
////////////////
mod my { // my crate
#[derive(Serialize, Deserialize)]
struct FooDef { // a struct with identical fields to Foo
...
}
// A macro invocation that emits some declarations that make
// it so that, within this module, _every_ use of Foo is serialized
// as if it had been annotated with #[serde(with = "FooDef")]
serde_serialize_as!(Foo, FooDef);
#[derive(Serialize, Deserialize)]
struct MyStruct {
field1: Foo, // works, no additional annotation necessary
field2: Option<Foo>, // also works
field3: HashMap<Foo, String> // also works
}
}
I don't know enough about Rust macros and Serde to say whether this is actually implementable.
I guess no design has been decided yet? I have a struct that has an Option<toml::value::Datetime> but I'd like to store it as a String instead since I don't want to care about TOML once it's loaded.
fn from_toml_datetime<'de, D>(deserializer: D) -> StdResult<Option<String>, D::Error>
where
D: Deserializer<'de>,
{
toml::value::Datetime::deserialize(deserializer)
.map(|s| Some(s.to_string()))
}
was my attempt at #[serde(deserialize_with = "from_toml_datetime")]
The above was my intuition so I would vote for something like that, if it's doable.
Is there a way to do that at all currently?
@Keats that looks like it should work correctly. Is a field marked with #[serde(default, deserialize_with = "from_toml_datetime")] not working?
Edit: just realized you included the attribute. I would recommend also adding #[serde(default)] to handle the case where the field is absent. For Option fields this is normally implied, but not when a custom deserialize_with function is used.
I was missing the default, that did the trick. Thanks!
What does anyone think of including non-(de)serialize_with attributes in this issue?
I'm currently running into a situation where it would make sense to use a HashMap<i64, Cow<'a, [u8]>> with all inner Cows borrowed. However, #[serde(borrow)] only affects the outermost type, just like (de)serialize_with.
Is there any way to work around it? I have a struct like:
struct Snippet {
annotations: Vec<Annotation>
}
#[derive(Deserialize)]
#[serde(remote = "Snippet")]
struct SnippetDef {
#[serde(with = "AnnotationDef")]
annotations: Vec<Annotation>
}
and that of course doesn't work. How should I solve it until this is fixed?
@zbraniecki
My understanding is that you'd work around it by making a method, roughly:
use std::{cmp, fmt};
use serde::de::{Deserializer, SeqAccess, Visitor};
fn deserialize_annotation_vec<'de, D>(deserializer: D) -> Result<Vec<Annotation>, D::Error>
where
D: Deserializer<'de>,
{
struct AnnotationVecVisitor;
impl<'de> Visitor<'de> for AnnotationVecVisitor {
type Value = Vec<Annotation>;
fn expecting(&self, f: &mut fmt::Formatter) -> fmt::Result { write!(f, "a list of annotations") }
fn visit_seq<A: SeqAccess<'de>>(self, mut seq: A) -> Result<Vec<Annotation>, A::Error> {
// Cap the preallocation so a bogus size hint cannot exhaust memory.
let mut vec = Vec::with_capacity(cmp::min(seq.size_hint().unwrap_or(0), 4096));
// Assumes AnnotationDef implements Deserialize and Into<Annotation>.
while let Some(def) = seq.next_element::<AnnotationDef>()? {
vec.push(def.into());
}
Ok(vec)
}
}
deserializer.deserialize_seq(AnnotationVecVisitor)
}
Then you'd use #[serde(deserialize_with = "deserialize_annotation_vec")] on the field to use this implementation.
Sources:
- deserialize method: https://serde.rs/custom-date-format.html
- visitor: https://serde.rs/impl-deserialize.html
- Vec visitor impl (for body of visit_seq): https://github.com/serde-rs/serde/blob/master/serde/src/de/impls.rs#L625
I would implement it as:
use serde::{Deserialize, Deserializer};
#[derive(Deserialize)]
#[serde(remote = "Snippet")]
struct SnippetDef {
#[serde(deserialize_with = "vec_annotation")]
annotations: Vec<Annotation>,
}
fn vec_annotation<'de, D>(deserializer: D) -> Result<Vec<Annotation>, D::Error>
where
D: Deserializer<'de>,
{
#[derive(Deserialize)]
struct Wrapper(#[serde(with = "AnnotationDef")] Annotation);
let v = Vec::deserialize(deserializer)?;
Ok(v.into_iter().map(|Wrapper(a)| a).collect())
}
Thank you! That works great!
From upcoming Rust 1.28 release notes:
Attributes on generic parameters such as types and lifetimes are now stable. e.g.
fn foo<#[lifetime_attr] 'a, #[type_attr] T: 'a>() {}
I suppose this should make it possible for serde_derive to fix this issue now?
I believe those were stabilized already in 1.27.0. I don't see how it would apply to this issue though.
Hmm, you're right, I misunderstood what it does. Allowing them on generic params doesn't mean getting attributes for these params from instantiation sites in a generic one.
Still no solution or workaround for the Option<> case??
I'm doing this because I have a lot of Vec<u8>s that are encoded as hex. So I use deserialize_with = "deserialize_hex" with a custom hex method. However, now I have Option<Vec<Vec<u8>>> and I lost it all. I made a new deserialize_hex_array which can decode Vec<Vec<u8>>, but with the option it stops working. I tried having deserialize_hex_array return the Option<Vec<Vec<u8>>>, but still it didn't seem to work. I still got "missing field".
@stevenroose I know this is another workaround, but still. For missing fields like that I've found using #[serde(default)] along with the "option" deserialize_with works well.
I have a ton of fields that are #[serde(default, deserialize_with = "option_timestamp")].
I invested some time yesterday to tackle this and maybe I am onto something. Potentially, this is my first contribution to the Rust community... I am excited!
TL;DR
I have a proof-of-concept implementation here: Gist (do not be frightened by the amount of code). The procedural macro has not been altered to support as/serialize_as/deserialize_as attributes, but this is the easy part, I think.
The idea I have may enable us to write:
-
#[derive(Serialize, Deserialize)]
struct SomeTime {
#[serde(as = "Option<chrono::DateTime<chrono::Utc>>")]
stamp: Option<chrono::NaiveDateTime>,
}
The above example is useful when NaiveDateTime (UTC) is used internally and the conversion to and from ISO 8601 is needed only on the API side.
-
#[derive(Serialize, Deserialize)]
struct SomeTime {
#[serde(as = "Vec<Hex>")]
bytes: Vec<Vec<u8>>,
}
Note how type nesting plays nicely here.
-
Any container can be supported:
#[derive(Serialize, Deserialize)]
struct SomeTime {
#[serde(as = "HashMap<i32, Option<chrono::DateTime<chrono::Local>>>")]
map: HashMap<i32, Option<chrono::NaiveDateTime>>,
}
As you can see:
- the syntax is easy
- the intent is very clear.
- the composition is great. You can nest types however you want.
Details
Often one wants to serialize a struct field using a specific format.
Changing the type of the field is often undesirable, because it requires more code to pack the new type and to unpack the original type from it. And serde's job is nothing but conversions. The sad story is that serde doesn't help us with this specific conversion.
As many have already found, serialize_with and deserialize_with have shortcomings.
They do not compose well, and we quickly run into an explosion of conversion functions, one for each variant: Option<T>, Vec<Option<T>>, HashMap<K, Option<V>> and so on.
With Deserialize and Serialize traits, we do not have the ability to encode how nested types should be serialized.
What I propose is to add new traits with such an ability:
/// Deserialize `T` as if it were of type `Self`.
pub trait DeserializeAs<'de, T>: Sized {
fn deserialize_as<D>(deserializer: D) -> Result<T, D::Error>
where
D: Deserializer<'de>;
// TODO: deserialize_as_into
}
/// Serialize `T` as if it were of type `Self`.
pub trait SerializeAs<T> {
fn serialize_as<S>(source: &T, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer;
}
This can be implemented between any two types we want, similar to the From trait.
In the Gist you can find some implementations, implementations for standard collections are also included.
The good part is any crate can implement this trait for:
- non-standard collections/containers
- usual conversions (e.g. chrono::NaiveDateTime -> chrono::DateTime, Vec<u8> -> Hex, Vec<u8> -> Base64)
By doing this we can share more code within community.
The traits are implemented for the more specific type, not the other way around. Otherwise one would not be able to implement a conversion for one's own type due to coherence rules:
impl SerializeAs<Hex> for String { .. } // would not compile
impl SerializeAs<String> for Hex { .. }
The above is compatible with the order of From trait.
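To make the composition argument concrete, here is a dependency-free model of the idea. It is a sketch, not the Gist's code: the trait here returns a `String` instead of driving a serde `Serializer`, and `Hex` and `render` are hypothetical names chosen for the example.

```rust
/// Toy stand-in for the proposed trait (assumption): render `T` the way
/// `Self` prescribes, producing a String rather than using a Serializer.
trait SerializeAs<T> {
    fn serialize_as(source: &T) -> String;
}

/// Marker type: a byte vector written as a quoted lowercase hex string.
struct Hex;

impl SerializeAs<Vec<u8>> for Hex {
    fn serialize_as(source: &Vec<u8>) -> String {
        let digits: String = source.iter().map(|b| format!("{:02x}", b)).collect();
        format!("\"{}\"", digits)
    }
}

/// `Vec<U>` describes how to write a `Vec<T>` whenever `U` describes `T`.
impl<T, U: SerializeAs<T>> SerializeAs<Vec<T>> for Vec<U> {
    fn serialize_as(source: &Vec<T>) -> String {
        let items: Vec<String> = source.iter().map(|item| U::serialize_as(item)).collect();
        format!("[{}]", items.join(","))
    }
}

/// `Option<U>` describes how to write an `Option<T>` whenever `U` describes `T`.
impl<T, U: SerializeAs<T>> SerializeAs<Option<T>> for Option<U> {
    fn serialize_as(source: &Option<T>) -> String {
        match source {
            Some(value) => U::serialize_as(value),
            None => "null".to_string(),
        }
    }
}

/// Helper so the "as" type can be named with a turbofish at the call site.
fn render<T, U: SerializeAs<T>>(value: &T) -> String {
    U::serialize_as(value)
}
```

With these three impls, `render::<_, Option<Vec<Hex>>>(&value)` handles an `Option<Vec<Vec<u8>>>` without any function written specifically for that combination, which is the property that serialize_with lacks.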
This is a big change and I am not sure whether this is the preferred way of implementing this feature. If it is, I definitely want to take this forward. But I will need help and some guidance.
I thought about externalizing this change to a serde-as crate, but I think this would reduce the adoption.
SameAs
There is one caveat, though.
If a leaf type does not change, the code does not compile.
The example with HashMap is an instance of this problem, where i32 key does not change.
The problem arises, because we can't implement:
impl<T: Serialize> SerializeAs<T> for T { .. }
We can't implement it, because the compiler cannot prove the implementations are not overlapping.
That's why I introduced SameAs structure, as a workaround.
There are several solutions:
- resolve https://github.com/rust-lang/rfcs/issues/1834
- implement the traits for all types manually
- the most viable solution right now:
  - move SameAs to a private module
  - detect equivalence of leaf types in the procedural macro (is it even possible to parse generic types?). I know this will easily break, as we cannot assume String is equal to std::string::String, but the chance one would write different types is low.
- make SameAs truly public and force everyone to write serde(as = HashMap<SameAs<i32>, ..>)
This looks promising! I like how well it composes for nested types.
I don't necessarily want to start by landing new traits and derive attributes in serde immediately. It would be better if all of this could be provided in a separate crate for now and we can iterate on the design.
For the attributes maybe you can rely on with for now:
use ???::As;
...
#[serde(with = "As<Vec<Hex>>")]
where As is a generic type that has serialize/deserialize methods as expected by with defined in terms of SerializeAs/DeserializeAs.
If someone already has a deserialize_with function (such as this / this, drawing arbitrarily from GitHub search results), could you show what they would need to write to make it work within this approach?
The As struct is a great idea. I've tested it and there is only one little quirk.
It will be written as #[serde(with = "As::<Vec::<Hex>>")], because serde expects a path in this position.
I am going to write a new crate, yes!
The referenced stringly_array_spaces function could be rewritten without the use of Visitor. So let me do it first:
pub fn stringly_array_spaces<'de, D>(deserializer: D) -> Result<Vec<String>, D::Error>
where
D: Deserializer<'de>,
{
let s: &str = Deserialize::deserialize(deserializer)?;
Ok(s.split_whitespace().map(|x| x.to_owned()).collect())
}
Translating this directly as it is to DeserializeAs would give us:
struct SpaceSeparatedStrings;
impl<'de> DeserializeAs<'de, Vec<String>> for SpaceSeparatedStrings {
fn deserialize_as<D>(deserializer: D) -> Result<Vec<String>, D::Error>
where
D: Deserializer<'de>,
{
let s: &str = Deserialize::deserialize(deserializer)?;
Ok(s.split_whitespace().map(|x| x.to_owned()).collect())
}
}
This only gives us the ability to nest Vec<String> inside containers, e.g. Option.
I would like to generalize this to take any parsable type (a type that implements FromStr).
struct SpaceSeparated<T>(PhantomData<T>);
impl<'de, T> DeserializeAs<'de, Vec<T>> for SpaceSeparated<T>
where
T: std::str::FromStr,
T::Err: fmt::Display,
{
fn deserialize_as<D>(deserializer: D) -> Result<Vec<T>, D::Error>
where
D: Deserializer<'de>,
{
let s: &str = Deserialize::deserialize(deserializer)?;
s.split_whitespace()
.map(|x| T::from_str(x).map_err(<D::Error as serde::de::Error>::custom))
.collect()
}
}
But what I'm really missing here is a kind of FromStrAs to be able to fully utilize this idea.
Because all we have now is the ability for outer nesting (inside other containers), but there is no way to redefine the inner type, e.g.:
struct MyStruct {
#[serde(as = "SpaceSeparated<Hex>")]
member: Vec<Vec<u8>>,
}
As a consequence, SpaceSeparated, as it stands now, doesn't even need to be parametrized by T.
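One possible shape for the missing piece, sketched in a string-only model: a hypothetical `FromStrAs` trait that plays the role of FromStr but lets a marker type decide how each token is parsed. The trait name comes from the wish above, but its signature here is an assumption, and a real version would have to work through a serde `Deserializer` rather than plain `&str` tokens.

```rust
use std::marker::PhantomData;

/// Hypothetical trait (assumption): parse a token into `T` the way `Self`
/// prescribes, analogous to `FromStr` but selectable via a marker type.
trait FromStrAs<T> {
    fn from_str_as(token: &str) -> Result<T, String>;
}

/// Marker type: a token made of hex digit pairs becomes a byte vector.
struct Hex;

impl FromStrAs<Vec<u8>> for Hex {
    fn from_str_as(token: &str) -> Result<Vec<u8>, String> {
        // ASCII check keeps the two-byte slicing below on char boundaries.
        if !token.is_ascii() || token.len() % 2 != 0 {
            return Err(format!("invalid hex string: {:?}", token));
        }
        (0..token.len())
            .step_by(2)
            .map(|i| {
                u8::from_str_radix(&token[i..i + 2], 16)
                    .map_err(|e| format!("invalid hex in {:?}: {}", token, e))
            })
            .collect()
    }
}

/// Marker type: whitespace-separated tokens, each parsed as `U` prescribes.
struct SpaceSeparated<U>(PhantomData<U>);

impl<T, U: FromStrAs<T>> FromStrAs<Vec<T>> for SpaceSeparated<U> {
    fn from_str_as(input: &str) -> Result<Vec<T>, String> {
        input.split_whitespace().map(U::from_str_as).collect()
    }
}
```

In this model `SpaceSeparated<Hex>` does need its type parameter, because the parameter selects the inner conversion: `<SpaceSeparated<Hex> as FromStrAs<Vec<Vec<u8>>>>::from_str_as("dead beef")` yields two byte vectors.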
@markazmierczak please publish the crate!
I've been working around this problem for a while by writing deserializer functions for every combination of container-foreign type that I use. The As idea looks very promising!
If I understand that right, that sounds pretty neat!
Would it allow me to do this?
// I have `Schedule` type that implements `Display` and `FromStr`.
#[derive(Serialize, Deserialize)]
struct Data {
#[serde(with = "As<HashMap<Schedule, String>>")]
cron_jobs: HashMap<String, String>
}
I would love to use that in my project right now. :slightly_smiling_face:
I tested the code a bit and so far it felt quite nice to use. I'm considering making this the basis for my crate serde_with, in which I already have different serde helpers. I'm tracking the progress at https://github.com/jonasbb/serde_with/issues/87
A couple of notes though. The type SameAs<T> is more complicated than needed. A Same without explicit generics also works, but it might be nice to be more explicit.
While a generic implementation like
impl<T: Serialize> SerializeAs<T> for T { .. }
is not possible, it is possible to implement it for every T manually, thus maybe avoiding the need for SameAs<T>/Same in common cases like
impl SerializeAs<i32> for i32 { .. }
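As a rough illustration of the per-type approach, the identity impls can be stamped out with a small macro. This is a dependency-free model rather than serde_with's code: the trait returns a `String` instead of using a serde `Serializer`, and `identity_serialize_as!`, `Hex`, and `render` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for the proposed trait (assumption): String output only.
trait SerializeAs<T> {
    fn serialize_as(source: &T) -> String;
}

// A blanket `impl<T: ToString> SerializeAs<T> for T` would be rejected as
// potentially overlapping with the container impl below, so the identity
// impls are generated one leaf type at a time instead.
macro_rules! identity_serialize_as {
    ($($ty:ty),* $(,)?) => {
        $(
            impl SerializeAs<$ty> for $ty {
                fn serialize_as(source: &$ty) -> String {
                    source.to_string()
                }
            }
        )*
    };
}

identity_serialize_as!(i32, u64, bool);

/// Marker type: a byte vector written as a quoted lowercase hex string.
struct Hex;

impl SerializeAs<Vec<u8>> for Hex {
    fn serialize_as(source: &Vec<u8>) -> String {
        let digits: String = source.iter().map(|b| format!("{:02x}", b)).collect();
        format!("\"{}\"", digits)
    }
}

/// Map impl: keys and values are each written as their marker prescribes.
impl<K, V, KU, VU> SerializeAs<BTreeMap<K, V>> for BTreeMap<KU, VU>
where
    KU: SerializeAs<K>,
    VU: SerializeAs<V>,
{
    fn serialize_as(source: &BTreeMap<K, V>) -> String {
        let entries: Vec<String> = source
            .iter()
            .map(|(k, v)| format!("{}:{}", KU::serialize_as(k), VU::serialize_as(v)))
            .collect();
        format!("{{{}}}", entries.join(","))
    }
}

/// Helper so the "as" type can be named with a turbofish at the call site.
fn render<T, U: SerializeAs<T>>(value: &T) -> String {
    U::serialize_as(value)
}
```

Because `i32` now describes itself, the "as" type can be written as `BTreeMap<i32, Hex>` with the unchanged key named directly, with no SameAs wrapper. The cost is that every leaf type needs an entry in the macro invocation.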
EDIT: serde_with v1.5.0 includes this now in a usable manner.