msgpack-rust
msgpack-rust copied to clipboard
Why serialize newtype struct as array?
https://github.com/3Hren/msgpack-rust/blob/master/rmp-serde/src/encode.rs#L375
Newtype is simply a wrapper around inner data, so it should encode/decode as inner data. You shouldn't really treat it as one-field struct IMO.
The same goes for newtype_variant: maybe it should be encoded as [index, value]? It would be the most compact representation.
Since there is no standard description how these types should be serialized using various encoders each crate does it on its own.
The best thing I can offer is a some kind of configurable Serializer.
@3Hren newtype struct is a special (type-system only) case, almost like in Haskell with it's newtype keyword. It's a type-system level wrapper around some other type. So it should be treated as a special case. serialize_newtype_struct has been added to Serde exactly for this purpose.
Consider some example:
struct MyVecWrapper(Vec<u32>);
enum MyEnum {
Variant1(MyVecWrapper),
Variant2(MyVecWrapper, MyVecWrapper),
}
MyEnum::Variant1(MyVecWrapper(vec![...])) would be serialized as [0, [[[...]]]], wouldn't it? That's quite a lot of overhead, while it could be simple [0, [...]] like with Haskell's data-msgpack package.
I got the point, thanks. Still, I need to check whether some of my crates has already tied with pack-as-array behaviour.
@3Hren could you please explain https://github.com/3Hren/msgpack-rust/blob/master/rmpv/src/ext/de.rs#L717 ?
newtype_variant is enum A {A(i32)}, right? So there's no way there could be T..., isn't it?
Probably a comment bug. Need to remember how it dammit works actually :).
What do you think should we do with unit structs/variants? Do nothing? Nil? Or leave serialization as an empty array?
May be make it configurable?
@3Hren Haskell's data-msgpack uses Nil representation for units. For unit variant we should use simple [code] encoding.
So:
// Nil
struct Unit
// [code]
enum Enum {
A,
B
}
I don't think configurations are required here.
Feels inconsistent. Why not encode unit variant with nil then:
// [code, nil]
enum Enum {
A,
B
}
But why? Unit struct would be encoded as Nil because it's the most compact representation while unit variants would be just [code] (the most compact would be code, but that one actually is inconsistent). Consider this example:
enum Enum {
A,
B(i32),
C(i32, i32),
}
// Enum::A - [0]
// Enum::B(0) - [1, 0]
// Enum::C(0, 1) - [2, [0, 1]]
This format is compact, unambiguous and separates unit variant, newtype variant and tuple (struct) variant. Structs would be consistent and unambiguous too:
struct A; // Nil
struct B(i32); // Value
struct C(i32, i32); // [value, value]
Consider:
enum Enum {
A, // [0]
B(i32), // [1, i32]
C((i32,)), // [2, [i32]]
D(i32, i32), // [3, [i32, i32]]
E((i32, i32)), // [4, [i32, i32]]
}
I feel this subjectively inconsistent. One can say that it's not. Why not to serialize D(i32, i32) as [3, i32, i32] without inner array and so on.
I don't see any inconsistensy here.
Why not to serialize D(i32, i32) as [3, i32, i32] without inner array and so on.
Newtype variant is a simple value wrapper, while struct/tuple variant is actually a structure. Think of a parser that should
- choose constructor (variant)
- parse value
For units it would end up with only choosing a constructor. For others (newtypes and structs/tuples) it would parse value too, which is a value for newtype variant, map/array for struct/tuple. Serde separates units, newtypes, tuples and structs, so we should just use that.
I just found out that a newtype struct is serialized as fixarray (0x91). I concur with @SX91 that newtypes are a special case and that they should be transparent. To quote from Serde docs:
Data formats are encouraged to treat newtype structs as insignificant wrappers around the inner value, serializing just the inner value. A reasonable implementation would be to forward to
value.serialize(self)
In the meantime, as a workaround, manually implementing Serialize and Deserialize for the newtype is probably the way to go, right?
Edit: Example for serializing a newtype containing an u8 array:
use serde::ser::{Serialize, Serializer, SerializeSeq};
impl Serialize for Cookie {
fn serialize<S>(&self, serializer: S) -> ::std::result::Result<S::Ok, S::Error>
where S: Serializer {
let mut seq = serializer.serialize_seq(Some(self.0.len()))?;
for byte in self.0.iter() {
seq.serialize_element(byte)?;
}
seq.end()
}
}
Deserializing is a bit more complicated.
@dbrgn I'd recommend using master branch for now as those changes are already there.
Yep, these are on master. I can't currently publish a new release, because I'm in the middle of developing a tool that tries to compile dependent crates and run their tests to figure out what will be broken after publishing (I know about semver, but sudden breaking isn't good). If there are - I feel that I need some kind of migration guide to be done before publishing.
I'd recommend using master branch for now as those changes are already there.
Ah, awesome!