msgpack-rust
Deserialize raw bytes into Vec<u8>
Maybe I missed it, but I'm unable to deserialize a raw binary value as a Vec<u8>:
#[test]
fn some_test() {
    let b = rmp_serde::to_vec(&rmpv::Value::Binary(vec![0u8; 16])).unwrap();
    let v: Vec<u8> = rmp_serde::from_read_ref(&b).unwrap();
}
running 1 test
thread 'tests::some_test' panicked at 'called `Result::unwrap()` on an `Err` value: Syntax("invalid type: byte array, expected a sequence")', src/lib.rs:503:26
stack backtrace:
< not relevant >
AFAICT, it's the same issue with fixed arrays.
The only workaround is to use rmp_serde::Raw, but that seems odd.
I think that's why serde_bytes exists.
Without specialization, Rust forces Serde to treat &[u8] just like any other slice and Vec<u8> just like any other vector. In reality this particular slice and vector can often be serialized and deserialized in a more efficient, compact representation in many formats.
So you're serializing the data as bytes, because Value::Binary's Serialize impl calls serialize_bytes. When deserializing, however, the generic Deserialize impl for Vec<T> expects a sequence of u8 values, not bytes. That mismatch causes the error you're seeing.
The problem has nothing to do with rmp in particular; it's a generic serde problem, which can be worked around with the serde_bytes crate mentioned above.
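To make the mismatch concrete, here is a rough std-only sketch of the two MessagePack wire representations involved. It is not rmp's code: encode_as_bin and encode_as_array are hypothetical helper names, and the sketch assumes the generic sequence path writes an array of compact unsigned integers while serialize_bytes writes the bin family, following the MessagePack spec.

```rust
/// Encode `data` with the MessagePack bin family (assumed to be what
/// `serialize_bytes` produces). Hypothetical helper, not an rmp API.
fn encode_as_bin(data: &[u8]) -> Vec<u8> {
    let len = data.len();
    let mut out = Vec::with_capacity(len + 5);
    if len <= u8::MAX as usize {
        out.push(0xc4); // bin 8: one length byte
        out.push(len as u8);
    } else if len <= u16::MAX as usize {
        out.push(0xc5); // bin 16: two big-endian length bytes
        out.extend_from_slice(&(len as u16).to_be_bytes());
    } else {
        out.push(0xc6); // bin 32: assumes len fits in u32
        out.extend_from_slice(&(len as u32).to_be_bytes());
    }
    out.extend_from_slice(data); // payload is copied in one go
    out
}

/// Encode `data` as a MessagePack array of unsigned integers (assumed to be
/// what the generic sequence path produces). Hypothetical helper.
fn encode_as_array(data: &[u8]) -> Vec<u8> {
    let len = data.len();
    let mut out = Vec::with_capacity(2 * len + 5);
    if len <= 15 {
        out.push(0x90 | len as u8); // fixarray: length lives in the marker
    } else if len <= u16::MAX as usize {
        out.push(0xdc); // array 16
        out.extend_from_slice(&(len as u16).to_be_bytes());
    } else {
        out.push(0xdd); // array 32: assumes len fits in u32
        out.extend_from_slice(&(len as u32).to_be_bytes());
    }
    for &b in data {
        // Each element is handled individually: values above 0x7f need a
        // uint 8 marker, smaller values are positive fixints.
        if b > 0x7f {
            out.push(0xcc);
        }
        out.push(b);
    }
    out
}

fn main() {
    let data = [0u8; 16];
    let bin = encode_as_bin(&data);
    let arr = encode_as_array(&data);
    println!("bin:   marker {:#04x}, {} bytes", bin[0], bin.len());
    println!("array: marker {:#04x}, {} bytes", arr[0], arr.len());
}
```

The two encodings start with different markers, so a deserializer that was asked for a sequence and finds a bin marker has to report a type error. The per-element loop in the array path is also a plausible reason the generic route is slower.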
Would it be possible to use Any to handle Vec<u8> with the serde_bytes approach by default, or at least as a config option in this crate?
https://doc.rust-lang.org/std/any/
I did some benchmarking, round-tripping Vec<u8> through MessagePack with and without serde_bytes, and found that the difference is huge, so huge that it's kind of dangerous how easy it is to forget to add serde_bytes.
For payloads of 1000 bytes or more, I'm seeing about 20 MiB/s to round trip the generic way and 1-2 GiB/s with serde_bytes.
round_trip_bytes/GenericBytesNewType/0
time: [133.96 ns 135.28 ns 136.71 ns]
thrpt: [0.0000 B/s 0.0000 B/s 0.0000 B/s]
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high mild
round_trip_bytes/GenericBytesNewType/1
time: [209.19 ns 209.51 ns 209.78 ns]
thrpt: [4.5460 MiB/s 4.5519 MiB/s 4.5590 MiB/s]
Found 2 outliers among 10 measurements (20.00%)
1 (10.00%) low mild
1 (10.00%) high severe
round_trip_bytes/GenericBytesNewType/1000
time: [47.786 us 47.846 us 47.953 us]
thrpt: [19.888 MiB/s 19.932 MiB/s 19.957 MiB/s]
round_trip_bytes/GenericBytesNewType/1000000
time: [46.991 ms 47.062 ms 47.165 ms]
thrpt: [20.220 MiB/s 20.264 MiB/s 20.295 MiB/s]
round_trip_bytes/SpecializedBytesNewType/0
time: [215.04 ns 215.77 ns 217.15 ns]
thrpt: [0.0000 B/s 0.0000 B/s 0.0000 B/s]
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high severe
round_trip_bytes/SpecializedBytesNewType/1
time: [230.90 ns 231.50 ns 231.90 ns]
thrpt: [4.1124 MiB/s 4.1195 MiB/s 4.1303 MiB/s]
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high severe
round_trip_bytes/SpecializedBytesNewType/1000
time: [364.77 ns 365.29 ns 365.92 ns]
thrpt: [2.5451 GiB/s 2.5496 GiB/s 2.5532 GiB/s]
round_trip_bytes/SpecializedBytesNewType/1000000
time: [726.08 us 728.63 us 730.26 us]
thrpt: [1.2753 GiB/s 1.2782 GiB/s 1.2827 GiB/s]
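For anyone wanting to sanity-check the thrpt figures, throughput is just payload size divided by time per iteration. A small sketch using the median times above for the 1000-byte payload (the throughput helper and the MIB/GIB constants are names made up for this example):

```rust
const MIB: f64 = 1024.0 * 1024.0;
const GIB: f64 = 1024.0 * MIB;

/// Bytes per second for a payload of `bytes` round-tripped in `seconds`.
fn throughput(bytes: f64, seconds: f64) -> f64 {
    bytes / seconds
}

fn main() {
    let payload = 1000.0; // bytes per iteration
    let generic_secs = 47.846e-6; // median for GenericBytesNewType/1000
    let specialized_secs = 365.29e-9; // median for SpecializedBytesNewType/1000

    let generic = throughput(payload, generic_secs);
    let specialized = throughput(payload, specialized_secs);

    println!("generic:     {:.3} MiB/s", generic / MIB);
    println!("specialized: {:.4} GiB/s", specialized / GIB);
    // Ratio of the two median times, i.e. the speedup from serde_bytes.
    println!("speedup:     {:.0}x", generic_secs / specialized_secs);
}
```

The same arithmetic applies to the 1000000-byte rows by swapping in their payload size and median times.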