msgpack-rust
Is it possible to get the number of bytes read immediately after deserializing from a slice?
I'm trying to use rmp_serde to send and receive entire messages (enums or structs) via a BytesMut buffer that gets populated and emptied by a different part of the system (in this case a TcpStream).
I don't have any custom framing set up, so I'm wondering if I could use rmp_serde to tell me the exact size of the serialized representation of the data (i.e. the count of bytes it deserialized) immediately after it has successfully parsed a portion of the stream into the specified type.
If there already exists an approach I apologize for having missed it. Please feel free to point me in the right direction.
I'm picturing an API like:
/// Deserialize a slice into a deserializable data type and return a count of
/// the bytes deserialized if the deserialization was successful.
pub fn from_slice_with_size<'a, T>(input: &'a [u8]) -> (Result<T, Error>, Option<usize>)
where
    T: Deserialize<'a>,
{
    ...
}
Here's an example usage:
use bytes::{Buf, BytesMut};
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
pub enum Foo {
    Bar(String),
    Baz,
}

pub struct Container {
    pub buffer: BytesMut,
}

impl Container {
    ...
    fn read_foo(&mut self) -> Result<Option<Foo>, Box<dyn std::error::Error>> {
        if let Ok(foo) = rmp_serde::from_slice(&self.buffer) {
            // Currently, to get the byte count I have to serialize it again.
            // This has to be slower than keeping track of the bytes deserialized
            // while deserializing.
            let bytes_serialized = rmp_serde::encode::to_vec(&foo)?.len();
            self.buffer.advance(bytes_serialized);
            Ok(Some(foo))
        } else {
            // No complete message could be decoded (e.g. not enough data yet).
            Ok(None)
        }
    }

    /// This doesn't work because there is no `from_slice_with_size` method but
    /// it'd be neat if there was something that keeps track
    /// of and outputs the size of the bytes deserialized.
    fn read_foo_with_size(&mut self) -> Result<Option<Foo>, Box<dyn std::error::Error>> {
        if let (Ok(foo), Some(size)) = rmp_serde::from_slice_with_size(&self.buffer) {
            self.buffer.advance(size);
            Ok(Some(foo))
        } else {
            Ok(None)
        }
    }
    ...
    // Some other part takes a `&mut self` and populates the buffer.
    fn fill_up_buffer(&mut self) {
        stream.read_buf(&mut self.buffer).unwrap();
    }
}
No. The serde API can only return the final result once it's 100% complete.
You could use the lower-level rmp to read individual items as they come. Or you could use something else around msgpack messages to split them into chunks (it could be as simple as sending <length><data> pieces over the stream).
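To make the <length><data> idea concrete, here is a minimal sketch using only the standard library. The names write_frame and read_frame are hypothetical, and the 4-byte big-endian length prefix is an assumption for illustration; any fixed-width prefix that both sides agree on would work. The payload would be whatever bytes the msgpack serializer produced.

```rust
use std::io::{self, Read, Write};

/// Writes one frame: a 4-byte big-endian length followed by the payload.
/// (Hypothetical helper; the prefix width is an arbitrary choice.)
fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > u32::MAX as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "payload too large"));
    }
    w.write_all(&(payload.len() as u32).to_be_bytes())?;
    w.write_all(payload)
}

/// Reads one frame written by `write_frame`, blocking until it is complete.
/// Fails with `UnexpectedEof` if the stream ends in the middle of a frame.
fn read_frame<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut len_bytes = [0u8; 4];
    r.read_exact(&mut len_bytes)?;
    let len = u32::from_be_bytes(len_bytes) as usize;
    let mut payload = vec![0u8; len];
    r.read_exact(&mut payload)?;
    Ok(payload)
}
```

Each Vec<u8> returned by read_frame holds exactly one message, so it can be handed to rmp_serde::from_slice without needing to know how many bytes the deserializer consumed.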
Hi!
I am serializing a series of structs, and writing them individually to a binary file.
Could you please show which function in rmp::decode I should use to get the packed struct size? I tried several, and rmp::decode::read_map_len seems to be the most suitable choice; however, it returns 1, which is clearly not the size.
And could you please elaborate on the 'using something else around msgpack messages'? Do you suggest just writing custom bytes after each struct, and then splitting the data by that divider?
Thanks in advance.
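For context on the read_map_len result: in the MessagePack format, a map header records the number of key/value pairs, not the number of bytes that follow. A return value of 1 would therefore be consistent with a map holding a single entry, though that reading of the result above is a guess, since it depends on how the struct was encoded. The sketch below decodes the header by hand to show the difference. The function map_entry_count is a hypothetical helper written against the format itself, not part of the rmp API.

```rust
/// Returns the entry count declared by a MessagePack map header at the start
/// of `bytes`, or `None` if `bytes` does not begin with a complete map header.
/// (Hypothetical helper for illustration only.)
fn map_entry_count(bytes: &[u8]) -> Option<u32> {
    match *bytes.first()? {
        // fixmap: 1000xxxx, entry count stored in the low four bits
        m @ 0x80..=0x8f => Some(u32::from(m & 0x0f)),
        // map 16: marker byte followed by a big-endian u16 entry count
        0xde => {
            let b = bytes.get(1..3)?;
            Some(u32::from(u16::from_be_bytes([b[0], b[1]])))
        }
        // map 32: marker byte followed by a big-endian u32 entry count
        0xdf => {
            let b = bytes.get(1..5)?;
            Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
        }
        _ => None,
    }
}
```

For example, the map {"a": 1} encodes as the four bytes 0x81 0xa1 0x61 0x01. The header reports one entry while the encoded value occupies four bytes, so the entry count cannot be used to find where one record ends and the next begins.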
UPD: as a workaround, one can simply write the size of the serialized struct before the actual data:
// encoding (sketch)
// a fixed-width, fixed-endianness prefix keeps the file portable
writer.write_all(&(serialized_bytes.len() as u64).to_le_bytes())?;
writer.write_all(&serialized_bytes)?;

// decoding
while !buf.is_empty() {
    // first, read the size of the packed data (8 bytes, little-endian,
    // rather than a native-endian usize whose width depends on the platform)
    let size = u64::from_le_bytes(buf[0..8].try_into().unwrap()) as usize;
    // trim the buffer so that it starts with the actual data
    buf = &buf[8..];
    // parse the serialized record
    let record = rmp_serde::from_slice::<Record>(&buf[..size]);
    // cut out the data
    buf = &buf[size..];
}
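The loop above assumes the whole file is already in memory. For the streaming case from the original question, where the buffer may end in the middle of a record, a variant that reports "not enough data yet" instead of panicking could look like the sketch below. It assumes an 8-byte little-endian length prefix; take_frame and PREFIX are hypothetical names.

```rust
/// Size of the length prefix in bytes (assumed: u64, little-endian).
const PREFIX: usize = 8;

/// Tries to split one complete frame off the front of `buf`.
/// Returns `(payload, total_bytes_consumed)`, or `None` if more data is needed.
fn take_frame(buf: &[u8]) -> Option<(&[u8], usize)> {
    // Not even a full length prefix yet.
    let prefix = buf.get(..PREFIX)?;
    let mut raw = [0u8; PREFIX];
    raw.copy_from_slice(prefix);
    let len = u64::from_le_bytes(raw);
    // Compare in u64 so an oversized length cannot wrap on 32-bit targets.
    let available = (buf.len() - PREFIX) as u64;
    if len > available {
        return None;
    }
    let end = PREFIX + len as usize;
    Some((&buf[PREFIX..end], end))
}
```

When take_frame returns Some((payload, used)), the payload can be passed to rmp_serde::from_slice and the buffer advanced by used bytes (for a BytesMut, with advance). When it returns None, the caller keeps the buffer untouched and waits for more data.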