Streaming API
Are there any plans for a streaming API? The ability to serialize/deserialize impl Read and impl Write.
I want to be able to deserialize from a TcpStream.
bincode and postcard both support this.
Are there any plans for a streaming API? The ability to serialize/deserialize impl Read and impl Write.
For our use case we don't need this feature so I am hesitant to add and maintain it.
From an API perspective it would involve duplicating encode into encode_into(w: &mut impl Write, t: &impl Encode) -> Result<(), Error> and decode into decode_from<T: Decode>(r: &mut impl Read) -> Result<T, Error>.
From an internal code perspective it would need to avoid regressing the current performance without duplicating too much code.
I want to be able to deserialize from a TcpStream.
Can you read from the TcpStream into a Vec<u8> and pass that to bitcode?
Are your messages so large that they would consume too much memory? I kind of doubt this because serialized bitcode typically consumes less memory than the deserialized type does.
Can you read from the TcpStream into a Vec<u8> and pass that to bitcode?
Yes but that vector could include multiple structs. Postcard has a method take_from_bytes which returns the slice of unused bytes.
From an internal code perspective it would need to avoid regressing the current performance without duplicating too much code.
I was worried that streaming wasn't possible because bitcode relied on knowing where the serialized data ends. In hindsight I realize that doesn't make sense.
that vector could include multiple structs
We work exclusively with WebSockets, which provide their own framing of messages. The easiest way to use bitcode on a raw TcpStream might be to transmit the length (e.g. a 4-byte unsigned integer in network byte order) and then the bytes from bitcode.
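A minimal sketch of that length-prefix framing, using only the standard library. The names write_frame, read_frame and max_len are made up for illustration and are not part of bitcode; payload stands in for whatever bytes the encoder returned.

```rust
use std::io::{self, Read, Write};

/// Writes one frame: a 4-byte big-endian length followed by the payload.
/// `payload` stands in for the bytes produced by the encoder.
fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > u32::MAX as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "payload too large"));
    }
    let len = payload.len() as u32;
    w.write_all(&len.to_be_bytes())?;
    w.write_all(payload)
}

/// Reads one frame written by `write_frame`.
/// `max_len` is a hypothetical safety limit so that a corrupt or hostile
/// length field cannot trigger a huge allocation.
fn read_frame<R: Read>(r: &mut R, max_len: usize) -> io::Result<Vec<u8>> {
    let mut header = [0u8; 4];
    r.read_exact(&mut header)?;
    let len = u32::from_be_bytes(header) as usize;
    if len > max_len {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut payload = vec![0u8; len];
    r.read_exact(&mut payload)?;
    Ok(payload)
}
```

Both functions are generic over Read and Write, so they should work with a TcpStream as well as an in-memory buffer. The Vec<u8> returned by read_frame would then be handed to the decoder.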
Yes, that's a good solution thanks. But first I might take a crack at modifying the bitcode codebase to allow for reading a slice partially or reading from a stream.
I think having a way to pack multiple types into one big packet is quite an essential feature, when encoding I guess we can just .extend_from_slice() on the slice from Buffer::encode. But there seems to currently be no way to read multiple messages packed together without including the length of each message (which afaict would be redundant information).
In my use case (sending game data in UDP packets) I currently use a Cursor and decode messages (using bincode) in a loop until it has consumed the entire packet, but even something as simple as getting (T, usize) as a return value, where usize is the number of bytes that were decoded, would be enough.
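To make the (T, usize) idea concrete, here is a rough sketch of the loop it would enable. decode_all is a made-up helper, and decode_prefix is a hypothetical decoder that returns the value plus the number of bytes consumed; bitcode does not currently expose such a function. decode_u16 is a toy stand-in so the sketch compiles on its own.

```rust
/// Decodes messages back to back from one packet.
/// `decode_prefix` is a hypothetical decoder: it returns the decoded value and
/// the number of bytes it consumed, or `None` if the bytes are invalid.
fn decode_all<T>(
    mut packet: &[u8],
    decode_prefix: impl Fn(&[u8]) -> Option<(T, usize)>,
) -> Option<Vec<T>> {
    let mut messages = Vec::new();
    while !packet.is_empty() {
        let (message, used) = decode_prefix(packet)?;
        if used == 0 || used > packet.len() {
            return None; // the decoder must make progress and stay in bounds
        }
        messages.push(message);
        packet = &packet[used..]; // continue where the decoder left off
    }
    Some(messages)
}

/// Toy stand-in for a real decoder: each message is a little-endian u16.
fn decode_u16(bytes: &[u8]) -> Option<(u16, usize)> {
    if bytes.len() < 2 {
        return None;
    }
    Some((u16::from_le_bytes([bytes[0], bytes[1]]), 2))
}
```

Calling decode_all(&packet, decode_u16) walks the packet until every byte has been consumed, and returns None if a trailing fragment cannot be decoded.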
but even something as simple as getting (T, usize) as a return value, where usize is the number of bytes that were decoded, would be enough.
This would still result in redundant information since each message would be padded to the nearest byte.
Well that's a bummer
This would still result in redundant information since each message would be padded to the nearest byte.
It's necessary for TCP streams which don't support transmitting fractional bytes, unless the end of each message waited until the start of the next message.
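For a sense of scale, the padding cost can be worked out with a few lines of arithmetic. The function names and the bit counts used in the note below are invented for illustration and do not come from bitcode.

```rust
/// Bytes needed to hold `bits` bits when padded up to a whole byte.
fn padded_bytes(bits: usize) -> usize {
    (bits + 7) / 8
}

/// Total bytes when every message is padded to a byte boundary on its own.
fn separate_bytes(message_bits: &[usize]) -> usize {
    message_bits.iter().map(|&bits| padded_bytes(bits)).sum::<usize>()
}

/// Total bytes when the messages are packed bit by bit and padded once at the end.
fn packed_bytes(message_bits: &[usize]) -> usize {
    padded_bytes(message_bits.iter().sum::<usize>())
}
```

With hypothetical messages of 3, 5 and 9 bits, separate_bytes gives 4 bytes and packed_bytes gives 3. The overhead is always less than one byte per message, so it only matters when messages are very small.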
Well that's a bummer
To be clear, a streaming API in the sense of impl Read + Write is not planned due to performance and compatibility issues.
We're considering an API that allows you to:
- append messages with minimal padding and no 'length' field
- decode the prefix of received data as a message and know where the decoder left off
This would slightly reduce the overhead of using bitcode in a stream-like context.
Edit: Closing this issue may have been premature. I've reopened it until there is an issue more focused on what we can actually implement.
decode the prefix of received data as a message and know where the decoder left off
❤️
New version of bitcode https://github.com/SoftbearStudios/bitcode/pull/19 has the potential to add streaming without high overhead.
Looks like the code actually would work great with streaming APIs, if only the codec mod was publicly available. Or, rather, the View and Decoder traits for decoding.
I'd have a loop with roughly this:
- Try T::populate(1) on the buffer;
- Fails? Read and buffer more data;
- Works? Decode and return the value, advance the buffer to match what T::populate did to the slice we gave it.
Note that this way bitcode crate does not do any IO itself - external code would be responsible for that. This is the way I'd recommend doing it, as async exists and there everyone has their own traits for read/write ops.
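The loop described above might look roughly like the sketch below, with the IO kept outside the codec. read_message is a made-up helper, and try_decode is a hypothetical stand-in for the populate-then-decode step: it is assumed to return None when the buffer does not yet hold a whole message, and otherwise the value plus the number of bytes consumed. A blocking std::io::Read is used only to keep the sketch short; an async reader would follow the same shape.

```rust
use std::io::{self, Read};

/// Reads from `source` until `try_decode` can decode one value from the front
/// of `buffer`, then removes the consumed bytes from `buffer`.
/// `try_decode` is hypothetical: it returns `None` while the buffer is too
/// short, and `Some((value, bytes_consumed))` once a whole message is present.
fn read_message<R: Read, T>(
    source: &mut R,
    buffer: &mut Vec<u8>,
    try_decode: impl Fn(&[u8]) -> Option<(T, usize)>,
) -> io::Result<T> {
    let mut chunk = [0u8; 1024];
    loop {
        if let Some((value, used)) = try_decode(buffer.as_slice()) {
            // Advance the buffer past the bytes the decoder consumed.
            let used = used.min(buffer.len());
            buffer.drain(..used);
            return Ok(value);
        }
        // Not enough data yet: read more and try again.
        let n = source.read(&mut chunk)?;
        if n == 0 {
            return Err(io::Error::new(
                io::ErrorKind::UnexpectedEof,
                "stream ended in the middle of a message",
            ));
        }
        buffer.extend_from_slice(&chunk[..n]);
    }
}
```

The caller owns the buffer, so bytes left over after one message stay available for the next call.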
I'd have a loop with roughly this:
- Try T::populate(1) on the buffer;
- Fails? Read and buffer more data;
- Works? Decode and return the value, advance the buffer to match what T::populate did to the slice we gave it.
You could achieve the same effect by length prefixing your messages. If your messages are long, this shouldn't add much overhead. If your messages are short and you encode them one at a time, bitcode won't provide any benefit over bincode.
Note: If your use-case is packing multiple small messages into a UDP packet see bitcode_packet_packer. It's able to encode multiple messages at once, but produces discrete packets that don't exceed a limit. Included is a benchmark of various techniques including encoding messages one at a time.
This is, obviously, a workaround that is universal and well known, and it is what I'm using currently. I'm interested in bitcode specifically providing this functionality, not because there's no other way but because bitcode already has everything that is required to do it.
Thanks for sharing the bitcode_packet_packer.
My use case is passing data over WebTransport streams - and currently it is for an example app. It is, basically, sending a packet and waiting for a reply, very old fashioned state machine on both ends without the need to pack multiple messages at once.
I'm currently using the tokio_util::codec::length_delimited::LengthDelimitedCodec - but just that, without the rest of FramedCodec infrastructure, as we don't use tokio io types.