Better support for byte ordered reads and writes?

Open yoshuawuyts opened this issue 5 years ago • 6 comments

Something that came up today was the question of how to read and write bytes with a specific endianness. @goto-bus-stop replied in chat with the following:

let mut bytes = [0; 4];
input.read_exact(&mut bytes)?;
let num = u32::from_le_bytes(bytes);

However they also pointed out that using byteorder one could do:

let num = input.read_u32::<LE>()?;

Which seems quite nice. A port of this functionality exists for Tokio in the form of tokio-byteorder, with support for futures::io::{AsyncRead, AsyncWrite} currently in the works.
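For comparison, the manual approach from the chat snippet can be wrapped in a small helper. This is only a sketch against the synchronous std::io::Read trait; the name read_u32_le is made up for illustration and is not part of std, byteorder, or async-std.

```rust
use std::io::{self, Read};

/// Hypothetical helper: read exactly four bytes and decode them as a
/// little-endian u32.
fn read_u32_le<R: Read>(input: &mut R) -> io::Result<u32> {
    let mut bytes = [0u8; 4];
    // `read_exact` returns an `UnexpectedEof` error if fewer than four
    // bytes are left in the reader.
    input.read_exact(&mut bytes)?;
    Ok(u32::from_le_bytes(bytes))
}
```

Every integer width and byte order needs its own helper like this, which is the boilerplate that byteorder's read_u32::<LE>() hides.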

Design questions

It's already possible to read and write bytes with a certain endianness without any issues; the hard parts are taken care of. It just doesn't feel quite ergonomic yet. So what I'm wondering is whether we could improve the status quo somewhat by providing support for this out of the box.

Writing bytes is fortunately already a one-liner:

use async_std::prelude::*;
use async_std::io::{self, prelude::*};

#[async_std::main]
async fn main() -> io::Result<()> {
    let mut stdout = io::stdout();
    stdout.write_all(&12_u16.to_le_bytes()).await?;
    Ok(())
}
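To make the byte layout concrete, here is a synchronous sketch of the same one-liner using std::io::Write. The function name write_both_orders is hypothetical; it only shows how to_le_bytes and to_be_bytes differ for the same value.

```rust
use std::io::{self, Write};

/// Hypothetical helper: write `n` twice, first little-endian, then big-endian.
fn write_both_orders<W: Write>(out: &mut W, n: u16) -> io::Result<()> {
    out.write_all(&n.to_le_bytes())?; // least significant byte first
    out.write_all(&n.to_be_bytes())?; // most significant byte first
    Ok(())
}
```

Writing 12_u16 this way into a Vec<u8> yields the bytes [12, 0, 0, 12].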

Byteorder inspired

Reading bytes isn't a one-liner yet, though. We could probably do better here, and I see a few options. The first is to follow byteorder's lead and add 16 methods on the Read trait, two Endianness enums, and a NativeEndian type alias:

// Proposed API: `BigEndian` and `read_u16` are not implemented in async_std::io.
use async_std::io::{self, Cursor, prelude::*, BigEndian};

#[async_std::main]
async fn main() -> io::Result<()> {
    let mut reader = Cursor::new(vec![2, 5, 3, 0]);
    assert_eq!(517, reader.read_u16::<BigEndian>().await?);
    assert_eq!(768, reader.read_u16::<BigEndian>().await?);
    Ok(())
}
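A rough idea of what sits behind such an API, sketched for synchronous std::io::Read and for u16 only. The names ByteOrder, BigEndian, LittleEndian and ReadBytesExt mirror byteorder's naming, but the definitions below are simplified assumptions, not that crate's actual code.

```rust
use std::io::{self, Read};

/// Simplified stand-in for a byte-order trait (only u16 is covered here).
trait ByteOrder {
    fn u16_from_bytes(bytes: [u8; 2]) -> u16;
}

/// Uninhabited marker types: used only as type parameters, never constructed.
enum BigEndian {}
enum LittleEndian {}

impl ByteOrder for BigEndian {
    fn u16_from_bytes(bytes: [u8; 2]) -> u16 {
        u16::from_be_bytes(bytes)
    }
}

impl ByteOrder for LittleEndian {
    fn u16_from_bytes(bytes: [u8; 2]) -> u16 {
        u16::from_le_bytes(bytes)
    }
}

/// Extension trait: the byte order is chosen with a turbofish at the call site.
trait ReadBytesExt: Read {
    fn read_u16<B: ByteOrder>(&mut self) -> io::Result<u16> {
        let mut bytes = [0u8; 2];
        self.read_exact(&mut bytes)?;
        Ok(B::u16_from_bytes(bytes))
    }
}

/// Blanket impl so every reader gets the method.
impl<R: Read + ?Sized> ReadBytesExt for R {}
```

With this in scope, reader.read_u16::<BigEndian>() works on any reader. Covering all 16 number types means one method per type on both traits.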

std inspired

Another option seems to be to add 48 new methods on the Read trait (3 endianness * 16 nums), and try to follow std's naming conventions more closely:

// Proposed API: `read_u16_be` is not implemented in async_std::io.
use async_std::io::{self, Cursor, prelude::*};

#[async_std::main]
async fn main() -> io::Result<()> {
    let mut reader = Cursor::new(vec![2, 5, 3, 0]);
    assert_eq!(517, reader.read_u16_be().await?);
    assert_eq!(768, reader.read_u16_be().await?);
    Ok(())
}
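The method count is the main cost of this design. The sketch below, again for synchronous std::io::Read, generates the methods with a macro; ReadIntExt and read_int! are hypothetical names, and only u16 and u32 are spelled out.

```rust
use std::io::{self, Read};

/// Generates one default method: read `size_of::<$ty>()` bytes, then decode
/// them with the given std constructor.
macro_rules! read_int {
    ($name:ident, $ty:ty, $from:ident) => {
        fn $name(&mut self) -> io::Result<$ty> {
            let mut bytes = [0u8; std::mem::size_of::<$ty>()];
            self.read_exact(&mut bytes)?;
            Ok(<$ty>::$from(bytes))
        }
    };
}

/// Hypothetical extension trait with suffix-style names.
trait ReadIntExt: Read {
    read_int!(read_u16_be, u16, from_be_bytes);
    read_int!(read_u16_le, u16, from_le_bytes);
    read_int!(read_u16_ne, u16, from_ne_bytes);
    read_int!(read_u32_be, u32, from_be_bytes);
    read_int!(read_u32_le, u32, from_le_bytes);
    read_int!(read_u32_ne, u32, from_ne_bytes);
}

/// Blanket impl so every reader gets the methods.
impl<R: Read + ?Sized> ReadIntExt for R {}
```

Each number type adds three lines to the trait, so the implementation stays small even though the documented surface grows to 48 methods.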

using traits

The third option, and I have no idea if this works (we should test this), is to add two new methods on the Read trait plus a trait that we implement for all number types. The methods can then be generic over the number type, and each type knows how to decode itself:

// Proposed API: `read_be_bytes` is not implemented in async_std::io.
use async_std::io::{self, Cursor, prelude::*};

#[async_std::main]
async fn main() -> io::Result<()> {
    let mut reader = Cursor::new(vec![2, 5, 3, 0]);
    assert_eq!(517_u16, reader.read_be_bytes().await?);
    assert_eq!(768_u16, reader.read_be_bytes().await?);
    Ok(())
}

This last approach is somewhat iffy because the new trait would show up in the function signatures, which means we'd have to expose it (but wouldn't want people to implement it). We could make it a sealed trait, but I'm not a fan of doing that.
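Here is a minimal sketch of the trait-based option for synchronous std::io::Read, including the sealed-trait variant mentioned above. FromBytes, ReadBytesExt and the sealed module are hypothetical names chosen for illustration.

```rust
use std::io::{self, Read};

// Private module: code outside this crate cannot name `Sealed`,
// so it cannot implement `FromBytes` for its own types.
mod sealed {
    pub trait Sealed {}
}

/// Hypothetical trait: a number type that can decode itself from a reader.
pub trait FromBytes: sealed::Sealed + Sized {
    fn read_be<R: Read + ?Sized>(reader: &mut R) -> io::Result<Self>;
    fn read_le<R: Read + ?Sized>(reader: &mut R) -> io::Result<Self>;
}

macro_rules! impl_from_bytes {
    ($($ty:ty),*) => {$(
        impl sealed::Sealed for $ty {}
        impl FromBytes for $ty {
            fn read_be<R: Read + ?Sized>(reader: &mut R) -> io::Result<Self> {
                let mut bytes = [0u8; std::mem::size_of::<$ty>()];
                reader.read_exact(&mut bytes)?;
                Ok(<$ty>::from_be_bytes(bytes))
            }
            fn read_le<R: Read + ?Sized>(reader: &mut R) -> io::Result<Self> {
                let mut bytes = [0u8; std::mem::size_of::<$ty>()];
                reader.read_exact(&mut bytes)?;
                Ok(<$ty>::from_le_bytes(bytes))
            }
        }
    )*};
}

impl_from_bytes!(u8, u16, u32, u64, i8, i16, i32, i64, f32, f64);

/// Only two methods on the reader side; the return type picks the decoder.
pub trait ReadBytesExt: Read {
    fn read_be_bytes<T: FromBytes>(&mut self) -> io::Result<T> {
        T::read_be(self)
    }
    fn read_le_bytes<T: FromBytes>(&mut self) -> io::Result<T> {
        T::read_le(self)
    }
}

impl<R: Read + ?Sized> ReadBytesExt for R {}
```

A call such as `let n: u16 = reader.read_be_bytes()?;` selects the decoder through type inference. FromBytes does appear in the public signatures, as noted above, and the sealed supertrait is what keeps downstream crates from implementing it.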

It seems like https://internals.rust-lang.org/t/pre-rfc-safe-transmute/11347 might be proposing a trait that could potentially cover this, but I'm unsure about the exact implications and relation to this. Maybe we should bring it up?

If we could find a way to make this work, this would definitely be my preferred option, as it's easy to add a counterpart to Write as well (creating symmetry, and an even smaller one-liner). But that's a big if, because there seem to be quite a few hurdles.
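The Write counterpart could look roughly like this, sketched for synchronous std::io::Write. ToBytes and WriteBytesExt are hypothetical names, and returning io::Result<()> is an assumption made to keep the example short.

```rust
use std::io::{self, Write};

/// Hypothetical trait: a number type that can encode itself into a writer.
trait ToBytes {
    fn write_be<W: Write + ?Sized>(&self, writer: &mut W) -> io::Result<()>;
    fn write_le<W: Write + ?Sized>(&self, writer: &mut W) -> io::Result<()>;
}

macro_rules! impl_to_bytes {
    ($($ty:ty),*) => {$(
        impl ToBytes for $ty {
            fn write_be<W: Write + ?Sized>(&self, writer: &mut W) -> io::Result<()> {
                writer.write_all(&self.to_be_bytes())
            }
            fn write_le<W: Write + ?Sized>(&self, writer: &mut W) -> io::Result<()> {
                writer.write_all(&self.to_le_bytes())
            }
        }
    )*};
}

impl_to_bytes!(u8, u16, u32, u64, i8, i16, i32, i64, f32, f64);

/// Two methods on the writer side, mirroring the two on the reader side.
trait WriteBytesExt: Write {
    fn write_be_bytes<T: ToBytes>(&mut self, value: T) -> io::Result<()> {
        value.write_be(self)
    }
    fn write_le_bytes<T: ToBytes>(&mut self, value: T) -> io::Result<()> {
        value.write_le(self)
    }
}

impl<W: Write + ?Sized> WriteBytesExt for W {}
```

With this, the write side shrinks to `out.write_le_bytes(12_u16)?`, with no explicit to_le_bytes call or borrow.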

Conclusion

I've talked about the current state of reading and writing bytes from async_std::io::{Read, Write}, and explored possible directions to improve this.

This is not something we need to find a solution for immediately, but if we can figure it out, it'll make writing certain programs easier for sure. Thanks!

yoshuawuyts avatar Nov 23 '19 16:11 yoshuawuyts

https://github.com/jonhoo/tokio-byteorder/pull/2

EDIT: Oops, sorry. Didn't notice that you've already pointed this out.

tekjar avatar Nov 23 '19 20:11 tekjar

From https://internals.rust-lang.org/t/pre-rfc-v2-safe-transmute/11431:

Transmute deals with in-memory data in-place, and thus does not have any provisions to perform translations between native endianness and non-native endianness.

So the traits from the transmute proposal won't work for us.
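A small std-only illustration of the distinction: reinterpreting bytes in place behaves like from_ne_bytes, so the result depends on the platform, while from_be_bytes always performs the translation. The two function names are made up for this example.

```rust
/// Decode in the platform's native order, which is what an in-place
/// reinterpretation of the bytes amounts to.
fn decode_native(bytes: [u8; 2]) -> u16 {
    u16::from_ne_bytes(bytes)
}

/// Decode as big-endian on every platform.
fn decode_big_endian(bytes: [u8; 2]) -> u16 {
    u16::from_be_bytes(bytes)
}
```

For the bytes [2, 5], decode_big_endian returns 517 everywhere, while decode_native returns 1282 on a little-endian machine. The missing translation step is exactly u16::from_be applied to the native value.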

yoshuawuyts avatar Dec 07 '19 01:12 yoshuawuyts

The latest tokio (0.2.3) has support for reading/writing integers. They chose names like read_u32, which always uses network byte order (aka big endian); there are no little-endian variants.
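When an API only decodes big-endian integers, a value that was stored little-endian can still be recovered by swapping the bytes of the decoded integer. The sketch below uses only std; the function name le_from_be_decoded is hypothetical and no tokio API is involved.

```rust
/// Given an integer that was decoded as big-endian, return the value the same
/// four bytes represent when interpreted as little-endian.
fn le_from_be_decoded(n: u32) -> u32 {
    n.swap_bytes()
}
```

For example, the bytes [0x78, 0x56, 0x34, 0x12] decode to 0x78563412 as big-endian, and swapping gives 0x12345678, the same result as u32::from_le_bytes.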

sdroege avatar Dec 07 '19 07:12 sdroege

Implemented the trait-based design for std's Read / Write types in https://docs.rs/omnom:

use std::io::{Cursor, Seek, SeekFrom};
use omnom::prelude::*;

let mut buf = Cursor::new(vec![0; 15]);

// Write this u16 as little-endian bytes.
let num = 12_u16;
buf.write_le_bytes(num).unwrap();

buf.seek(SeekFrom::Start(0)).unwrap();

// Read a u16 from little-endian bytes.
let num: u16 = buf.read_le_bytes().unwrap();
assert_eq!(num, 12);

This feels like the right choice; very similar to std. Should be trivially portable to async-std as well.

yoshuawuyts avatar Dec 08 '19 04:12 yoshuawuyts

Oh also for the record: I wrote about this topic in long-form a while ago: https://blog.yoshuawuyts.com/byte-ordered-stream-parsing/

yoshuawuyts avatar Feb 17 '20 13:02 yoshuawuyts

Is there any progress on this? Is there anything I can use that has similar support to the byteorder crate?

halvors avatar Dec 20 '22 22:12 halvors