Add `FromBytes`/`IntoBytes` methods which read from/write to an `io::Read`/`io::Write`
Progress
- [ ] Add
FromBytes::read_from_ioandIntoBytes::write_to_ioin 0.8 - [ ] In 0.9, perform the following renames:
-
IntoBytes::write_to->write_to_bytes(to be consistent withFromBytes::read_from_bytes) - Rename
FromBytes::read_from_io->read_fromandIntoBytes::write_to_io->write_to - Any similar rename for
TryFromBytesmethod if #2619 (or an equivalent PR) lands
-
Original text
Crosvm has a utility function called zerocopy_from_reader:
pub fn zerocopy_from_reader<R: io::Read, T: FromBytes>(mut read: R) -> io::Result<T> {
// Allocate on the stack via `MaybeUninit` to ensure proper alignment.
let mut out = MaybeUninit::zeroed();
// Safe because the pointer is valid and points to `size_of::<T>()` bytes of zeroes,
// which is a properly initialized value for `u8`.
let buf = unsafe { from_raw_parts_mut(out.as_mut_ptr() as *mut u8, size_of::<T>()) };
read.read_exact(buf)?;
// Safe because any bit pattern is considered a valid value for `T`.
Ok(unsafe { out.assume_init() })
}
Maybe we should add something similar to FromBytes (and potentially a write analogue to AsBytes)?
@nwplayer123 has also requested this feature.
The obvious name, IMO, for this is read_from, but that name is currently squatted by a #[doc(hidden)] and #[deprecated].
Since it's marked #[doc(hidden)], it doesn't have any SemVer obligations, but in the interest of providing a smooth transition for users, perhaps we wait a release or two before revising it?
If this is the course we take, we should update its deprecation message to something like this:
#[deprecated(since = "0.8.0", note = "renamed to `FromBytes::read_from_bytes`. Migrate immediately. This item will soon be re-used for a different routine.")]
Not sure what to call the _from_prefix variant, since read_from_prefix is already used, too (except for a visible, not deprecated function).
A few considerations:
- I don't mind
read_from_io_reador similar. It's weird, but also consistent with our naming convention ofread_from_<place> - I don't think we need prefix/suffix variants since a reader vends single bytes at a time, and so it can always give us exactly
size_of::<Self>()bytes - We will want a
read_from_with_elemsforSelf: ?Sized + KnownLayout - We may also want an analogous
IntoBytes::write_tomethod. In that case, perhaps we should rename our existingwrite_totowrite_to_bytesfor consistency?
- I don't think we need prefix/suffix variants since a reader vends single bytes at a time, and so it can always give us exactly
size_of::<Self>()bytes
There's an oddity about Read here in that it provides read_to_end so it's quite possible to distinguish between the read-all and read-prefix cases.
IMO it's fine to provide the minimal API and let users manually construct fancier use cases. All of this can be built safely using our existing primitives anyway.
- I don't mind
read_from_io_reador similar. It's weird, but also consistent with our naming convention ofread_from_<place>
Throwing out the option of read_from_stream, since the Read trait embodies a stream of bytes that you can just pull from instead of a discrete byte slice.
It would be helpful to consider TryFromBytes here too.
For example, I have this enum that represents compression:
#[repr(u8)]
pub enum Compression {
Stored,
Zlib,
BZip2,
LZMA1,
LZMA2,
}
Currently, I use FromRepr from strum to get an enum from the byte:
let compression = reader.read_u8()?;
header.compression = Compression::from_repr(compression)
.ok_or_else(|| eyre!("Unexpected compression value: {compression}"))?;
Now that we have TryFromBytes, I could use that instead. However, as I can't just zero the type and write data to it like you could do when using FromBytes, it requires me to read the data into a buffer, and then use TryFromBytes:
let mut buf = [0; size_of::<Compression>()];
reader.read_exact(&mut buf)?;
Compression::try_read_from_bytes(&buf)
This is now several lines every time I want to read an enum. It's difficult to make this generic as the below function fails to compile with constant expression depends on a generic parameter (@joshlf pointed out #248 here). Using vec![] works but then it's not allocated on the stack.
fn enum_value<T: KnownLayout + TryFromBytes + Sized>(reader: &mut impl Read) -> Result<T> {
let mut buf = [0; size_of::<T>()];
reader.read_exact(&mut buf)?;
T::try_read_from_bytes(&buf)
}
I've also considered a macro, which does work, but it's not syntactically as nice as it could be.
macro_rules! enum_value {
($reader:expr, $ty:ty) => {{
let mut buf = [0; size_of::<$ty>()];
$reader.read_exact(&mut buf)?;
<$ty>::try_read_from_bytes(&buf)
}};
}
I've considered using try_transmute!() but it still requires me to read the correct size of the data and ultimately, it would be the easiest for me if there was a way to interpret bytes straight from the reader.
More prior art: The ReadBytesExt and WriteBytesExt traits from the byteorder crate as mentioned in https://github.com/google/zerocopy/issues/438#issuecomment-2451702061.
I find myself doing this recently to mimic read_from_io:
pub fn try_read_from_io<R>(mut src: R) -> io::Result<Self>
where
Self: Sized,
R: io::Read,
{
let mut buf = [0; size_of::<Self>()];
src.read_exact(&mut buf)?;
Self::try_read_from_bytes(&buf)
.map_err(|err| io::Error::new(io::ErrorKind::InvalidData, err.to_string()))
}
It has to be implemented for every struct I want to use it for though so I'd love to see a try_read_from_io function (or try_read_from if the names are going that way in 0.9).
I've put up #2619, which is a draft implementation of TryFromBytes::try_read_from_io.