IntoCtx/TryIntoCtx on borrowed values
The trait signatures are:
/// Writes `Self` into `This` using the context `Ctx`
pub trait IntoCtx<Ctx: Copy = (), This: ?Sized = [u8]>: Sized {
fn into_ctx(self, &mut This, ctx: Ctx);
}
/// Tries to write `Self` into `This` using the context `Ctx`
pub trait TryIntoCtx<Ctx: Copy = (), This: ?Sized = [u8]>: Sized {
type Error;
fn try_into_ctx(self, &mut This, ctx: Ctx) -> Result<usize, Self::Error>;
}
These traits mirror Into which consumes self. However, these might be the wrong function signatures for how scroll is used in practice; both traits are primarily used on borrowed values.
scroll implements IntoCtx and TryIntoCtx on borrowed primitive types, suggesting that [Try]IntoCtx needn't consume its value after all.
scroll_derive generates code implementing IntoCtx and TryIntoCtx on Self by borrowing and delegating to the implementation on &'a Self.
Should these traits take &self instead of self?
So I considered this a while ago since we don’t technically consume the object, but I made it take an owned value because it seemed “right”.
That being said are there any downsides to switching to a non owned value ? I don’t think we can do very many optimizations unless it’s a copy value, but then we don’t need the owned value anyway?
Anyone else have thoughts ? (Besides this being another breaking change, willglynn is busting up scroll 🤣)
For the record I’m pretty much ok with this I think, unless I’m missing something. This will be a pretty annoying refactor though in goblin. I’ll also see how it affects faerie (which is the primary consumer of writing goblin structs), but I even recall remembering having to clone because of scroll (eg I wrote the same thing multiple times)
@luser @philipc anyone else thoughts ?
An alternative to note is to add another tryinto method but I suppose we can always do that after (and add another pwrite method); but we can’t do the reverse after the crate is 1.0
I've pondered this a bit since my initial writeup, and I can now distill my argument:
- The benefit of
Intoconsumingselfis thatinto()can move fields from the old type the new type. [Try]IntoCtxalways goes into a&mut [u8]– a mutable reference to a caller-provided slice – so even in the ideal case when the originating type was a[u8]we'd need to copy it into the caller's slice.- Since this trait must always make a copy, it should take
&selfinstead of consumingself.
The breakage of this change is worth considering, but it doesn't seem like it'll be too bad. scroll_derive changes do most of the work, so it's just adding a & to whichever pwrite() calls the compiler doesn't like. I found many pwrite()s already writing borrowed values since that works right now, so it wasn't even every call needing to get touched.
scrollimplementsIntoCtxandTryIntoCtxon borrowed primitive types, suggesting that[Try]IntoCtxneedn't consume its value after all.
I added those in https://github.com/m4b/scroll/pull/35 in service of https://github.com/m4b/scroll_derive/pull/17. I think this is another one of those design decisions that makes sense when your use cases are "structs full of primitive types" where everything is Copy, but it stops working well when you have non-Copy types. I ran into this in my minidump crate because I removed #[derive(Copy)] from a bunch of structs since they were actually fairly large and it didn't feel right.
The generated TryIntoCtx impls in scroll_derive don't deconstruct the self argument anyway, they just use it field-by-field, so I don't know that this will effectively make a difference. It should be easy enough to build the example sources before and after making this change and run them through something like cargo asm to see what the generated code looks like.
I agree that these traits should take &self, for the reasons that @willglynn gives. It should only consume self if there are benefits from reusing parts of self for the conversion, but writing to &mut [u8] never does that.
I'm working on a PR and hit this:
error: borrow of packed field is unsafe and requires unsafe function or block (error E0133)
--> tests/api.rs:263:27
|
263 | bytes.cwrite_with(&self.foo, 0, ctx);
| ^^^^^^^^^
|
note: lint level defined here
--> tests/api.rs:5:9
|
5 | #![deny(safe_packed_borrows)]
| ^^^^^^^^^^^^^^^^^^^
= warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
= note: for more information, see issue #46043 <https://github.com/rust-lang/rust/issues/46043>
= note: fields of packed structs might be misaligned: dereferencing a misaligned pointer or even just creating a misaligned reference is undefined behavior
Context:
#[repr(packed)]
struct Bar {
foo: i32,
bar: u32,
}
// …
impl scroll::ctx::IntoCtx<scroll::Endian> for Bar {
fn into_ctx(self, bytes: &mut [u8], ctx: scroll::Endian) {
use scroll::Cwrite;
bytes.cwrite_with(&self.foo, 0, ctx);
bytes.cwrite_with(&self.bar, 4, ctx);
}
}
Like IntoCtx, my local Cwrite trait now takes &self. Both traits no longer copy while writing, and borrowing potentially-misaligned packed fields isn't safe, so there's a new tension here.
In this case, the #[repr(packed)] struct will have the same structure whether it's packed or not, so we could remove the annotation. In other cases, the user might need to Copy their fields for serialization:
impl scroll::ctx::IntoCtx<scroll::Endian> for Bar {
fn into_ctx(&self, bytes: &mut [u8], ctx: scroll::Endian) {
use scroll::Cwrite;
let Self{foo, bar} = *self;
bytes.cwrite_with(&foo, 0, ctx);
bytes.cwrite_with(&bar, 4, ctx);
}
}
Backing up a bit more: how should #[repr(packed)] and scroll fit together?
My thoughts: we're working with bytes a field at a time, so the layout of the Rust data structure doesn't affect the binary input/output. If you're on a platform that can't do unaligned access, it seems like we'd want to handle that during the serialization step. This makes me think the Rust struct should always have optimal layout (i.e. not be #[repr(packed)]), pushing all non-aligned accesses into scroll's read/write operations which already support that.
I agree that #[repr(packed)] is unnecessary for scroll usage and we should advise against it. I was originally using packed and transmuting bytes in my minidump crate, but I removed that and switched everything to scroll and it's been working well. Since I changed the scroll_derive SizeWith impl to call itself recursively for every struct member it should produce the right size as if the struct was packed.
Hypothesis: #[repr(packed)] is only useful with Copy types.
#[derive] imposes such a requirement on the struct itself:
#[derive(Debug,PartialEq,Hash)]
#[repr(packed)]
struct Bar {
foo: i32,
bar: u32,
}
error: #[derive] can't be used on a #[repr(packed)] struct that does not derive Copy (error E0133)
--> tests/api.rs:209:10
|
209 | #[derive(Debug,PartialEq,Hash)]
| ^^^^^
|
error: #[derive] can't be used on a #[repr(packed)] struct that does not derive Copy (error E0133)
--> tests/api.rs:209:16
|
209 | #[derive(Debug,PartialEq,Hash)]
| ^^^^^^^^^
|
error: #[derive] can't be used on a #[repr(packed)] struct that does not derive Copy (error E0133)
--> tests/api.rs:209:26
|
209 | #[derive(Debug,PartialEq,Hash)]
| ^^^^
|
= warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
= note: for more information, see issue #46043 <https://github.com/rust-lang/rust/issues/46043>
If we're willing to require Copy on the struct fields, then there's a clear way forward. We need to copy each fields out of the #[repr(packed)] struct so we can safely & the copies. Turns out we need only some curly braces:
impl scroll::ctx::IntoCtx<scroll::Endian> for Bar {
fn into_ctx(&self, bytes: &mut [u8], ctx: scroll::Endian) {
use scroll::Cwrite;
bytes.cwrite_with(&{self.foo}, 0, ctx); // look ma, anonymous locals!
bytes.cwrite_with(&{self.bar}, 4, ctx);
}
}
This is easy enough to handle in scroll_derive – check if the struct is #[repr(packed)], emit extra braces if so. This leaves us three cases for #[derive(Pwrite,IOwrite)]:
- Normal structs get each fields written by reference. ✅
#[repr(packed)]structs withCopyfields get each field copied as we write them. ✅#[repr(packed)]structs with non-Copyfields fail to compile. ❌ These types already complain when you try to#[derive]anything else, so… maybe that's fine? ❓
I would love to produce specialized error messages for this third case, but we can't know if the field types are all Copy within the macro invocation. (Primitives, sure, but in general…) I guess we could emit type bounds into the derived code? impl IntoCtx for Bar where i32: Copy, u32: Copy? Eh… I'm inclined to leave this unhandled and just let #[derive(Pwrite,IOwrite)] fail.