arbitrary
Consider supporting owned `Unstructured`s
I've been talking with some folks who would like to pass an `Unstructured` around through Cranelift to implement a "chaos mode" that performs semantics-preserving random mutations, such as shuffling basic blocks, but who don't want to thread `Unstructured`'s lifetime through the whole code base. They would instead like an owned version of `Unstructured` that they can attach to existing context structures.
We could do this with a wrapper struct and a self-borrow, but I would like to make sure it is UB-free and lives somewhere it can be reused by anyone else with similar needs:
use arbitrary::Unstructured;
use std::mem;

struct OwnedUnstructured {
    bytes: Vec<u8>,
    // self-borrow of `bytes`; needs Miri to tell us
    // if we need `ManuallyDrop`s in here and all that
    u: Unstructured<'static>,
}
impl OwnedUnstructured {
    // Shorten the fake `'static` lifetime to the borrow of `self`.
    pub fn u<'a>(&'a mut self) -> &'a mut Unstructured<'a> {
        unsafe { mem::transmute(&mut self.u) }
    }
}
// Unfortunately, `Deref[Mut]` doesn't work because we need to
// constrain the `Unstructured`'s lifetime to the `self` borrow
// but we can't do that without GATs in the `Deref` trait.
impl Deref for OwnedUnstructured {
    type Target = Unstructured<'what_lifetime_to_put_here>;
    // ...
}
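For reference, the shape that `Deref` can't express is a "lending" trait whose target type is parameterized by the borrow of `self`. Below is a rough sketch of that shape using GATs, written against a std-only stand-in rather than the real `Unstructured`. `LendMut`, `Cursor`, and `OwnedCursor` are hypothetical names made up for illustration and are not part of `arbitrary`. The stand-in keeps its position in the owner, which the real `Unstructured` does not do today.

```rust
/// Hypothetical stand-in for `Unstructured<'a>`: a borrowed view over the
/// owner's bytes that writes its position back into the owner.
pub struct Cursor<'a> {
    data: &'a [u8],
    pos: &'a mut usize,
}

impl<'a> Cursor<'a> {
    /// Take up to `n` bytes from the front. The returned slice borrows the
    /// underlying data for `'a`, not the cursor itself.
    pub fn take(&mut self, n: usize) -> &'a [u8] {
        let data: &'a [u8] = self.data;
        let start = (*self.pos).min(data.len());
        let end = start.saturating_add(n).min(data.len());
        *self.pos = end;
        &data[start..end]
    }
}

/// Hypothetical GAT-based analogue of `DerefMut`: the target type may
/// mention the lifetime of the `self` borrow.
pub trait LendMut {
    type Target<'a>
    where
        Self: 'a;

    fn lend_mut<'a>(&'a mut self) -> Self::Target<'a>;
}

/// Hypothetical owned wrapper: owns the bytes plus the read position.
pub struct OwnedCursor {
    bytes: Vec<u8>,
    pos: usize,
}

impl OwnedCursor {
    pub fn new(bytes: Vec<u8>) -> Self {
        OwnedCursor { bytes, pos: 0 }
    }
}

impl LendMut for OwnedCursor {
    type Target<'a> = Cursor<'a>;

    fn lend_mut<'a>(&'a mut self) -> Cursor<'a> {
        // Disjoint field borrows: `bytes` shared, `pos` exclusive.
        Cursor { data: &self.bytes, pos: &mut self.pos }
    }
}
```

Because the view writes through `pos`, a second `lend_mut()` call resumes where the first left off, with no `unsafe` and no self-borrow. Whether `Unstructured` itself could be restructured along these lines is an open question.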
Another approach could be to have a `Cow<'a, [u8]>` in `Unstructured` itself. With owned bytes you'd end up with an `Unstructured<'static>`, though, and the way `Arbitrary` is defined, that would let you create arbitrary `&'static [u8]`s that actually point into the owned buffer, which is not correct.
I'm not even sure this is the right approach; it might be better to do something like:
use arbitrary::Unstructured;
use std::ops::{Deref, DerefMut};

struct OwnedUnstructured {
    bytes: Vec<u8>,
    cursor: usize,
}
impl OwnedUnstructured {
    // Get a non-owned `Unstructured` of these bytes.
    pub fn u<'a>(&'a mut self) -> impl DerefMut<Target = Unstructured<'a>> {
        struct U<'a> {
            cursor: &'a mut usize,
            initial_len: usize,
            u: Unstructured<'a>,
        }
        impl Drop for U<'_> {
            fn drop(&mut self) {
                // Advance cursor by number of bytes consumed. This is buggy
                // because it assumes bytes are only taken from the front of
                // the input, never from the back, which is not true. Don't
                // know how to work around this without having a double-borrow
                // of `self.bytes`.
                *self.cursor += self.initial_len - self.u.len();
            }
        }
        impl<'a> Deref for U<'a> {
            type Target = Unstructured<'a>;
            fn deref(&self) -> &Self::Target { &self.u }
        }
        impl DerefMut for U<'_> {
            fn deref_mut(&mut self) -> &mut Self::Target { &mut self.u }
        }
        let u = Unstructured::new(&self.bytes[self.cursor..]);
        U {
            cursor: &mut self.cursor,
            initial_len: u.len(),
            u,
        }
    }
}
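To make the bug mentioned in the `Drop` comment concrete, here is a small std-only sketch. `TwoEnded`, `naive_window`, and `tracked_window` are hypothetical names for illustration; `TwoEnded` is a stand-in reader that can consume from either end, not the real `Unstructured`. It compares the `initial_len - len()` bookkeeping with tracking the front and back separately.

```rust
use std::ops::Range;

/// Hypothetical stand-in for a reader that may consume bytes from either
/// end of its input and remembers how much it took from each end.
pub struct TwoEnded<'a> {
    data: &'a [u8],
    front_taken: usize,
    back_taken: usize,
}

impl<'a> TwoEnded<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        TwoEnded { data, front_taken: 0, back_taken: 0 }
    }

    /// Number of bytes not yet consumed.
    pub fn len(&self) -> usize {
        self.data.len()
    }

    /// Consume up to `n` bytes from the front.
    pub fn take_front(&mut self, n: usize) -> &'a [u8] {
        let data: &'a [u8] = self.data;
        let (head, tail) = data.split_at(n.min(data.len()));
        self.data = tail;
        self.front_taken += head.len();
        head
    }

    /// Consume up to `n` bytes from the back.
    pub fn take_back(&mut self, n: usize) -> &'a [u8] {
        let data: &'a [u8] = self.data;
        let (head, tail) = data.split_at(data.len() - n.min(data.len()));
        self.data = head;
        self.back_taken += tail.len();
        tail
    }
}

/// Bookkeeping as in the `Drop` impl above: everything consumed is
/// attributed to the front of the window.
pub fn naive_window(window: Range<usize>, initial_len: usize, remaining: usize) -> Range<usize> {
    (window.start + (initial_len - remaining))..window.end
}

/// Bookkeeping that shrinks the window from both ends.
pub fn tracked_window(window: Range<usize>, reader: &TwoEnded<'_>) -> Range<usize> {
    (window.start + reader.front_taken)..(window.end - reader.back_taken)
}
```

With six input bytes, taking two from the front and one from the back leaves indices `2..5` unconsumed. The naive version instead yields `3..6`, which skips an unread byte and re-exposes one that was already consumed. Getting the per-end counts out of the real `Unstructured` would need some new accessor, which is an assumption here, not something it provides today.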
Just brainstorming at this point.
Thoughts? Ideas?
cc @cfallin