ouroboros icon indicating copy to clipboard operation
ouroboros copied to clipboard

Provide borrow_FIELD_mut as an unsafe method

Open SOF3 opened this issue 2 years ago • 10 comments

Is it possible to provide borrow_FIELD_mut, provided sufficient documentation on its memory safety constraints? The current documentation does not explain under what circumstances would it be unsound to return the mutable reference directly.

SOF3 avatar Jul 16 '22 15:07 SOF3

@joshua-maros Maybe you can provide a bit deeper explaination on this unsoundness? I imagine one of the problems is swapping the contents of mutable reference with something else, however maybe this can be done by returning Pin<&mut T>?

peku33 avatar Aug 06 '23 12:08 peku33

Let's say I have a field data: &'this str. The point of a function returning &mut &'this str would be to replace the inner reference with some other reference. But, it is impossible to express the'this lifetime to Rust's borrow checker. The solution employed internally by Ouroboros is to replace 'this with 'static and provide safe wrappers that abstract away this detail. If this were to be exposed, it would be easy to create a reference to dropped data:

#[self_referencing]
struct SelfRef {
    base: String,
    data: &'this str,
}
fn invalid_static_ref() -> &'static str {
    let self_ref = SelfRef::new("Drop Me".to_owned(), |base_ref| base_ref);
    *self_ref.borrow_data_mut()
}

So, maybe we should employ the trick that makes borrow_FIELD work, replacing the 'this lifetime with the lifetime of the self reference used in the method call. But since mutable references are invariant, this would also be unsound:

fn print_invalid_ref() {
    let self_ref = SelfRef::new("Unused".to_owned(), |base_ref| base_ref);
    let new_str = "This will be dropped too early".to_owned();
    'temp: {
        let data: &mut &'temp str = self_ref.borrow_data_mut();
        // Valid, new_str outlives 'temp.
        *data = &new_str;
    }
    drop(new_str);
    println!("{:?}", self_ref.get_data());
}

In general, the property of "invariance" that mutable references have dictates that any lifetimes that appear inside them must be exactly unchanged from their original values. Since Ouroboros is designed to fake a lifetime that Rust cannot express, it is impossible to return a mutable reference referring to types with the appropriate lifetimes. Given this limitation, it is my understanding that the with_mut and with_FIELD_mut methods cover all safe use cases.

someguynamedjosh avatar Aug 06 '23 14:08 someguynamedjosh

Consider code like this:

struct BorrowedStringTable<'b> {
    /// The arena to store the characters in.
    arena: &'b bumpalo::Bump,

    /// Used to have efficient lookup by index, and to provide an
    /// [Iterator] over the strings.
    vec: Vec<&'b str>,

    /// Used to have efficient lookup by [&str].
    map: HashMap<&'b str, size_t>,
}

#[self_referencing]
struct StringTableCell {
    owner: bumpalo::Bump,

    #[borrows(owner)]
    #[covariant]
    dependent: BorrowedStringTable<'this>,
}

I want to write a clear() method, which basically calls dependent.map.clear(), dependent.vec.clear(), and owner.reset(). I know it's unsafe, but it's sound, at least I think it is. I would need a &mut Bump, and there's no API to do it.

I think the issue is basically asking... can we make something available for these cases, with some documentation about what can or cannot be done?

morrisonlevi avatar Sep 25 '23 23:09 morrisonlevi

There's a much better way to do what you're asking, assuming you've defined a user-friendly constructor called new_empty:

impl StringTableCell {
    pub fn clear(&mut self) {
        *self = Self::new_empty();
    }
}

It also safeguards against adding future things and forgetting to clear them in the clear method.

someguynamedjosh avatar Sep 26 '23 00:09 someguynamedjosh

This is not viable for me*. I need to do more work than was posted here after the resets have occurred, so I can't use an empty arena. This means at least one new allocation which definitely goes against the purpose of why I'm adding a clear method in the first place.

* I mean, I could make it work. Coding is all about trade-offs. I'd just like to eat my cake and still have it too.

morrisonlevi avatar Sep 27 '23 04:09 morrisonlevi

If you need to keep the arena, you could do it with something like this (which is effectively what would be required in vanilla Rust to do something similar):

impl StringTableCell {
    pub fn cleared(self) -> Self {
        let mut arena = self.into_heads().owner;
        arena.clear();
        Self::new(arena, BorrowedStringTable::new)
    }
}

If this is still not acceptable, you would need to use unsafe code, even in the equivalent vanilla Rust scenario. If you really need what you're saying, I'm worried how the surrounding code is structured that that one allocation is a significant performance hit. It feels like being concerned about bounds checking - the vast majority of the time, it's not actually a serious problem, and trying to solve it by diving into unsafe code can end up putting your application in a worse spot than just accepting the tiny performance hit. And if the performance hit isn't tiny, there's probably something you can do to fix that.

My main concern with making a borrow_FIELD_mut method is that it lets you mess up the internals of the struct in any way that regular unsafe code can. It's effectively a switch that says "even though I'm using this library, I'd like to erase all guarantees the library provides me." You would have to keep track of as many guarantees as if you wrote your own unsafe wrapper from scratch. Providing a method connotates that there's something the library is taking care of for you, when in reality it's providing you total control over the internals of the struct. If you need total control, there's not really a point to using an existing library.

someguynamedjosh avatar Sep 28 '23 01:09 someguynamedjosh

My original motivation for borrow_FIELD_mut was to implement something similar to lock_api::ArcMutexGuard:

#[self_referencing]
struct MappedArcMutexGuard<T, U> {
    owned: Arc<Mutex<T>>,
    #[borrows(owned)]
    guard: MappedMutexGuard<'this, RawLock, U>,
}

In this case, guard is only meaningful to use when it is borrowed mutably.

It seems the only soundness issue mentioned above is when I do something like this:

let guard = MappedArcMutexGuard::new(
    arc.clone(),
    |owned| MutexGuard::map(owned.lock(), |t| U::from(t)),
);
'outer: {
    let new = some_other_mapped_mutex_guard();
    'inner: {
        let guard_inner = guard.borrow_guard_mut();
        *guard_inner = new;
    }
}
guard.borrow_guard() // dangling reference to `new`

But trying to swap the referenced object here seems really uncommon.

SOF3 avatar Sep 28 '23 05:09 SOF3

Is with_FIELD_mut not acceptable for your use case?

someguynamedjosh avatar Sep 28 '23 05:09 someguynamedjosh

The issue was created one year ago and I have refactored my code to avoid using lock_arc since then, so I can't remember the details of my original use case, but I suppose there were indeed difficulties with using the with style.

SOF3 avatar Sep 28 '23 06:09 SOF3

ArcMutexGuard implements DerefMut, which is, if I understand correctly, currently impossible to replicate with the current API (safe or unsafe). Having an unsafe (but provably sound depending on the types involved) way to reach API-parity could be beneficial.

gyscos avatar Dec 06 '23 18:12 gyscos