ouroboros
ouroboros copied to clipboard
Provide borrow_FIELD_mut as an unsafe method
Is it possible to provide borrow_FIELD_mut
, provided sufficient documentation on its memory safety constraints? The current documentation does not explain under what circumstances would it be unsound to return the mutable reference directly.
@joshua-maros Maybe you can provide a bit deeper explaination on this unsoundness? I imagine one of the problems is swapping the contents of mutable reference with something else, however maybe this can be done by returning Pin<&mut T>
?
Let's say I have a field data: &'this str
. The point of a function returning &mut &'this str
would be to replace the inner reference with some other reference. But, it is impossible to express the'this
lifetime to Rust's borrow checker. The solution employed internally by Ouroboros is to replace 'this
with 'static
and provide safe wrappers that abstract away this detail. If this were to be exposed, it would be easy to create a reference to dropped data:
#[self_referencing]
struct SelfRef {
base: String,
data: &'this str,
}
fn invalid_static_ref() -> &'static str {
let self_ref = SelfRef::new("Drop Me".to_owned(), |base_ref| base_ref);
*self_ref.borrow_data_mut()
}
So, maybe we should employ the trick that makes borrow_FIELD
work, replacing the 'this
lifetime with the lifetime of the self
reference used in the method call. But since mutable references are invariant, this would also be unsound:
fn print_invalid_ref() {
let self_ref = SelfRef::new("Unused".to_owned(), |base_ref| base_ref);
let new_str = "This will be dropped too early".to_owned();
'temp: {
let data: &mut &'temp str = self_ref.borrow_data_mut();
// Valid, new_str outlives 'temp.
*data = &new_str;
}
drop(new_str);
println!("{:?}", self_ref.get_data());
}
In general, the property of "invariance" that mutable references have dictates that any lifetimes that appear inside them must be exactly unchanged from their original values. Since Ouroboros is designed to fake a lifetime that Rust cannot express, it is impossible to return a mutable reference referring to types with the appropriate lifetimes. Given this limitation, it is my understanding that the with_mut
and with_FIELD_mut
methods cover all safe use cases.
Consider code like this:
struct BorrowedStringTable<'b> {
/// The arena to store the characters in.
arena: &'b bumpalo::Bump,
/// Used to have efficient lookup by index, and to provide an
/// [Iterator] over the strings.
vec: Vec<&'b str>,
/// Used to have efficient lookup by [&str].
map: HashMap<&'b str, size_t>,
}
#[self_referencing]
struct StringTableCell {
owner: bumpalo::Bump,
#[borrows(owner)]
#[covariant]
dependent: BorrowedStringTable<'this>,
}
I want to write a clear()
method, which basically calls dependent.map.clear()
, dependent.vec.clear()
, and owner.reset()
. I know it's unsafe, but it's sound, at least I think it is. I would need a &mut Bump
, and there's no API to do it.
I think the issue is basically asking... can we make something available for these cases, with some documentation about what can or cannot be done?
There's a much better way to do what you're asking, assuming you've defined a user-friendly constructor called new_empty
:
impl StringTableCell {
pub fn clear(&mut self) {
*self = Self::new_empty();
}
}
It also safeguards against adding future things and forgetting to clear them in the clear method.
This is not viable for me*. I need to do more work than was posted here after the resets have occurred, so I can't use an empty arena. This means at least one new allocation which definitely goes against the purpose of why I'm adding a clear method in the first place.
* I mean, I could make it work. Coding is all about trade-offs. I'd just like to eat my cake and still have it too.
If you need to keep the arena, you could do it with something like this (which is effectively what would be required in vanilla Rust to do something similar):
impl StringTableCell {
pub fn cleared(self) -> Self {
let mut arena = self.into_heads().owner;
arena.clear();
Self::new(arena, BorrowedStringTable::new)
}
}
If this is still not acceptable, you would need to use unsafe code, even in the equivalent vanilla Rust scenario. If you really need what you're saying, I'm worried how the surrounding code is structured that that one allocation is a significant performance hit. It feels like being concerned about bounds checking - the vast majority of the time, it's not actually a serious problem, and trying to solve it by diving into unsafe code can end up putting your application in a worse spot than just accepting the tiny performance hit. And if the performance hit isn't tiny, there's probably something you can do to fix that.
My main concern with making a borrow_FIELD_mut
method is that it lets you mess up the internals of the struct in any way that regular unsafe code can. It's effectively a switch that says "even though I'm using this library, I'd like to erase all guarantees the library provides me." You would have to keep track of as many guarantees as if you wrote your own unsafe wrapper from scratch. Providing a method connotates that there's something the library is taking care of for you, when in reality it's providing you total control over the internals of the struct. If you need total control, there's not really a point to using an existing library.
My original motivation for borrow_FIELD_mut
was to implement something similar to lock_api::ArcMutexGuard
:
#[self_referencing]
struct MappedArcMutexGuard<T, U> {
owned: Arc<Mutex<T>>,
#[borrows(owned)]
guard: MappedMutexGuard<'this, RawLock, U>,
}
In this case, guard
is only meaningful to use when it is borrowed mutably.
It seems the only soundness issue mentioned above is when I do something like this:
let guard = MappedArcMutexGuard::new(
arc.clone(),
|owned| MutexGuard::map(owned.lock(), |t| U::from(t)),
);
'outer: {
let new = some_other_mapped_mutex_guard();
'inner: {
let guard_inner = guard.borrow_guard_mut();
*guard_inner = new;
}
}
guard.borrow_guard() // dangling reference to `new`
But trying to swap the referenced object here seems really uncommon.
Is with_FIELD_mut
not acceptable for your use case?
The issue was created one year ago and I have refactored my code to avoid using lock_arc
since then, so I can't remember the details of my original use case, but I suppose there were indeed difficulties with using the with
style.
ArcMutexGuard
implements DerefMut
, which is, if I understand correctly, currently impossible to replicate with the current API (safe or unsafe). Having an unsafe (but provably sound depending on the types involved) way to reach API-parity could be beneficial.