rust
rust copied to clipboard
Implicit calls to `deref` and `deref_mut` are a footgun with raw pointers
I tried this code:
use std::ops::Deref;
use std::mem::MaybeUninit;
struct A{
b: MaybeUninit<B>,
}
struct B{
field: String,
}
impl Deref for A {
type Target = B;
fn deref(&self)->&B{
println!("Called deref!");
// SAFETY: Valid instances of A have b initialized.
unsafe {
self.b.assume_init_ref()
}
}
}
fn main() {
let mut a: MaybeUninit<A> = MaybeUninit::uninit();
unsafe {
let p = a.as_ptr();
// The only way to get pointer to field of pointee.
let b_ptr = &raw const (*p).b;
// However, it may cause calls to deref if we are not vigilant
// and mistakenly type name of a field of a type our pointee dereferences to.
let field_ptr = &raw const (*p).field;
}
}
I expected to see this happen: this code should be rejected because it causes implicit calls to Deref::deref which assumes *p to be initialized.
Instead, this happened: code compiles and prints Called deref! when executed.
Meta
rustc --version --verbose:
rustc 1.84.0-nightly (9322d183f 2024-10-14)
binary: rustc
commit-hash: 9322d183f45e0fd5a509820874cc5ff27744a479
commit-date: 2024-10-14
host: x86_64-unknown-linux-gnu
release: 1.84.0-nightly
LLVM version: 19.1.1
I think, it should call derefs only this way:
(**p) <-- Explicitly dereference object again.
(*).deref() <-- Explicitly call deref method.
While an unfortunate footgun, this is not a soundness issue. Removing the unsafe shows that only the first line is (correctly) exempt from needing an unsafe block, while the second line does require one.
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=7545201409c69d43e6b2d9700b04aef3
It would make sense to have a lint for this still (clippy or rustc).
Might get caught by https://github.com/rust-lang/rust/pull/123239, haven't checked.
This is not unsound, so I'm gonna untag it as such.
@rustbot labels -C-bug +C-discussion
I would argue you asked for trouble by writing that Deref instance. That instance is unsound, it can be used to cause UB in safe code:
fn main() {
let a = A { b: MaybeUninit::uninit() };
let val = &a.field;
println!("{val}");
}
So, the fix is to not write such Deref instances. I am not convinced a lint is justified. (Well, ideally we'd have a lint against unsound code, but alas, that's not possible. ;)
because it causes implicit calls to Deref::deref which assumes *p to be initialized.
deref is not allowed to make such assumptions. It is a safe function, and as such must not make any assumptions beyond being given a well-typed argument.
Cc @rust-lang/opsem
@RalfJung The problem (AFAIK) is not that Rust is unsound. Rust is behaving correctly. The problem is that it's too easy to make mistakes.
Making a safe Deref is fine if the type has an invariant that it's always initialized, but private code may break this invariant temporarily, and it is too easy to invoke Deref by mistake then. Ultimately, we should be careful when writing unsafe code, and the language could assist with that.
@RalfJung
I would argue you asked for trouble by writing that Deref instance. That instance is unsound, it can be used to cause UB in safe code:
Your example is possible only inside module that declares A because field b is private. If all ways to acquire instance of A outside of module guarantee that b is initialized, the code is sound.
By your logic, Vec::deref (which calls Vec::as_slice) is unsound because I can write code like this inside alloc::vec module:
let my_vec: Vec<i32> = Vec { len: 200, buf: RawVec::new() };
let slice: &[i32] = &*my_vec;
However, we actually know that deref implementation of Vec is sound because we know that fields len and buf cannot be modified outside of alloc::vec module and that we write only valid values inside the module.
Similarly, A::deref in my example expects to be called only with valid instances. I didn't include all the code for modules and making everything private in the example because we use minimal examples for bug reports.
@ChayimFriedman2 Argues same thing as me, but more eloquently.
Yes we use minimal examples, but this was too minimal. ;) The safety invariant and abstraction barrier are a key part of the argument here. Given the example has neither and there were no comments to indicate anything like that, it is reasonable to assume there is no such invariant. (Well, there is a SAFETY comment, but it didn't explain why this invariant is supposed to hold, given that the code clearly breaks it. Vec is inside a carefully designed mod with carefully chosen pub functions. You reduced all that away which then makes it not the same situation as Vec at all.)
Thanks for clarifying!
Would be good to add a comment in main explaining that this is meant to model code inside the library and hence is allowed to temporarily break the invariant.
I added comments.