miri
miri copied to clipboard
Can/should miri ensure that the addresses of consts/vtables/... are unstable?
In the community Discord, someone recently found a bug in their code (or idea?) because the address of a const value wasn't stable in miri, but was stable in normal execution.
As demonstration, someone posted this example:
const FOO: &str = "";
fn main() {
println!("{:p}", FOO);
println!("{:p}", FOO);
}
Which on the playground currently prints
0x25ea8
0x2d638
But then OP responded with this example:
const FOO: &&str = &"";
fn main() {
println!("{:p}", *FOO);
println!("{:p}", *FOO);
println!("{:p}", *FOO);
}
Which on the playground currently prints
0x25e96
0x25e96
0x25e96
This seems less-than-awesome. Is there something that miri can do to make sure that addresses of consts aren't accidentally stable across instantiations? Or at least, are very unlikely to be stable?
The code that was found was https://github.com/LightningCreations/lccc/blob/6377fff74e4de2b98eac1ea78db60baf87d75cbb/xlang/xlang_abi/src/string.rs#L380 (in my project, but not my code directly, credits in the comment).
I'd note that it seems like a potentially huge footgun that rust does not guarantee this, especially since, coming from C++, I wouldn't expect const
s to act like basically typechecked macros (although, I'd admit this isn't the place for that discussion - and IRLO thread, or a message in the t-lang or wg-unsafe-code-guidelines stream in zulip would be better).
This seems related to https://github.com/rust-lang/miri/issues/131.
I'd note that it seems like a potentially huge footgun that rust does not guarantee this, especially since, coming from C++, I wouldn't expect consts to act like basically typechecked macros (although, I'd admit this isn't the place for that discussion - and IRLO thread, or a message in the t-lang or wg-unsafe-code-guidelines stream in zulip would be better).
This is somewhat documented at https://github.com/rust-lang/const-eval/blob/master/const.md, but I agree it is subtle. The upshot is that if you care about address identity, you have to use static
.
The upshot is that if you care about address identity, you have to use static.
Sometimes you can't do that, though. My code was a macro intended to be used for intializating values of type StringView<'static>
(a type within xlang abi that provides an abi-stable version of &str
) in constant contexts, including for consts
and inside const fn
s, which a static
cannot be used to achieve. The version of the macro linked above (which was previously necessary, as it was compatible with rust 1.39, a previous MSRV), suffered from this bug, which I couldn't solve until it was pointed out that the uses of __STR
wasn't guaranteed to produce the same value, because I had expected it to do so.
Are you saying you want the language changed to accommodate your implementation? I'm very confused
The upshot is that if you care about address identity, you have to use static.
If the const is generic it is even impossible to guarantee address identity as two crates may instantiate it with the same address. Also note that even functions and vtables don't have a guaranteed identity. Vtables are always duplicated per cgu and functions may be duplicated and we tell LLVM that we don't care about address identity for functions.
For the record, this is how we handle having a constant of "significant address" in rustc
, via static
.
It's not pretty, and it does require limiting the alignment since the static
can only guarantee so much.
But we can't really guarantee much more without limiting platform support.
I kept going back and forth given anecdotal evidence from people with various levels of experience/knowledge about linker-based deduplication, but what finally convinced me is libcxx
's RTTI (typeinfo
) implementation.
That is, being able to guarantee linker-based deduplication 100% of the time (e.g. by encoding the data in the symbol, though we'd have to use hashes for really large ones, and then having the linker deduplicate by symbol) is isomorphic to libcxx
being able to use the Unique
RTTI implementation on all platforms Rust supports.
This is also, FWIW, the same reason we don't offer anything resembling generic/associated static
s.
It would be nice if someone could fix all the tooling out there to handle it for both build time and runtime (i.e. guaranteeing deduplication for dynamic loading, through dynamic linkers), but until then, we can't.
Right, so this issue is basically the opposite -- in Miri, each const and vtable has a unique address, and it is non-trivial to change that. ;) But we probably do want to change it to get more test coverage.
Right, so this issue is basically the opposite -- in Miri, each const and vtable has a unique address, and it is non-trivial to change that.
Oh right, I did misunderstand the original issue description as saying the opposite, and just assumed a significant refactor had already happened.
Given we don't want to guarantee either ==
or !=
, one potential (opt-in?) approach here is to make every e.g. Operand::Const
use the RNG to decide whether to reuse an existing address (and if so which?) - theoretically this could even recurse and keep RNG for every reference to some non-static
constant data.
(or to sound less sill: make every use of a constant reference have non-deterministic ==
/!=
with any other references to the same data, and then resolve that non-determinism with the RNG)
Yeah, something like that. Though the nondet needs to happen when the &
is evaluated, not when the ==
is evaluated. Or are you saying the following function might sometimes return false
?
fn check<T>(x: *const T, y: *const T) -> bool {
(x == y) == (x == y)
}
Though the nondet needs to happen when the
&
is evaluated, not when the==
is evaluated.
I agree, I was just struggling to describe it precisely enough (that's why I mentioned Operand::Const
fwiw).
In the original testing, whether you get a unique address or not depends on what level you evaluate the const to. In the case of an const FOO: &str;
you get unique addresses each time you evaluate it, in the case of a const FOO: &&str;
you get the same address every time you evaluate *FOO
, but this is not guaranteed (and the original idea in our discussion was that you shouldn't get the same address, to make testing code that assumes it gives the same address easier).
FWIW, right now Miri makes the address of FOO
unique inside a function invocation, but potentially non-unique across different functions or different instantiations of the same function. I am not sure if that is the right choice, maybe it should be even less unique?
https://github.com/rust-lang/miri/issues/1955 will make vtables and everything else non-unique for each time they get turned into a pointer, so consts are the only things where we still have some kind of uniqueness. (We certainly don't guarantee that kind of per-fn-invocation uniqueness today though, this is just a Miri implementation choice.)
TBH, I'd prefer that const
items that contain references/pointers have unique address for those items, IE. a const
item should always evaluate to an identical expression*, not just an equal one.
I was bitten by the lack of uniqueness when writing a macro producing a (start,end)
pair of pointers from a string literal at compile time (the macro itself has been changed, but the original issue was quite annoying).
*Associated const
s in traits and generic inherent impl
blocks pose an issue for this, but free constants and associated constants in non-generic impl
blocks do not.
Unfortunately guaranteeing uniqueness requires a deduplicating linker, which AFAIK is not available in all platforms (Windows?) and all configurations (dynamic linking?).
Anyway, if we want to discuss guarantees we should move that to a UCG issue. (Sorry for leading this issue down the wrong track.)
Right, I got mixed up because I don't usually get notifs from miri issues :sweat_smile:
Here's the UCG issue to discuss what we should guarantee if we want to guarantee more than nothing: https://github.com/rust-lang/unsafe-code-guidelines/issues/522.
EDIT: Ah, you just found it. :)
While https://github.com/rust-lang/rust/pull/128742 does overall make things less stable, it also makes it so that when a function pointer is created in CTFE that now always produces the same result, so those function pointers are now actually more stable. Creating a full new allocation for each CTFE-created function pointer does seem excessive though (that's what we did before that PR) and using an RNG is tricky given the compiler's determinism requirements, so... not sure what the best solution here is.