miri icon indicating copy to clipboard operation
miri copied to clipboard

Can/should miri ensure that the addresses of consts/vtables/... are unstable?

Open saethlin opened this issue 3 years ago • 17 comments

In the community Discord, someone recently found a bug in their code (or idea?) because the address of a const value wasn't stable in miri, but was stable in normal execution.

As demonstration, someone posted this example:

const FOO: &str = "";
fn main() {
    println!("{:p}", FOO);
    println!("{:p}", FOO);
}

Which on the playground currently prints

0x25ea8
0x2d638

But then OP responded with this example:

const FOO: &&str = &"";
fn main() {
    println!("{:p}", *FOO);
    println!("{:p}", *FOO);
    println!("{:p}", *FOO);
}

Which on the playground currently prints

0x25e96
0x25e96
0x25e96

This seems less-than-awesome. Is there something that miri can do to make sure that addresses of consts aren't accidentally stable across instantiations? Or at least, are very unlikely to be stable?

saethlin avatar Jan 04 '22 00:01 saethlin

The code that was found was https://github.com/LightningCreations/lccc/blob/6377fff74e4de2b98eac1ea78db60baf87d75cbb/xlang/xlang_abi/src/string.rs#L380 (in my project, but not my code directly, credits in the comment).

I'd note that it seems like a potentially huge footgun that rust does not guarantee this, especially since, coming from C++, I wouldn't expect consts to act like basically typechecked macros (although, I'd admit this isn't the place for that discussion - and IRLO thread, or a message in the t-lang or wg-unsafe-code-guidelines stream in zulip would be better).

chorman0773 avatar Jan 04 '22 01:01 chorman0773

This seems related to https://github.com/rust-lang/miri/issues/131.

I'd note that it seems like a potentially huge footgun that rust does not guarantee this, especially since, coming from C++, I wouldn't expect consts to act like basically typechecked macros (although, I'd admit this isn't the place for that discussion - and IRLO thread, or a message in the t-lang or wg-unsafe-code-guidelines stream in zulip would be better).

This is somewhat documented at https://github.com/rust-lang/const-eval/blob/master/const.md, but I agree it is subtle. The upshot is that if you care about address identity, you have to use static.

RalfJung avatar Jan 04 '22 09:01 RalfJung

The upshot is that if you care about address identity, you have to use static.

Sometimes you can't do that, though. My code was a macro intended to be used for intializating values of type StringView<'static> (a type within xlang abi that provides an abi-stable version of &str) in constant contexts, including for consts and inside const fns, which a static cannot be used to achieve. The version of the macro linked above (which was previously necessary, as it was compatible with rust 1.39, a previous MSRV), suffered from this bug, which I couldn't solve until it was pointed out that the uses of __STR wasn't guaranteed to produce the same value, because I had expected it to do so.

chorman0773 avatar Jan 04 '22 22:01 chorman0773

Are you saying you want the language changed to accommodate your implementation? I'm very confused

saethlin avatar Jan 04 '22 23:01 saethlin

The upshot is that if you care about address identity, you have to use static.

If the const is generic it is even impossible to guarantee address identity as two crates may instantiate it with the same address. Also note that even functions and vtables don't have a guaranteed identity. Vtables are always duplicated per cgu and functions may be duplicated and we tell LLVM that we don't care about address identity for functions.

bjorn3 avatar Jan 05 '22 08:01 bjorn3

For the record, this is how we handle having a constant of "significant address" in rustc, via static. It's not pretty, and it does require limiting the alignment since the static can only guarantee so much.

But we can't really guarantee much more without limiting platform support. I kept going back and forth given anecdotal evidence from people with various levels of experience/knowledge about linker-based deduplication, but what finally convinced me is libcxx's RTTI (typeinfo) implementation.

That is, being able to guarantee linker-based deduplication 100% of the time (e.g. by encoding the data in the symbol, though we'd have to use hashes for really large ones, and then having the linker deduplicate by symbol) is isomorphic to libcxx being able to use the Unique RTTI implementation on all platforms Rust supports.

This is also, FWIW, the same reason we don't offer anything resembling generic/associated statics. It would be nice if someone could fix all the tooling out there to handle it for both build time and runtime (i.e. guaranteeing deduplication for dynamic loading, through dynamic linkers), but until then, we can't.

eddyb avatar Jul 22 '22 14:07 eddyb

Right, so this issue is basically the opposite -- in Miri, each const and vtable has a unique address, and it is non-trivial to change that. ;) But we probably do want to change it to get more test coverage.

RalfJung avatar Jul 22 '22 15:07 RalfJung

Right, so this issue is basically the opposite -- in Miri, each const and vtable has a unique address, and it is non-trivial to change that.

Oh right, I did misunderstand the original issue description as saying the opposite, and just assumed a significant refactor had already happened.


Given we don't want to guarantee either == or !=, one potential (opt-in?) approach here is to make every e.g. Operand::Const use the RNG to decide whether to reuse an existing address (and if so which?) - theoretically this could even recurse and keep RNG for every reference to some non-static constant data.

(or to sound less sill: make every use of a constant reference have non-deterministic ==/!= with any other references to the same data, and then resolve that non-determinism with the RNG)

eddyb avatar Jul 22 '22 16:07 eddyb

Yeah, something like that. Though the nondet needs to happen when the & is evaluated, not when the == is evaluated. Or are you saying the following function might sometimes return false?

fn check<T>(x: *const T, y: *const T) -> bool {
  (x == y) == (x == y)
}

RalfJung avatar Jul 22 '22 16:07 RalfJung

Though the nondet needs to happen when the & is evaluated, not when the == is evaluated.

I agree, I was just struggling to describe it precisely enough (that's why I mentioned Operand::Const fwiw).

eddyb avatar Jul 22 '22 16:07 eddyb

In the original testing, whether you get a unique address or not depends on what level you evaluate the const to. In the case of an const FOO: &str; you get unique addresses each time you evaluate it, in the case of a const FOO: &&str; you get the same address every time you evaluate *FOO, but this is not guaranteed (and the original idea in our discussion was that you shouldn't get the same address, to make testing code that assumes it gives the same address easier).

Nemo157 avatar Jul 22 '22 16:07 Nemo157

FWIW, right now Miri makes the address of FOO unique inside a function invocation, but potentially non-unique across different functions or different instantiations of the same function. I am not sure if that is the right choice, maybe it should be even less unique?

https://github.com/rust-lang/miri/issues/1955 will make vtables and everything else non-unique for each time they get turned into a pointer, so consts are the only things where we still have some kind of uniqueness. (We certainly don't guarantee that kind of per-fn-invocation uniqueness today though, this is just a Miri implementation choice.)

RalfJung avatar Aug 06 '24 17:08 RalfJung

TBH, I'd prefer that const items that contain references/pointers have unique address for those items, IE. a const item should always evaluate to an identical expression*, not just an equal one.

I was bitten by the lack of uniqueness when writing a macro producing a (start,end) pair of pointers from a string literal at compile time (the macro itself has been changed, but the original issue was quite annoying).

*Associated consts in traits and generic inherent impl blocks pose an issue for this, but free constants and associated constants in non-generic impl blocks do not.

chorman0773 avatar Aug 06 '24 17:08 chorman0773

Unfortunately guaranteeing uniqueness requires a deduplicating linker, which AFAIK is not available in all platforms (Windows?) and all configurations (dynamic linking?).

Anyway, if we want to discuss guarantees we should move that to a UCG issue. (Sorry for leading this issue down the wrong track.)

RalfJung avatar Aug 06 '24 17:08 RalfJung

Right, I got mixed up because I don't usually get notifs from miri issues :sweat_smile:

chorman0773 avatar Aug 06 '24 18:08 chorman0773

Here's the UCG issue to discuss what we should guarantee if we want to guarantee more than nothing: https://github.com/rust-lang/unsafe-code-guidelines/issues/522.

EDIT: Ah, you just found it. :)

RalfJung avatar Aug 06 '24 18:08 RalfJung

While https://github.com/rust-lang/rust/pull/128742 does overall make things less stable, it also makes it so that when a function pointer is created in CTFE that now always produces the same result, so those function pointers are now actually more stable. Creating a full new allocation for each CTFE-created function pointer does seem excessive though (that's what we did before that PR) and using an RNG is tricky given the compiler's determinism requirements, so... not sure what the best solution here is.

RalfJung avatar Aug 09 '24 16:08 RalfJung