intptrcast (with many different seeds) does not explore all possible executions

Open RalfJung opened this issue 6 years ago • 9 comments

The following assertion is currently guaranteed to pass in Miri, no matter the seed:

fn main() {
  let ptr1 = Box::into_raw(Box::new(0)) as usize;
  let ptr2 = Box::into_raw(Box::new(0)) as usize;
  assert!(ptr2 > ptr1);
}

This is because we basically have a "bump allocator" as our implementation of int-to-ptr casts. Ideally, when exploring many seeds, all possible allocation patterns should arise; but at least we could try to find a way to be a bit less predictable than we currently are, so that failing the assertion above becomes a possibility.
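To illustrate why the assertion can never fail under a bump scheme, here is a minimal sketch. `BumpAddresses` and its methods are hypothetical names made up for this example, not Miri's actual code; the only point is that every new address is derived from a cursor that only moves upward.

```rust
/// Hypothetical stand-in for bump-style address assignment.
/// This is an illustration only, not Miri's implementation.
struct BumpAddresses {
    /// Next free integer address; it only ever grows.
    next: u64,
}

impl BumpAddresses {
    fn new(base: u64) -> Self {
        BumpAddresses { next: base }
    }

    /// Returns the base address for an allocation of `size` bytes with a
    /// power-of-two alignment `align`.
    fn assign(&mut self, size: u64, align: u64) -> u64 {
        assert!(align.is_power_of_two());
        // Round the cursor up to the requested alignment.
        let addr = (self.next + align - 1) & !(align - 1);
        // Reserve at least one byte so distinct allocations get distinct addresses.
        self.next = addr + size.max(1);
        addr
    }
}
```

Even if `base` were chosen from the seed, two consecutive calls to `assign` would still return strictly increasing addresses, so the relative order is the same in every execution.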

RalfJung avatar Jul 30 '19 10:07 RalfJung

Do you want to just enable reuse, so a long running program will slowly start (deterministically) shuffling its allocations?

Or do you really want randomization? I mean your code snippet will pass when running natively ;)

oli-obk avatar Jul 30 '19 10:07 oli-obk

Reuse would be another dimension I did not even consider yet. ;)

> I mean your code snippet will pass when running natively ;)

It might pass. Or you might use an allocator I could quickly hack together that makes this not pass. ;)

One way to make this particular test behave properly (as in, pass with some seeds and fail with others) is to start somewhere in the middle of the address space, and then for each new allocation randomly decide if we allocate at the "upper" or the "lower" end. But that will still not explore all possible executions, even if we assume addresses never get reused.
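A rough sketch of that two-ended idea, under stated assumptions: `TwoEnded` and `SplitMix64` are hypothetical helper names for this example (the small PRNG stands in for whatever seeded randomness Miri would use), and overflow and underflow of the address space are not handled.

```rust
/// Tiny seeded PRNG (SplitMix64), used here only to keep the example
/// free of external crates.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Hypothetical two-ended assignment: the used region grows outward from
/// a starting point in the middle of the address space.
struct TwoEnded {
    low: u64,  // lowest address handed out so far
    high: u64, // one past the highest address handed out so far
    rng: SplitMix64,
}

impl TwoEnded {
    fn new(middle: u64, seed: u64) -> Self {
        TwoEnded { low: middle, high: middle, rng: SplitMix64(seed) }
    }

    fn assign(&mut self, size: u64, align: u64) -> u64 {
        assert!(align.is_power_of_two());
        let size = size.max(1);
        if self.rng.next_u64() >> 63 == 0 {
            // Place the allocation at the upper end, rounding up to `align`.
            let addr = (self.high + align - 1) & !(align - 1);
            self.high = addr + size;
            addr
        } else {
            // Place the allocation at the lower end, rounding down to `align`.
            let addr = (self.low - size) & !(align - 1);
            self.low = addr;
            addr
        }
    }
}
```

With this scheme the order of the first two addresses depends on the second coin flip, so some seeds pass the assertion and others fail it. An allocation can still never land between two existing ones, which is why not all executions are explored.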

RalfJung avatar Jul 30 '19 19:07 RalfJung

I guess we could use https://github.com/rust-osdev/linked-list-allocator on the [u8] that is our virtual memory so we'd at least get reuse without having to do much work. We can then figure out randomization later.

oli-obk avatar Apr 23 '23 22:04 oli-obk

We could get reuse pretty easily by adjusting our integer address assignment logic, I think. Though this will incur fragmentation so it could be too expensive...

RalfJung avatar Apr 24 '23 08:04 RalfJung

What would fragment? The actual memory in the allocations is stored wherever the allocator linked into Miri decides to put it, right?

And in any case, I suspect a minority of our cycles are spent actually accessing the bytes in allocations, so even if we fragmented them I think it wouldn't be that severe an issue.

saethlin avatar Apr 24 '23 23:04 saethlin

I was thinking of Miri's virtual memory. To find an address for a new allocation, what do we do? Iterate over the list of existing allocations to see if there is a gap somewhere to fit this in?

Of course we could also go all-in: pick a random address and see if that works or overlaps. Give up after 1000 attempts or so.
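A sketch of what "pick a random address and retry" could look like, assuming live allocations are tracked in an ordered map from base address to size. `RandomAddresses`, `SplitMix64`, and the choice of `BTreeMap` are assumptions made for this example, not something taken from Miri.

```rust
use std::collections::BTreeMap;

/// Tiny seeded PRNG (SplitMix64) so the example needs no external crates.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Hypothetical random address assignment with an overlap check.
struct RandomAddresses {
    live: BTreeMap<u64, u64>, // base address -> size of each live allocation
    limit: u64,               // exclusive upper bound of the address space
    rng: SplitMix64,
}

impl RandomAddresses {
    fn new(limit: u64, seed: u64) -> Self {
        RandomAddresses { live: BTreeMap::new(), limit, rng: SplitMix64(seed) }
    }

    /// Does `[addr, addr + size)` intersect a live allocation? Live ranges
    /// never overlap each other, so only the one with the greatest base
    /// below `addr + size` needs to be checked.
    fn overlaps(&self, addr: u64, size: u64) -> bool {
        match self.live.range(..addr + size).next_back() {
            Some((&base, &len)) => base + len > addr,
            None => false,
        }
    }

    /// Tries up to 1000 random candidates, then gives up.
    fn assign(&mut self, size: u64, align: u64) -> Option<u64> {
        assert!(align.is_power_of_two());
        let size = size.max(1);
        if size > self.limit {
            return None;
        }
        for _ in 0..1000 {
            // Random candidate that fits below `limit`, rounded down to `align`.
            let addr = (self.rng.next_u64() % (self.limit - size + 1)) & !(align - 1);
            // Address 0 is skipped so no allocation looks like a null pointer.
            if addr != 0 && !self.overlaps(addr, size) {
                self.live.insert(addr, size);
                return Some(addr);
            }
        }
        None
    }

    fn free(&mut self, addr: u64) {
        self.live.remove(&addr);
    }
}
```

The ordered map keeps each overlap check logarithmic in the number of live allocations, so no linear scan for gaps is needed. The retry limit only matters once the address space is densely occupied.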

RalfJung avatar Apr 26 '23 10:04 RalfJung

Yeah, I think randomization is the way to go. We don't need to pack allocations densely like a real allocator. In addition to that, we can keep a fixed-size queue of recent deallocations and consult that first to increase our chance of reusing a recently-deallocated address.
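The fixed-size queue of recent deallocations might look roughly like this. `RecentFrees`, `record`, and `take_fit` are hypothetical names for this sketch, and the policy of preferring the most recently freed range that fits is an assumption rather than a settled design.

```rust
use std::collections::VecDeque;

/// Hypothetical bounded queue of recently freed address ranges.
struct RecentFrees {
    slots: VecDeque<(u64, u64)>, // (base address, size), oldest first
    capacity: usize,
}

impl RecentFrees {
    fn new(capacity: usize) -> Self {
        RecentFrees { slots: VecDeque::with_capacity(capacity), capacity }
    }

    /// Remembers a freed range, evicting the oldest entry when full.
    fn record(&mut self, base: u64, size: u64) {
        if self.capacity == 0 {
            return;
        }
        if self.slots.len() == self.capacity {
            self.slots.pop_front();
        }
        self.slots.push_back((base, size));
    }

    /// Returns the most recently freed base address that is large enough
    /// and suitably aligned, removing it from the queue.
    fn take_fit(&mut self, size: u64, align: u64) -> Option<u64> {
        assert!(align.is_power_of_two());
        let idx = self
            .slots
            .iter()
            .rposition(|&(base, len)| len >= size && base % align == 0)?;
        self.slots.remove(idx).map(|(base, _)| base)
    }
}
```

An allocator would call `take_fit` first and fall back to picking a fresh address only when it returns `None`.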

saethlin avatar Apr 26 '23 14:04 saethlin

I think there's a much easier way to do this: Carve up the whole address space ahead of time into N different bump allocators, and serve allocations by choosing randomly between them. This would produce address space exhaustion much faster on 32-bit targets, but it should be relatively easy to implement by reusing our existing code.
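As a sketch of the "N bump allocators" idea: the address space is split into equal regions, each with its own bump cursor, and every allocation picks a region at random. `Arenas` and `SplitMix64` are hypothetical names for this example, and failing as soon as the chosen region is full is a simplification.

```rust
/// Tiny seeded PRNG (SplitMix64) so the example needs no external crates.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Hypothetical set of `n` bump regions carved out of `[base, limit)`.
struct Arenas {
    cursors: Vec<(u64, u64)>, // (next free address, region end) per region
    rng: SplitMix64,
}

impl Arenas {
    fn new(base: u64, limit: u64, n: u64, seed: u64) -> Self {
        assert!(n > 0 && limit > base);
        let span = (limit - base) / n;
        let cursors = (0..n)
            .map(|i| {
                let start = base + i * span;
                (start, start + span)
            })
            .collect();
        Arenas { cursors, rng: SplitMix64(seed) }
    }

    fn assign(&mut self, size: u64, align: u64) -> Option<u64> {
        assert!(align.is_power_of_two());
        // Pick one region at random, then bump within it.
        let idx = (self.rng.next_u64() % self.cursors.len() as u64) as usize;
        let (next, end) = self.cursors[idx];
        let addr = (next + align - 1) & !(align - 1);
        let new_next = addr.checked_add(size.max(1))?;
        if new_next > end {
            // The chosen region is exhausted, even if other regions have room.
            return None;
        }
        self.cursors[idx].0 = new_next;
        Some(addr)
    }
}
```

Two allocations that land in the same region are still always in increasing order, and each region only holds a fraction of the space, which matches the faster exhaustion expected on 32-bit targets.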

saethlin avatar Aug 24 '23 18:08 saethlin

That would make things more random, so it would help, yes. Fundamentally there would still be correlations that are not guaranteed, though.

RalfJung avatar Aug 24 '23 18:08 RalfJung