miri icon indicating copy to clipboard operation
miri copied to clipboard

Bad diagnostic when pointer used for deallocation has insufficient Stacked Borrows provenance

Open rrichardson opened this issue 2 years ago • 5 comments
trafficstars

When running cargo +nightly miri test on my repo (linked below) I get an error for a deallocation where the tag doesn't exist.

Same result for MIRIFLAGS=-Zmiri-tag-gc=0 cargo +nightly miri test

The code is definitely unsafe, but fairly straightforward with regards to allocations and deallocations. I have verified that there is exactly 1 deallocation for the allocation on the same pointer. That said, I'm not savvy with the techniques for appeasing Miri.

Here is the error output: https://gist.github.com/rrichardson/49f0c94e0cc04491ca7481e0106eb97b

My repo: https://github.com/rrichardson/bytes/tree/refactor-api

Specific offending code: https://github.com/rrichardson/bytes/blob/refactor-api/tests/extern_buf_bytes.rs#L52

Repro Steps:

  1. Clone the repo,
  2. Switch to the refactor-api branch
  3. Run cargo +nightly miri test

rrichardson avatar Feb 08 '23 00:02 rrichardson

Just a pet peeve: It bugs me when people get output from Miri that's confusing and call it spurious. Miri is a dynamic checker with per-byte precision, it doesn't have spurious outputs in the sense that some other tools do. There are bugs in Miri and bugs in in user code, and that's about it. In my opinion we have both here (as I guessed earlier).

The (very faintly) smoking gun is here:

help: <209383> was created by a SharedReadWrite retag at offsets [0x0..0x1]
   --> tests/extern_buf_bytes.rs:52:21
    |
52  |             dealloc(self.ptr.as_mut(), layout);
    |                     ^^^^^^^^^^^^^^^^^

This retag covers a single byte, and it's from the .as_mut() call. But the mental model here is that we're deallocating the original pointer.

This code as-written calls .as_mut() at the last minute then does a reference-to-pointer coercion to loop back to the correct type, but since we've round-tripped through a reference, under Stacked Borrows we have lost provenance for all but the referent. So this pointer is indeed valid to deallocate with, but you can only deallocate one byte.

The fix is this:

diff --git a/tests/extern_buf_bytes.rs b/tests/extern_buf_bytes.rs
index 989c380..e8b24c8 100644
--- a/tests/extern_buf_bytes.rs
+++ b/tests/extern_buf_bytes.rs
@@ -49,7 +49,7 @@ impl Drop for ExternBuf {
         unsafe {
             let num = self.ptr.as_ptr() as usize;
             println!("dealloc'ing {}", num);
-            dealloc(self.ptr.as_mut(), layout);
+            dealloc(self.ptr.as_ptr(), layout);
         }
     }
 }

Ok so that's the bug in the user code. I could complain more about how we shouldn't have set up the language to encourage people to write code like this, but ultimately this pattern will probably be well-defined (but no less smelly) by the upcoming aliasing model.

I think the bug in the diagnostic is that we say this:

attempting deallocation using <209383> at alloc80643, but that tag does not exist in the borrow stack for this location

We should be able to detect this case and surface some information about the size of the allocation or the byte we got to before concluding that there is UB, like we do for all other diagnostics.

saethlin avatar Feb 08 '23 21:02 saethlin

Just a pet peeve: It bugs me when people get output from Miri that's confusing and call it spurious.

I did not mean to offend, and I certainly don't claim to have perfect code. I was just not sure how else to describe the scenario in which Miri flags otherwise functional code.

It turns out that the code was incorrect. (or not as correct as it should be) I didn't look at the docs for NonNull, and assumed that the as_mut() method returned a *mut, as opposed to a *const. Since rust-analyzer didn't complain, I assumed that it was correct.

This retag covers a single byte, and it's from the .as_mut() call. But the mental model here is that we're deallocating the original pointer. ... So this pointer is indeed valid to deallocate with, but you can only deallocate one byte.

I don't understand what the "a single byte" refers to in this case. Can you enlighten me?

rrichardson avatar Feb 08 '23 22:02 rrichardson

I was just not sure how else to describe the scenario in which Miri flags otherwise functional code.

Miri complains about a lot of code which current compilers will compile as the programmer intends. That's the great thing about tools like Miri as opposed to tools like valgrind: Miri tries to find UB that the compiler is permitted to rely on, even if doesn't yet.

Since rust-analyzer didn't complain, I assumed that it was correct.

&mut T is implicitly converted to *mut T. The fact that this conversion exists irks me, because yes exactly what you did should be the way people can write Rust code.

I don't understand what the "a single byte" refers to in this case. Can you enlighten me?

Sure! Miri currently implements a prototype aliasing model called Stacked Borrows. In this model, all references and pointers have a tag associated with them. A new tag is created when a new reference is created, or when a new pointer is created from a reference. Perhaps the biggest problem with the model is that when a reference is created, its tag only grants access to the referent. Converting a reference back to a pointer doesn't remember the original pointer. So if you have

let x: &[u8] = &[0, 1, 2, 3];
let ptr: *const u8 = &x[0] as *const u8;

Then ptr can only be used to access the first element of the array. Even though its address is the same as the pointer returned by x.as_ptr(), in the Stacked Borrows model, those pointers have different tags.

So in your code, you have a NonNull<u8>, which wraps a *mut u8. This pointer is valid for the whole allocation. But NonNull::as_mut returns a &mut u8, which is a reference to a single byte. The address is the same, but the retag which creates this reference takes you from a pointer which can be used to access this whole allocation to a reference which can only be used to access one byte. Then the &mut u8 is implicitly converted back to a *mut u8, but in Stacked Borrows the range you can access is not restored. So that code creates a *mut u8 with the same address, but which can only access one byte.

This, as I'm sure you can imagine, is pretty problematic. So there will (soon tm) be an option in Miri to use a different aliasing model which is designed to permit this pattern.


Mostly for Ralf: I'm keeping this open because I want to improve that diagnostic.

saethlin avatar Feb 09 '23 01:02 saethlin

This pointer is valid for the whole allocation. But NonNull::as_mut returns a &mut u8, which is a reference to a single byte.

Ah. Right.. that makes complete sense. "transmuting" from a &u8 to what is effectively [u8;N] might be OK with the compiler, but not Miri.

Frankly, I'm not sure if Miri should be made to allow that :) But should at least, as you point out, be able to communicate the problem better.

rrichardson avatar Feb 09 '23 01:02 rrichardson

Ah. Right.. that makes complete sense. "transmuting" from a &u8 to what is effectively [u8;N] might be OK with the compiler, but not Miri.

It might happen to be "Ok" with current versions of the compiler -- but future compiler versions might get better at optimizing code, and then the exact same code might start going wrong. The purpose of Miri is to protect you from such future surprises. :)

RalfJung avatar Feb 09 '23 10:02 RalfJung