memmap-rs icon indicating copy to clipboard operation
memmap-rs copied to clipboard

what happens if we exceed isize::MAX on 32-bit platforms?

Open oconnor663 opened this issue 7 years ago • 10 comments

The ptr::offset function, which underlies safe slice indexing, says in its docs:

If any of the following conditions are violated, the result is Undefined Behavior...The computed offset, in bytes, cannot overflow an isize...memory acquired directly from allocators or memory mapped files may be too large to handle with this function.

Since the map function is unsafe, it's arguably fine for it to expose possible UB in this way. But I think most people reading the docs won't have any idea that this is a requirement. ~Maybe it would be better for Deref to panic rather than to return a slice that's "unsoundly large"?~ (Edit: Probably just return an error if we try to mmap something larger than isize::MAX?)

oconnor663 avatar Aug 24 '18 16:08 oconnor663

@oconnor663 can you be more specific with your concern? Do you have a specific potentially buggy line of code in mind, or perhaps a reproducible example?

danburkert avatar Sep 20 '18 00:09 danburkert

I'm worried about what happens in the case where someone on a 32-bit system memory maps a 3 GB file (that is, longer than std::isize::MAX but shorter than std::usize::MAX), and then tries to read one of the bytes near the end. Here's what I believe will happen:

  • memmap checks that the file isn't too long for usize, and it'll assemble a &[u8] slice that's 3 GB long.
  • Next, the caller indexes into that slice to read a byte, somewhere near the end. Maybe just to print it or something.
  • That indexing operation translates into usize's SliceIndex::get_unchecked() implementation, which in turn calls std::ptr::add, with a usize argument that's larger than std::isize::MAX.

According to the ptr::add / ptr::offset docs I linked above, that's all it takes to trigger undefined behavior. And those docs specifically note that memory mapping is one way to get your hands on a slice large enough to hit this.

That said, I have no idea what actually happens in this case. It could be that, even though the usize gets cast into an isize and becomes negative, the result of all the pointer arithmetic still ends up being what the caller thought they were asking for? Maybe everything appears to work? Even if that's the case, I don't think we want to rely on it remaining the case.

My instinct is that MmapOptions::get_len, which is currently making sure the length doesn't overflow usize, should instead be making sure the length doesn't overflow isize. But maybe someone who knows more about why these limitations were originally put in place could give better advice.

oconnor663 avatar Sep 20 '18 05:09 oconnor663

@oconnor663 thanks for the detail. I put together #74 to try and reproduce this scenario; we'll see whether it crashes and burns on the 32bit CI targets. If you don't mind, please take a look and check if that test accurately reflects the scenario you're concerned about.

danburkert avatar Sep 20 '18 05:09 danburkert

Early indications are that test fails with ENOMEM on 32bit platforms, which seems like it might be the correct behavior.

danburkert avatar Sep 20 '18 05:09 danburkert

Hmm, maybe we should test whether a 2^31 - 2 byte mmap (just under isize::MAX) succeeds? If not, it could be that we're blowing past some per-process limit on the system.

oconnor663 avatar Sep 20 '18 05:09 oconnor663

Some more discussion here: https://www.reddit.com/r/rust/comments/9ghwuv/hey_rustaceans_got_an_easy_question_ask_here/e6fx4h2/ Relevant article: https://trust-in-soft.com/objects-larger-than-ptrdiff_max-bytes/

I get the feeling that these are problems we don't want to touch with a 10 foot pole :)

oconnor663 avatar Sep 22 '18 22:09 oconnor663

Update: As of https://github.com/rust-lang/rust/pull/53784, slice::from_raw_parts will panic in debug mode, if the length in bytes exceeds isize::MAX.

oconnor663 avatar Oct 01 '18 17:10 oconnor663

Note that stdlib debug asserts don’t ever run in official compiler builds. Only for like the rust-lang test suite.

Gankra avatar Oct 02 '18 03:10 Gankra

@oconnor663 The only places that use ptr::offset in the code use it either with an alignment, which never exceeds the length of a page (4069), or the offset provided to MmapOptions::offset(usize). I agree that if this offset usize overflows isize::MAX it will cause issues, but I'm not seeing any reason why the length exceeding isize::MAX would be problematic, since slice::from_raw_parts is used for creating the slice.

danburkert avatar Oct 02 '18 05:10 danburkert

Woops, sorry I completely missed the link between ptr::offset and from_raw_parts (https://github.com/rust-lang/rust/pull/53784 makes it clear).

danburkert avatar Oct 02 '18 05:10 danburkert