
Optimize out `sub`+`GEPi` when it's used in `ptr2int`

Open: scottmcm opened this issue 1 year ago • 1 comment

It's not legal to optimize the pointer computation p + (q - p) down to q in general (https://alive2.llvm.org/ce/z/ro77id), because the result carries the provenance of p rather than q.

However, when the result is only used in a ptrtoint, so that the provenance isn't relevant, it would be helpful to simplify this

define { ptr, i64 } @slice_iter_roundtrip_via_slice(ptr noundef nonnull %0, i64 noundef %1) unnamed_addr #0 {
start:
  %ptr_addr.i = ptrtoint ptr %0 to i64
  %byte_diff.i = sub i64 %1, %ptr_addr.i
  %_2 = insertvalue { ptr, i64 } poison, ptr %0, 0
  %_11 = getelementptr inbounds i8, ptr %0, i64 %byte_diff.i
  %end_addr_or_len = ptrtoint ptr %_11 to i64
  %_3 = insertvalue { ptr, i64 } %_2, i64 %end_addr_or_len, 1
  ret { ptr, i64 } %_3
}

down to just

define { ptr, i64 } @slice_iter_roundtrip_via_slice(ptr noundef nonnull %0, i64 noundef %1) unnamed_addr #0 {
start:
  %_2 = insertvalue { ptr, i64 } poison, ptr %0, 0
  %_3 = insertvalue { ptr, i64 } %_2, i64 %1, 1
  ret { ptr, i64 } %_3
}

Alive proof: https://alive2.llvm.org/ce/z/R327wi
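As a rough illustration of why the integer result is unchanged, here is a minimal Rust sketch of the same arithmetic. The function name `roundtrip_addr` is hypothetical, and the sketch uses `wrapping_add` (which has no in-bounds requirement) instead of the `getelementptr inbounds` in the IR above, so it only models the address computation, not the exact IR semantics.

```rust
/// Hypothetical helper mirroring the first IR function:
/// ptrtoint(gep(p, end_addr - ptrtoint(p))).
fn roundtrip_addr(p: *const u8, end_addr: usize) -> usize {
    // %ptr_addr.i = ptrtoint ptr %0 to i64
    let ptr_addr = p as usize;
    // %byte_diff.i = sub i64 %1, %ptr_addr.i  (wrapping, like an LLVM `sub` without nuw/nsw)
    let byte_diff = end_addr.wrapping_sub(ptr_addr);
    // %_11 = getelementptr i8, ptr %0, i64 %byte_diff.i  (non-inbounds stand-in)
    let end_ptr = p.wrapping_add(byte_diff);
    // %end_addr_or_len = ptrtoint ptr %_11 to i64
    end_ptr as usize
}
```

At the address level the subtraction and the addition cancel modulo the pointer width, so `roundtrip_addr(p, e)` returns `e` for any `p`. Only the provenance of the intermediate pointer differs, and that is discarded by the final cast to an integer.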


Context: I'd like this to improve the handling of slice iterators in Rust.

Today, this isn't a NOP in LLVM-IR (though it is in x86):

pub fn slice_iter_roundtrip_via_slice(it: std::slice::Iter<'_, i32>) -> std::slice::Iter<'_, i32> {
    it.as_slice().into_iter()
}

https://rust.godbolt.org/z/vY3eMeqr3

And while with the current implementation LLVM isn't allowed to do that, if this were fixed I could replace the slice::Iter implementation with one that is smarter about provenance. That would turn it into the { ptr, i64 } version seen in this issue, where the optimization would be legal.

(The NOP roundtrip by itself isn't that important, obviously, but it's the gateway to operating on iterators via the slice, rather than needing to repeat a bunch of helpers onto the iterator type too.)

scottmcm, Mar 24 '24 00:03

Alive2: https://alive2.llvm.org/ce/z/mWS6uC

TBH it is dangerous to do optimizations related to ptrtoint/inttoptr even if alive2 agrees with them :(

dtcxzyw, Mar 24 '24 08:03

Makes me wish for a "strict" version of ptr2int where it's definitely not dangerous because it doesn't have exposing semantics for the underlying address. That's certainly what I want in the underlying Rust code that I'm trying to do here -- Rust has said that it'd be UB to int2ptr back from these addresses, so it's definitely safe to optimize in the specific motivating scenario.

If (int)(p + 1) can't be optimized to (int)p + sizeof(*p), though, I don't know what to do other than want a new LLVM instruction :/

scottmcm, Mar 25 '24 02:03

This should probably be a general ptrtoint(ptradd(p, o)) to add(ptrtoint(p), o) fold.
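To make the shape of that fold concrete, here is a small Rust sketch of the two sides at the address level. The function names are hypothetical, and `wrapping_add` stands in for a plain (non-inbounds) byte-offset `getelementptr`; this is a model of the arithmetic only, not of how LLVM would implement the fold.

```rust
/// Left-hand side of the fold: ptrtoint(ptradd(p, o)).
fn addr_after_offset(p: *const u8, o: usize) -> usize {
    p.wrapping_add(o) as usize
}

/// Right-hand side of the fold: add(ptrtoint(p), o).
fn offset_after_addr(p: *const u8, o: usize) -> usize {
    (p as usize).wrapping_add(o)
}
```

Both functions yield the same integer for every `p` and `o`, including offsets that wrap around the address space. The second form never materializes an offset pointer, which is what would let the `sub` and `add` in the original example cancel.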

nikic, Mar 26 '24 01:03