[llvm] Fix __builtin_object_size interaction between Negative Offset …
…and Select/Phi
When picking a SizeOffsetAPInt through combineSizeOffset, the behavior differs if we're going to apply a constant offset that's positive or negative: If it's positive, then we need to compare the remaining bytes (i.e. Size
- Offset), but if it's negative, we need to compare the preceding bytes (i.e. Offset).
Fix #111709
@llvm/pr-subscribers-llvm-analysis
@llvm/pr-subscribers-llvm-transforms
Author: None (serge-sans-paille)
Changes
…and Select/Phi
When picking a SizeOffsetAPInt through combineSizeOffset, the behavior differs if we're going to apply a constant offset that's positive or negative: If it's positive, then we need to compare the remaining bytes (i.e. Size
- Offset), but if it's negative, we need to compare the preceding bytes (i.e. Offset).
Fix #111709
Full diff: https://github.com/llvm/llvm-project/pull/111827.diff
3 Files Affected:
- (modified) llvm/include/llvm/Analysis/MemoryBuiltins.h (+2)
- (modified) llvm/lib/Analysis/MemoryBuiltins.cpp (+45-13)
- (modified) llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-phi.ll (+107)
diff --git a/llvm/include/llvm/Analysis/MemoryBuiltins.h b/llvm/include/llvm/Analysis/MemoryBuiltins.h
index 7b48844cc9e8e9..01c642d4f48abd 100644
--- a/llvm/include/llvm/Analysis/MemoryBuiltins.h
+++ b/llvm/include/llvm/Analysis/MemoryBuiltins.h
@@ -160,6 +160,7 @@ struct ObjectSizeOpts {
/// though they can't be evaluated. Otherwise, null is always considered to
/// point to a 0 byte region of memory.
bool NullIsUnknownSize = false;
+
/// If set, used for more accurate evaluation
AAResults *AA = nullptr;
};
@@ -230,6 +231,7 @@ class ObjectSizeOffsetVisitor
ObjectSizeOpts Options;
unsigned IntTyBits;
APInt Zero;
+ APInt ConstantOffset;
SmallDenseMap<Instruction *, SizeOffsetAPInt, 8> SeenInsts;
unsigned InstructionsVisited;
diff --git a/llvm/lib/Analysis/MemoryBuiltins.cpp b/llvm/lib/Analysis/MemoryBuiltins.cpp
index e1abf5e4d885ec..eb6e139a6b9d6a 100644
--- a/llvm/lib/Analysis/MemoryBuiltins.cpp
+++ b/llvm/lib/Analysis/MemoryBuiltins.cpp
@@ -570,6 +570,14 @@ static APInt getSizeWithOverflow(const SizeOffsetAPInt &Data) {
return Size - Offset;
}
+static APInt getOffsetWithOverflow(const SizeOffsetAPInt &Data) {
+ APInt Size = Data.Size;
+ APInt Offset = Data.Offset;
+ if (Offset.isNegative())
+ return APInt(Size.getBitWidth(), 0);
+ return Offset;
+}
+
/// Compute the size of the object pointed by Ptr. Returns true and the
/// object size in Size if successful, and false otherwise.
/// If RoundToAlign is true, then Size is rounded up to the alignment of
@@ -697,7 +705,8 @@ SizeOffsetAPInt ObjectSizeOffsetVisitor::computeImpl(Value *V) {
// the index type size and if we stripped address space casts we have to
// readjust the APInt as we pass it upwards in order for the APInt to match
// the type the caller passed in.
- APInt Offset(InitialIntTyBits, 0);
+
+ APInt Offset = APInt{InitialIntTyBits, 0};
V = V->stripAndAccumulateConstantOffsets(
DL, Offset, /* AllowNonInbounds */ true, /* AllowInvariantGroup */ true);
@@ -706,7 +715,9 @@ SizeOffsetAPInt ObjectSizeOffsetVisitor::computeImpl(Value *V) {
IntTyBits = DL.getIndexTypeSizeInBits(V->getType());
Zero = APInt::getZero(IntTyBits);
+ std::swap(Offset, ConstantOffset);
SizeOffsetAPInt SOT = computeValue(V);
+ std::swap(Offset, ConstantOffset);
bool IndexTypeSizeChanged = InitialIntTyBits != IntTyBits;
if (!IndexTypeSizeChanged && Offset.isZero())
@@ -981,18 +992,39 @@ ObjectSizeOffsetVisitor::combineSizeOffset(SizeOffsetAPInt LHS,
SizeOffsetAPInt RHS) {
if (!LHS.bothKnown() || !RHS.bothKnown())
return ObjectSizeOffsetVisitor::unknown();
-
- switch (Options.EvalMode) {
- case ObjectSizeOpts::Mode::Min:
- return (getSizeWithOverflow(LHS).slt(getSizeWithOverflow(RHS))) ? LHS : RHS;
- case ObjectSizeOpts::Mode::Max:
- return (getSizeWithOverflow(LHS).sgt(getSizeWithOverflow(RHS))) ? LHS : RHS;
- case ObjectSizeOpts::Mode::ExactSizeFromOffset:
- return (getSizeWithOverflow(LHS).eq(getSizeWithOverflow(RHS)))
- ? LHS
- : ObjectSizeOffsetVisitor::unknown();
- case ObjectSizeOpts::Mode::ExactUnderlyingSizeAndOffset:
- return LHS == RHS ? LHS : ObjectSizeOffsetVisitor::unknown();
+ // If the ConstantOffset we add in the end is negative, then we're actually
+ // interested in selecting the nodes based on their offset rather than their
+ // size.
+ if (ConstantOffset.isNegative()) {
+ switch (Options.EvalMode) {
+ case ObjectSizeOpts::Mode::Min:
+ return (getOffsetWithOverflow(LHS).slt(getOffsetWithOverflow(RHS))) ? LHS
+ : RHS;
+ case ObjectSizeOpts::Mode::Max:
+ return (getOffsetWithOverflow(LHS).sgt(getOffsetWithOverflow(RHS))) ? LHS
+ : RHS;
+ case ObjectSizeOpts::Mode::ExactSizeFromOffset:
+ return (getOffsetWithOverflow(LHS).eq(getOffsetWithOverflow(RHS)))
+ ? LHS
+ : ObjectSizeOffsetVisitor::unknown();
+ case ObjectSizeOpts::Mode::ExactUnderlyingSizeAndOffset:
+ return LHS == RHS ? LHS : ObjectSizeOffsetVisitor::unknown();
+ }
+ } else {
+ switch (Options.EvalMode) {
+ case ObjectSizeOpts::Mode::Min:
+ return (getSizeWithOverflow(LHS).slt(getSizeWithOverflow(RHS))) ? LHS
+ : RHS;
+ case ObjectSizeOpts::Mode::Max:
+ return (getSizeWithOverflow(LHS).sgt(getSizeWithOverflow(RHS))) ? LHS
+ : RHS;
+ case ObjectSizeOpts::Mode::ExactSizeFromOffset:
+ return (getSizeWithOverflow(LHS).eq(getSizeWithOverflow(RHS)))
+ ? LHS
+ : ObjectSizeOffsetVisitor::unknown();
+ case ObjectSizeOpts::Mode::ExactUnderlyingSizeAndOffset:
+ return LHS == RHS ? LHS : ObjectSizeOffsetVisitor::unknown();
+ }
}
llvm_unreachable("missing an eval mode");
}
diff --git a/llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-phi.ll b/llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-phi.ll
index 4f4d6a88e1693b..27cbc391d52c3a 100644
--- a/llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-phi.ll
+++ b/llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-phi.ll
@@ -117,3 +117,110 @@ if.end:
%size = call i64 @llvm.objectsize.i64.p0(ptr %p, i1 true, i1 true, i1 false)
ret i64 %size
}
+
+define i64 @pick_negative_offset(i32 %n) {
+; CHECK-LABEL: @pick_negative_offset(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[BUFFER0:%.*]] = alloca i8, i64 20, align 1
+; CHECK-NEXT: [[OFFSETED0:%.*]] = getelementptr i8, ptr [[BUFFER0]], i64 20
+; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[N:%.*]], 0
+; CHECK-NEXT: br i1 [[COND]], label [[IF_ELSE:%.*]], label [[IF_END:%.*]]
+; CHECK: if.else:
+; CHECK-NEXT: [[BUFFER1:%.*]] = alloca i8, i64 20, align 1
+; CHECK-NEXT: [[OFFSETED1:%.*]] = getelementptr i8, ptr [[BUFFER1]], i64 20
+; CHECK-NEXT: br label [[IF_END]]
+; CHECK: if.end:
+; CHECK-NEXT: [[P:%.*]] = phi ptr [ [[OFFSETED1]], [[IF_ELSE]] ], [ [[OFFSETED0]], [[ENTRY:%.*]] ]
+; CHECK-NEXT: [[POFFSETED:%.*]] = getelementptr i8, ptr [[P]], i64 -4
+; CHECK-NEXT: ret i64 4
+;
+entry:
+ %buffer0 = alloca i8, i64 20
+ %offseted0 = getelementptr i8, ptr %buffer0, i64 20
+ %cond = icmp eq i32 %n, 0
+ br i1 %cond, label %if.else, label %if.end
+
+if.else:
+ %buffer1 = alloca i8, i64 20
+ %offseted1 = getelementptr i8, ptr %buffer1, i64 20
+ br label %if.end
+
+if.end:
+ %p = phi ptr [ %offseted1, %if.else ], [ %offseted0, %entry ]
+ %poffseted = getelementptr i8, ptr %p, i64 -4
+ %size = call i64 @llvm.objectsize.i64.p0(ptr %poffseted, i1 false, i1 false, i1 false)
+ ret i64 %size
+}
+
+define i64 @pick_negative_offset_with_nullptr(i32 %n) {
+; CHECK-LABEL: @pick_negative_offset_with_nullptr(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[BUFFER0:%.*]] = alloca i8, i64 20, align 1
+; CHECK-NEXT: [[OFFSETED0:%.*]] = getelementptr i8, ptr [[BUFFER0]], i64 20
+; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[N:%.*]], 0
+; CHECK-NEXT: br i1 [[COND]], label [[IF_ELSE:%.*]], label [[IF_END:%.*]]
+; CHECK: if.else:
+; CHECK-NEXT: br label [[IF_END]]
+; CHECK: if.end:
+; CHECK-NEXT: [[P0:%.*]] = phi ptr [ [[OFFSETED0]], [[ENTRY:%.*]] ], [ null, [[IF_ELSE]] ]
+; CHECK-NEXT: [[P1:%.*]] = phi ptr [ null, [[IF_ELSE]] ], [ [[OFFSETED0]], [[ENTRY]] ]
+; CHECK-NEXT: [[P0OFFSETED:%.*]] = getelementptr i8, ptr [[P0]], i64 -4
+; CHECK-NEXT: [[P1OFFSETED:%.*]] = getelementptr i8, ptr [[P1]], i64 -4
+; CHECK-NEXT: [[SIZE:%.*]] = select i1 [[COND]], i64 0, i64 4
+; CHECK-NEXT: ret i64 [[SIZE]]
+;
+entry:
+ %buffer0 = alloca i8, i64 20
+ %offseted0 = getelementptr i8, ptr %buffer0, i64 20
+ %cond = icmp eq i32 %n, 0
+ br i1 %cond, label %if.else, label %if.end
+
+if.else:
+ br label %if.end
+
+if.end:
+ %p0 = phi ptr [ %offseted0, %entry ], [ null, %if.else ]
+ %p1 = phi ptr [ null, %if.else ], [ %offseted0, %entry ]
+ %p0offseted = getelementptr i8, ptr %p0, i64 -4
+ %p1offseted = getelementptr i8, ptr %p1, i64 -4
+ %size0 = call i64 @llvm.objectsize.i64.p0(ptr %p0offseted, i1 false, i1 false, i1 false)
+ %size1 = call i64 @llvm.objectsize.i64.p0(ptr %p1offseted, i1 false, i1 false, i1 false)
+ %size = select i1 %cond, i64 %size0, i64 %size1
+ ret i64 %size
+}
+
+define i64 @pick_negative_offset_with_unsized_nullptr(i32 %n) {
+; CHECK-LABEL: @pick_negative_offset_with_unsized_nullptr(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[BUFFER0:%.*]] = alloca i8, i64 20, align 1
+; CHECK-NEXT: [[OFFSETED0:%.*]] = getelementptr i8, ptr [[BUFFER0]], i64 20
+; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[N:%.*]], 0
+; CHECK-NEXT: br i1 [[COND]], label [[IF_ELSE:%.*]], label [[IF_END:%.*]]
+; CHECK: if.else:
+; CHECK-NEXT: br label [[IF_END]]
+; CHECK: if.end:
+; CHECK-NEXT: [[P0:%.*]] = phi ptr [ [[OFFSETED0]], [[ENTRY:%.*]] ], [ null, [[IF_ELSE]] ]
+; CHECK-NEXT: [[P1:%.*]] = phi ptr [ null, [[IF_ELSE]] ], [ [[OFFSETED0]], [[ENTRY]] ]
+; CHECK-NEXT: [[P0OFFSETED:%.*]] = getelementptr i8, ptr [[P0]], i64 -4
+; CHECK-NEXT: [[P1OFFSETED:%.*]] = getelementptr i8, ptr [[P1]], i64 -4
+; CHECK-NEXT: ret i64 -1
+;
+entry:
+ %buffer0 = alloca i8, i64 20
+ %offseted0 = getelementptr i8, ptr %buffer0, i64 20
+ %cond = icmp eq i32 %n, 0
+ br i1 %cond, label %if.else, label %if.end
+
+if.else:
+ br label %if.end
+
+if.end:
+ %p0 = phi ptr [ %offseted0, %entry ], [ null, %if.else ]
+ %p1 = phi ptr [ null, %if.else ], [ %offseted0, %entry ]
+ %p0offseted = getelementptr i8, ptr %p0, i64 -4
+ %p1offseted = getelementptr i8, ptr %p1, i64 -4
+ %size0 = call i64 @llvm.objectsize.i64.p0(ptr %p0offseted, i1 false, i1 true, i1 false)
+ %size1 = call i64 @llvm.objectsize.i64.p0(ptr %p1offseted, i1 false, i1 true, i1 false)
+ %size = select i1 %cond, i64 %size0, i64 %size1
+ ret i64 %size
+}
If I apply this PR locally, although I no longer see the UBSan error, I still get a number of differences in __builtin_object_size between Clang and GCC when I change my test to print __builtin_object_size's result directly. I am not sure what the status of __builtin_object_size in Clang is, are we meant to return the same results as GCC?
If I apply this PR locally, although I no longer see the UBSan error, I still get a number of differences in
__builtin_object_sizebetween Clang and GCC when I change my test to print__builtin_object_size's result directly. I am not sure what the status of__builtin_object_sizein Clang is, are we meant to return the same results as GCC?
We're allowed to have a different behavior than GCC if we produce accurate result instead of -1/0. Otherwise that's probably an error. Would you mind sharing the differences?
We're allowed to have a different behavior than GCC if we produce accurate result instead of -1/0. Otherwise that's probably an error. Would you mind sharing the differences?
Sure. For the avoidance of confusion, in general this is not specifically about negative offsets, but here is an example where we produced a different result to GCC, we still produce a different result to GCC with your PR, but no longer the same result as before.
#include <stdio.h>
int x;
int main(void) {
int array1[4] = {0};
int array2[4] = {0};
int *ptr;
if (x) {
ptr = array1 + 2;
} else {
ptr = array2 + 3;
}
printf("%zu\n", __builtin_object_size(ptr - 1, 0));
}
GCC prints 16. Clang used to print 12. Clang with your PR prints 8.
I am not sure which result is correct, I initially thought it was GCC's, but my thinking was wrong so I do not know if my conclusion was right.
I've fixed my PR, it now respects the semantic as you hinted, and I had a small bug that I got rid of.
It now prints 12 in the example you just shared. I do think it's the correct result, you can replace the ptr - 1 by ptr in your example and see that gcc and clang agree on the result (8 on my laptop, which is indeed 12 - sizeof(int))
Yeah, I think you're right that 12 is the correct result in that test. Your updated PR looks good to me at a quick glance, thanks! I'll do some additional testing tomorrow.
The PR makes it so that ObjectSizeOffsetVisitor::visitPHINode can return different results for the same PHI node when called with a different ConstantOffset, but that seems like it would interfere with the caching done in ObjectSizeOffsetEvaluator::compute_. I do not have a test case that shows wrong behaviour because of that though.
Here is another test demonstrating the PHI handling is not quite right yet, this one not related to negative offsets:
#include <stdio.h>
int x;
int main(void) {
int array1[4] = {0};
int array2[4] = {0};
int *ptr;
if (x) {
ptr = array1;
} else {
ptr = array2;
}
printf("%zu\n", __builtin_object_size(ptr, 3));
}
Regardless of whether it's array1 or array2, 16 bytes are available, so that is what I would expect to be printed.
#include <stdio.h> int x; int main(void) { int array1[4] = {0}; int array2[4] = {0}; int *ptr; if (x) { ptr = array1; } else { ptr = array2; } printf("%zu\n", __builtin_object_size(ptr, 3)); }
I'll have a look to that caching issue.
For the other example, LLVM doesn't seem to support Type=3, see
https://github.com/llvm/llvm-project/blob/main/clang/lib/CodeGen/CGBuiltin.cpp#L1122
The PR makes it so that
ObjectSizeOffsetVisitor::visitPHINodecan return different results for the same PHI node when called with a differentConstantOffset, but that seems like it would interfere with the caching done inObjectSizeOffsetEvaluator::compute_. I do not have a test case that shows wrong behaviour because of that though.
The cache belongs to the ObjectSizeOffsetEvaluator while ConstantOffset belongs to ObjectSizeOffsetVisitor. I don't see any state shared between the two so I don't see the issue (but it was worth double checking, thanks for having me do that).
Thanks for pointing me to Type = 3 just not being generally supported, and I agree with your explanation of why the caching should be a non-issue.
Unfortunately, more testing reveals that since this only works when the constant offset is known, it does not yet handle all cases, there are cases where the constant offset is not visible. With a modification to my test, I still get a false UBSan positive.
int x;
int main(void) {
int array[4] = {0};
int *ptr;
if (x) {
ptr = 0;
} else {
ptr = array + 2;
}
if (x) {
ptr = 0;
} else {
ptr = ptr + 2;
}
return ptr[-1];
}
Thanks for this test case! Bug fixed and test case added to the test suite.
I'm still a bit wary: since this relies on the offset being known, since we get incorrect results if the offset isn't known, it suggests to me that if we ever end up in ObjectSizeOffsetEvaluator::visitGEPOperator for a non-constant offset, where that then calls compute_(GEP.getPointerOperand()) where compute_ first tries to get a result via ObjectSizeOffsetVisitor, that ObjectSizeOffsetVisitor could still return an incorrect result. I cannot come up with a test case where this actually fails though.
At the same time, this PR looks to me like it makes things strictly better: if there are cases that get mishandled, they would already get mishandled even in current LLVM. So even if we end up remaining unsure that this fix is complete, it may still be fine to merge.
I would appreciate it if someone more familiar with this code could also take a look, but if no one does in a reasonable time, I think merging this should be okay, thanks.
I'm still a bit wary: since this relies on the offset being known, since we get incorrect results if the offset isn't known
Actually if we report an offset as unknown, __builtin_object_size returns an error code (either 0 or -1 depending on its arguments) and the caller can handle that gracefully, so it's not an error.
Error occur when we report an incorrect size, which is what happened in your original code, so I think we're not regressing in any way.
Actually if we report an offset as unknown,
__builtin_object_sizereturns an error code (either 0 or -1 depending on its arguments) and the caller can handle that gracefully, so it's not an error.
That is true for __builtin_object_size, but ObjectSizeOffsetEvaluator::visitGEPOperator is not used for that, it is used for __builtin_dynamic_object_size (as well as some other -fsanitize options) which does handle unknown offsets. That is the one where I worry there may still be cases not caught.
And now I do have a testcase:
#include <stdio.h>
int x, i = -1;
int main() {
int array1[4];
int array2[8];
int *ptr;
if (x) {
ptr = array1;
} else {
ptr = array2 + 4;
}
printf("%zu\n", __builtin_dynamic_object_size(ptr + i, 0));
}
This prints 0, but must print 20 (5 * sizeof(int)) or higher. It is the same problem: the PHI for ptr has incoming values that resolve to (size 16, offset 0) and (size 32, offset 16), and since both specify 16 remaining bytes, they are resolved to the former. And then, __builtin_dynamic_object_size concludes that i = -1 is out of range as an index and therefore 0 bytes are available, but that conclusion is wrong.
% cat /tmp/a.c
#include <stdio.h>
int x, i = -1;
int main() {
int array1[4];
int array2[8];
int *ptr;
if (x) {
ptr = array1;
} else {
ptr = array2 + 4;
}
printf("%zd\n", __builtin_dynamic_object_size(ptr + i, 0));
}
% ./bin/clang /tmp/a.c -o /tmp/a && /tmp/a
-1
which is totally fine (I've just changed the printf call to output signed value instead of unsigned in your example)
The return value is unsigned and printing it as signed makes it trickier to explain what values are correct: instead of 20 or greater, it's 20 or greater, or negative, where the reason negative numbers are okay is because they aren't actually negative. But anyway, that is not the reason you are getting a different output, it is because you have not enabled any optimisations. The previous test cases have been with optimisations enabled, this one is too. At any optimisation level (-Og, -O1, -O2, -O3, -Os, -Oz) it doesn't print -1, it prints 0. -1 is a valid output, 0 is not.
Thanks! Indeed I can reproduce (and the problem seems to predate my change!) and I also have a patch. I'll test it and add it to this patchset.
Sorry, but that looks like a very wrong approach to me. The code you're modifying in your last commit was already correct, and this new version means we no longer detect accesses before the start of an object as out of bounds.
#include <stdio.h>
int i = -1;
int main() {
int array[4];
printf("%zu\n", __builtin_dynamic_object_size(array + i, 0));
}
Here, we used to print 0, and 0 is the best possible value to print. Your update makes it so that we print 20.
When you've got an index that you know is before the start of an object, you know it's invalid to access any bytes, that's what the code there was correctly handling. The problem isn't that check, it's that the PHI handling wrongly concludes that we know we're at the start of an object when we're not, and that is the bit that needs to be fixed.
I now have a better understanding, and indeed you're right. It's actually exactly the same situation as the one in non-dynamic case, but because the offset is negative and dynamic, the choice is incorrect. Which makes me think that in both cases, if we have the choice between two object with the same size + offset value, then choosing the one with the larger offset solves the issue. I'll check that.
Which makes me think that in both cases, if we have the choice between two object with the same size + offset value, then choosing the one with the larger offset solves the issue. I'll check that.
That works for this test, though for __builtin_dynamic_object_size(ptr + index, 2) it might cause a too large value to be returned.
I think we need to, at least in the __builtin_dynamic_object_size case, use ObjectSizeOpts::Mode::ExactUnderlyingSizeAndOffset in which case we can return that we do not statically know the size of the object. That way, we end up in ObjectSizeOffsetEvaluator::visitPHINode and can handle each incoming value separately and return the correct value.
Just pushed a variant that makes a more educated pick between the two combined offset in case of equality. It seems to me it passes all the test you've been crafting so far ^^!
The problem is not limited to the case where remaining object sizes are equal though, it's just that those are easiest to come up with test cases for. Consider:
#include <stdio.h>
int x;
int main(void) {
int array1[4];
int array2[8];
int *ptr;
if (x) {
printf("hello\n");
ptr = array1 + 1;
} else {
ptr = array2 + 6;
}
printf("%zu\n", __builtin_object_size(ptr - 2, 0));
}
This must print 16 or higher, but prints 0 (when optimisations are enabled). During InstCombine __builtin_object_size cannot be resolved yet, so instead it is saved for LowerConstantIntrinsics. During LowerConstantIntrinsics, it evaluates with ObjectSizeOpts::Mode::Max, sees incoming PHI values {16, 4} and {32, 24}. The former says 12 bytes are available, the latter 8, so it picks the former. Then it applies the negative offset of -8 bytes to it, and because it thinks it knows it's 4 bytes past the start of an object, it wrongly concludes the pointer arithmetic produces an out of bounds value.
The problem is not limited to the case where remaining object sizes are equal though, it's just that those are easiest to come up with test cases for. Consider:
#include <stdio.h> int x; int main(void) { int array1[4]; int array2[8]; int *ptr; if (x) { printf("hello\n"); ptr = array1 + 1; } else { ptr = array2 + 6; } printf("%zu\n", __builtin_object_size(ptr - 2, 0)); }This must print 16 or higher, but prints 0 (when optimisations are enabled). During InstCombine
__builtin_object_sizecannot be resolved yet, so instead it is saved for LowerConstantIntrinsics. During LowerConstantIntrinsics, it evaluates withObjectSizeOpts::Mode::Max, sees incoming PHI values {16, 4} and {32, 24}. The former says 12 bytes are available, the latter 8, so it picks the former. Then it applies the negative offset of -8 bytes to it, and because it thinks it knows it's 4 bytes past the start of an object, it wrongly concludes the pointer arithmetic produces an out of bounds value.
Indeed. Looks like my previous approach which propagates the constant (negative) offset would have worked in that situation. But it couldn't work with a dynamic negative offset! Looks like we should be able to pick different bounds based on the offset sign, I'll explore that path.
In any case, thanks a lot for your patience regarding this issue. :bow:
Indeed. Looks like my previous approach which propagates the constant (negative) offset would have worked in that situation. But it couldn't work with a dynamic negative offset!
You're right, that would handle that one. Can I suggest an alternative though?
Right now, we act on SizeOffsetAPInt objects which contain a size (optional) and offset (optional), and the problem with the PHI handling is that we intentionally risk returning an incorrect offset and an incorrect size, on the assumption that we know how the result will be used and the two errors will cancel out. It turns out they do not cancel out in all cases because we do not have a full understanding of how they are used.
What if, instead, we track three APInts? What if we track TotalSize, Offset, RemainingSize?
For my original test, we have PHIs where the incoming values resolve to SizeOffsetAPInts of {0, 0} and {16, 16}. With my idea, they would instead resolve to {0, 0, 0} and {16, 16, 0}. This can then be correctly combined into {unknown, unknown, 0}, which gives us enough information to say that if a subsequently applied offset is negative, we cannot draw any conclusions. At the same time, this fully preserves the RemainingSize to allow the currently intended optimisations, which assume positive offsets, to continue to be performed: for any positive offset, only RemainingSize needs to be looked at.
It does not quite handle all the cases that your ConstantOffset approach handles, but it would, I believe, be easier to understand and reason to be correct, it would be easier to assure ourselves we are not missing any cases.
@hvdijk I went from your idea, but using only two values: one to track the amount of memory available before current point, and one to track the amount of memory available after current point. It is indeed easy to reason about, as in your proposal, and behaves in a very elegant manner wrt. the combine operation. And it passes all our examples \o/ Please let me know what you think of this.
Nice work! Using a new class rather than extending SizeOffsetAPInt is probably good, it helps maintain compatibility, which is a plus if it means we can also backport this to LLVM 19 to fix the wrong UBSAN errors there.
The change to llvm/test/Transforms/InstCombine/builtin-object-size-offset.ll should not be needed. With -passes=instcombine, it is supposed to check for the size to be known exactly. Here, the size was previously known exactly and it is a regression that we no longer pick that up. The reason this is failing is because in ObjectSizeOffsetVisitor::compute, you have if (!Span.bothKnown()) return {};, but if EvalMode != ObjectSizeOpts::Mode::ExactUnderlyingSizeAndOffset, that is not the best way of handling it. For ExactSizeFromOffset specifically, the idea is that Before does not matter to the caller, you can construct the result from After alone. By requiring both Before and After to be known, you are effectively making it so that ExactSizeFromOffset and ExactUnderlyingSizeAndOffset do the same thing, which they should not.
However, if ExactSizeFromOffset is restored to handle that case, it then becomes important to ensure that nothing calls ObjectSizeOffsetVisitor::compute with Options.EvalMode = ObjectSizeOpts::Mode::ExactSizeFromOffset if actually it does care about the offset. Yet ObjectSizeOffsetEvaluator::compute_ does exactly that. Fixing that is possible by making sure ObjectSizeOffsetVisitor has some public compute version that returns OffsetSpan, and making ObjectSizeOffsetEvaluator::compute_ use that as well. Alternatively, it may be possible to get ObjectSizeOffsetEvaluator::compute_ to use a different EvalMode, depending on how exactly ObjectSizeOffsetVisitor::compute gets changed.
P.S.: The reason why I suggested keeping TotalSize is that this case can give an upper bound when we know we are indexing into a specific object, but do not know the index (e.g. __builtin_object_size(array + i, 0) where i is only known at runtime can still be bound by sizeof array). I thought we already handled that, but we don't, so it is fine that your PR leaves that unhandled.
I will do more extensive testing later.
patch updated with your suggestion to adjust the 'before' field when it actually doesn't matter. Works like a charm!
I did write in my previous comment:
However, if
ExactSizeFromOffsetis restored to handle that case, it then becomes important to ensure that nothing callsObjectSizeOffsetVisitor::computewithOptions.EvalMode = ObjectSizeOpts::Mode::ExactSizeFromOffsetif actually it does care about the offset. YetObjectSizeOffsetEvaluator::compute_does exactly that.
The result is that this latest iteration does not handle my test case from https://github.com/llvm/llvm-project/pull/111827#issuecomment-2419260982: that now prints 0 again, but should not.
The possible approach I mentioned in my previous comment to use a different EvalMode is simple to do:
--- a/llvm/lib/Analysis/MemoryBuiltins.cpp
+++ b/llvm/lib/Analysis/MemoryBuiltins.cpp
@@ -1083,7 +1083,9 @@ SizeOffsetValue ObjectSizeOffsetEvaluator::compute(Value *V) {
}
SizeOffsetValue ObjectSizeOffsetEvaluator::compute_(Value *V) {
- ObjectSizeOffsetVisitor Visitor(DL, TLI, Context, EvalOpts);
+ ObjectSizeOpts VisitorEvalOpts(EvalOpts);
+ VisitorEvalOpts.EvalMode = ObjectSizeOpts::Mode::ExactUnderlyingSizeAndOffset;
+ ObjectSizeOffsetVisitor Visitor(DL, TLI, Context, VisitorEvalOpts);
SizeOffsetAPInt Const = Visitor.compute(V);
if (Const.bothKnown())
return SizeOffsetValue(ConstantInt::get(Context, Const.Size),
This does also need builtin-object-size-phi.ll to be updated, but I think the effect that it has on that is correct, I think that adequately tests that the bug is fixed.
I've applied your patch with a few extra comments in the diff and in the commit.