[SimplifyCFG] Transform switch to select when common bits uniquely identify one case
My attempt to fix #141753 .
This patch introduces a new check, that tries to decide if the conjunction of all the values uniquely identify the accepted values by the switch.
I am also open to different solution. If you think this approach is not the best and have a different idea I would also be happy to implement that.
I was also thinking about addressing this pattern in inst combine. There was a previous commit, that adresses a very similar pattern to this https://github.com/llvm/llvm-project/commit/902acde34198bb11cc758dcf3aee00eb1cb09ceb, but I think it would be better if we wouldn't even get to the point of emitting this pattern.
@llvm/pr-subscribers-llvm-transforms
Author: Gábor Spaits (spaits)
Changes
My attempt to fix #141753 .
This patch introduces a new check, that tries to decide if the conjunction of all the values uniquely identify the accepted values by the switch.
I am also open to different solution. If you think this approach is not the best and have a different idea I would also be happy to implement that.
I was also thinking about addressing this pattern in inst combine. There was a previous commit, that adresses a very similar pattern to this https://github.com/llvm/llvm-project/commit/902acde34198bb11cc758dcf3aee00eb1cb09ceb, but I think it would be better if we wouldn't even get to the point of emitting this pattern.
Full diff: https://github.com/llvm/llvm-project/pull/145233.diff
2 Files Affected:
- (modified) llvm/lib/Transforms/Utils/SimplifyCFG.cpp (+29-2)
- (modified) llvm/test/Transforms/SimplifyCFG/switch-to-select-two-case.ll (+88)
diff --git a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
index e221022bb8361..24d43b63dea41 100644
--- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
@@ -6290,10 +6290,37 @@ static Value *foldSwitchToSelect(const SwitchCaseResultVectorTy &ResultVector,
// case 0,2,8,10 -> Cond & 0b1..0101 == 0 ? result : default
if (isPowerOf2_32(CaseCount)) {
ConstantInt *MinCaseVal = CaseValues[0];
- // Find mininal value.
- for (auto *Case : CaseValues)
+ // In case, there are bits, that can only be present in the CaseValues we
+ // can transform the switch into a select if the conjunction of
+ // all the values uniquely identify the CaseValues.
+ APInt AndMask = APInt::getAllOnes(MinCaseVal->getBitWidth());
+
+ for (auto *Case : CaseValues) {
if (Case->getValue().slt(MinCaseVal->getValue()))
MinCaseVal = Case;
+ AndMask &= Case->getValue();
+ }
+
+ ConstantRange CR = computeConstantRange(
+ Condition, /* ForSigned */ Condition->getType()->isSingleValueType());
+ unsigned int ConditionWidth = Condition->getType()->getIntegerBitWidth();
+ APInt ActiveBits = APInt(ConditionWidth, CR.getActiveBits(),
+ Condition->getType()->isSingleValueType());
+
+ APInt One(ConditionWidth, 1, false);
+ APInt ActiveBitsMask = (One << ActiveBits) - 1;
+
+ // To make sure, that the representation of the accepted values is
+ // actually unique we check, wheter the conjucted bits and the another
+ // conjuction with the input value will only be true for exactly CaseCount
+ // number times.
+ if ((One << ActiveBits) - (One << (ActiveBits - AndMask.popcount())) ==
+ CaseCount) {
+ Value *And = Builder.CreateAnd(Condition, AndMask);
+ Value *Cmp =
+ Builder.CreateICmpNE(And, Constant::getNullValue(And->getType()));
+ return Builder.CreateSelect(Cmp, ResultVector[0].first, DefaultResult);
+ }
// Mark the bits case number touched.
APInt BitMask = APInt::getZero(MinCaseVal->getBitWidth());
diff --git a/llvm/test/Transforms/SimplifyCFG/switch-to-select-two-case.ll b/llvm/test/Transforms/SimplifyCFG/switch-to-select-two-case.ll
index 50998e447b71d..638bda990bc87 100644
--- a/llvm/test/Transforms/SimplifyCFG/switch-to-select-two-case.ll
+++ b/llvm/test/Transforms/SimplifyCFG/switch-to-select-two-case.ll
@@ -309,3 +309,91 @@ end:
%t0 = phi i8 [ 42, %case1 ], [ 42, %case2 ], [ 44, %case3 ], [ 44, %case4 ]
ret i8 %t0
}
+
+define i1 @range0to4odd(i8 range(i8 0, 4) %f) {
+; CHECK-LABEL: @range0to4odd(
+; CHECK-NEXT: bb3:
+; CHECK-NEXT: [[TMP0:%.*]] = and i8 [[F:%.*]], 1
+; CHECK-NEXT: [[TMP1:%.*]] = icmp ne i8 [[TMP0]], 0
+; CHECK-NEXT: [[TMP2:%.*]] = select i1 [[TMP1]], i1 true, i1 false
+; CHECK-NEXT: ret i1 [[TMP2]]
+;
+ switch i8 %f, label %bb1 [
+ i8 1, label %bb2
+ i8 3, label %bb2
+ ]
+bb1:
+ br label %bb3
+bb2:
+ br label %bb3
+bb3:
+ %_0.sroa.0.0 = phi i1 [ false, %bb1 ], [ true, %bb2 ]
+ ret i1 %_0.sroa.0.0
+}
+
+define i1 @range1to4odd(i8 range(i8 1, 4) %f) {
+; CHECK-LABEL: @range1to4odd(
+; CHECK-NEXT: bb3:
+; CHECK-NEXT: [[TMP0:%.*]] = and i8 [[F:%.*]], 1
+; CHECK-NEXT: [[TMP1:%.*]] = icmp ne i8 [[TMP0]], 0
+; CHECK-NEXT: [[TMP2:%.*]] = select i1 [[TMP1]], i1 true, i1 false
+; CHECK-NEXT: ret i1 [[TMP2]]
+;
+ switch i8 %f, label %bb1 [
+ i8 1, label %bb2
+ i8 3, label %bb2
+ ]
+bb1:
+ br label %bb3
+bb2:
+ br label %bb3
+bb3:
+ %_0.sroa.0.0 = phi i1 [ false, %bb1 ], [ true, %bb2 ]
+ ret i1 %_0.sroa.0.0
+}
+
+define i1 @range0to8odd(i8 range(i8 0, 8) %f) {
+; CHECK-LABEL: @range0to8odd(
+; CHECK-NEXT: bb3:
+; CHECK-NEXT: [[TMP0:%.*]] = and i8 [[F:%.*]], 1
+; CHECK-NEXT: [[TMP1:%.*]] = icmp ne i8 [[TMP0]], 0
+; CHECK-NEXT: [[TMP2:%.*]] = select i1 [[TMP1]], i1 true, i1 false
+; CHECK-NEXT: ret i1 [[TMP2]]
+;
+ switch i8 %f, label %bb1 [
+ i8 1, label %bb2
+ i8 3, label %bb2
+ i8 5, label %bb2
+ i8 7, label %bb2
+ ]
+bb1:
+ br label %bb3
+bb2:
+ br label %bb3
+bb3:
+ %_0.sroa.0.0 = phi i1 [ false, %bb1 ], [ true, %bb2 ]
+ ret i1 %_0.sroa.0.0
+}
+
+define i1 @negative_range0to5even(i8 range(i8 0, 5) %f) {
+; CHECK-LABEL: @negative_range0to5even(
+; CHECK-NEXT: bb3:
+; CHECK-NEXT: [[TMP0:%.*]] = sub i8 [[F:%.*]], 2
+; CHECK-NEXT: [[SWITCH_AND:%.*]] = and i8 [[TMP0]], -3
+; CHECK-NEXT: [[SWITCH_SELECTCMP:%.*]] = icmp eq i8 [[SWITCH_AND]], 0
+; CHECK-NEXT: [[TMP1:%.*]] = select i1 [[SWITCH_SELECTCMP]], i1 true, i1 false
+; CHECK-NEXT: ret i1 [[TMP1]]
+;
+ switch i8 %f, label %bb1 [
+ i8 2, label %bb2
+ i8 4, label %bb2
+ ]
+bb1:
+ br label %bb3
+bb2:
+ br label %bb3
+bb3:
+ %_0.sroa.0.0 = phi i1 [ false, %bb1 ], [ true, %bb2 ]
+ ret i1 %_0.sroa.0.0
+}
+
Thank you @antoniofrighetto @dtcxzyw . I think I have addressed everything. I have replaced computeConstantRange with computeKnownBits and removed the unused ActiveBitsMask.
Sqashed the commits and re-run the CI (Some lldb tests have randomly failed due to possible env issues).
Gentle ping @antoniofrighetto @dtcxzyw @nikic . Sorry for pinging you. Could you please provide some input? Do you think I am addressing the issue in the correct way? (Is it appropiate to solve it in SimplifyCFG or a later InstCombine would be better?)
@dtcxzyw Thank you very much for reviewing my PR.
- I have rebased the PR to main
- ActiveBits is now unsigned
- Replaced the old
icmp ne (and %1 AndMask) 0withcmp eq and(%1, AndMask) AndMask. You have pointed out, that the old method could only identifyselects where there was one single bit common in every pattern. The new method can use multiple bits to identify the pattern. A good example of such a case, where the new method works but the old doesn't is testrange0to15_middle_two_bits. I have also added some negative tests too.
Thank you very much for reviewing and pointing out the inefficiency in my original logic. I would like to ask you to please re-review the PR.
@antoniofrighetto Thank you very much for reviewing my PR.
Thank you very much for your review @dtcxzyw . I will merge the PR.
It sounds interesting what you wrote about further generalization. If you have some concrete idea what you would like to see I would be more than happy to help implementing them.
I also have some further ideas for this part of the code:
For example, my current pattern would also work, when there are multiple possible results from the swithc (ResultVector.size() > 1 cases).
Each result, where all the cases "guarding" that result could be replaced to the current pattern. Even partial replacements would be possible. (I am not exactly sure if it would be profitable with switches with many possible values. It would need further discussion).