llvm-project icon indicating copy to clipboard operation
llvm-project copied to clipboard

[AArch64] Use isKnownNonZero to optimize eligible compares to cmn

Open AZero13 opened this issue 1 year ago • 4 comments

Turning a cmp into cmn saves an extra mov and negate instruction, so take that into account when choosing when to flip the compare operands.

Also do not consider right-hand operands whose absolute value can be encoded into a cmn.

AZero13 avatar Jun 21 '24 19:06 AZero13

@llvm/pr-subscribers-backend-aarch64

Author: AtariDreams (AtariDreams)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/96349.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+8-6)
  • (modified) llvm/test/CodeGen/AArch64/cmp-chains.ll (+31)
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 0c834acacdca9..6b6cfe7e82a1e 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -3372,9 +3372,10 @@ static bool isLegalArithImmed(uint64_t C) {
 // So, finally, the only LLVM-native comparisons that don't mention C and V
 // are SETEQ and SETNE. They're the only ones we can safely use CMN for in
 // the absence of information about op2.
-static bool isCMN(SDValue Op, ISD::CondCode CC) {
+static bool isCMN(SDValue Op, ISD::CondCode CC, SelectionDAG &DAG) {
   return Op.getOpcode() == ISD::SUB && isNullConstant(Op.getOperand(0)) &&
-         (CC == ISD::SETEQ || CC == ISD::SETNE);
+         (CC == ISD::SETEQ || CC == ISD::SETNE ||
+          DAG.isKnownNeverZero(Op.getOperand(1)));
 }
 
 static SDValue emitStrictFPComparison(SDValue LHS, SDValue RHS, const SDLoc &dl,
@@ -3419,11 +3420,11 @@ static SDValue emitComparison(SDValue LHS, SDValue RHS, ISD::CondCode CC,
   // register to WZR/XZR if it ends up being unused.
   unsigned Opcode = AArch64ISD::SUBS;
 
-  if (isCMN(RHS, CC)) {
+  if (isCMN(RHS, CC, DAG)) {
     // Can we combine a (CMP op1, (sub 0, op2) into a CMN instruction ?
     Opcode = AArch64ISD::ADDS;
     RHS = RHS.getOperand(1);
-  } else if (isCMN(LHS, CC)) {
+  } else if (isCMN(LHS, CC, DAG)) {
     // As we are looking for EQ/NE compares, the operands can be commuted ; can
     // we combine a (CMP (sub 0, op1), op2) into a CMN instruction ?
     Opcode = AArch64ISD::ADDS;
@@ -3527,7 +3528,8 @@ static SDValue emitConditionalComparison(SDValue LHS, SDValue RHS,
     }
   } else if (RHS.getOpcode() == ISD::SUB) {
     SDValue SubOp0 = RHS.getOperand(0);
-    if (isNullConstant(SubOp0) && (CC == ISD::SETEQ || CC == ISD::SETNE)) {
+    if (isNullConstant(SubOp0) && (CC == ISD::SETEQ || CC == ISD::SETNE ||
+                                   DAG.isKnownNeverZero(RHS.getOperand(1)))) {
       // See emitComparison() on why we can only do this for SETEQ and SETNE.
       Opcode = AArch64ISD::CCMN;
       RHS = RHS.getOperand(1);
@@ -3848,7 +3850,7 @@ static SDValue getAArch64Cmp(SDValue LHS, SDValue RHS, ISD::CondCode CC,
   // can be turned into:
   //    cmp     w12, w11, lsl #1
   if (!isa<ConstantSDNode>(RHS) || !isLegalArithImmed(RHS->getAsZExtVal())) {
-    SDValue TheLHS = isCMN(LHS, CC) ? LHS.getOperand(1) : LHS;
+    SDValue TheLHS = isCMN(LHS, CC, DAG) ? LHS.getOperand(1) : LHS;
 
     if (getCmpOperandFoldingProfit(TheLHS) > getCmpOperandFoldingProfit(RHS)) {
       std::swap(LHS, RHS);
diff --git a/llvm/test/CodeGen/AArch64/cmp-chains.ll b/llvm/test/CodeGen/AArch64/cmp-chains.ll
index 14cb0c82b1c03..0713fdb7283a9 100644
--- a/llvm/test/CodeGen/AArch64/cmp-chains.ll
+++ b/llvm/test/CodeGen/AArch64/cmp-chains.ll
@@ -258,3 +258,34 @@ define i32 @neg_range_int(i32 %a, i32 %b, i32 %c) {
   ret i32 %retval.0
 }
 
+; (b > -3 || a < -(c | 1))
+define i32 @neg_range_int_cmn(i32 %a, i32 %b, i32 %c) {
+; SDISEL-LABEL: neg_range_int_cmn:
+; SDISEL:       // %bb.0:
+; SDISEL-NEXT:    orr w8, w2, #0x1
+; SDISEL-NEXT:    cmn w8, w0
+; SDISEL-NEXT:    ccmn w1, #3, #0, le
+; SDISEL-NEXT:    csel w0, w1, w0, gt
+; SDISEL-NEXT:    ret
+;
+; GISEL-LABEL: neg_range_int_cmn:
+; GISEL:       // %bb.0:
+; GISEL-NEXT:    orr w8, w2, #0x1
+; GISEL-NEXT:    cmn w1, #3
+; GISEL-NEXT:    neg w8, w8
+; GISEL-NEXT:    cset w9, gt
+; GISEL-NEXT:    cmp w8, w0
+; GISEL-NEXT:    cset w8, gt
+; GISEL-NEXT:    orr w8, w9, w8
+; GISEL-NEXT:    and w8, w8, #0x1
+; GISEL-NEXT:    tst w8, #0x1
+; GISEL-NEXT:    csel w0, w1, w0, ne
+; GISEL-NEXT:    ret
+  %or = or i32 %c, 1
+  %sub = sub nsw i32 0, %or
+  %cmp = icmp sgt i32 %b, -3
+  %cmp1 = icmp sgt i32 %sub, %a
+  %1 = select i1 %cmp, i1 true, i1 %cmp1
+  %ret = select i1 %1, i32 %b, i32 %a
+  ret i32 %ret
+}

llvmbot avatar Jun 21 '24 19:06 llvmbot

@arsenm What do you think of this?

AZero13 avatar Jun 23 '24 23:06 AZero13

@dtcxzyw does this have an adverse effect on compile time? I didn't find any when I profiled LLVM.

AZero13 avatar Jun 23 '24 23:06 AZero13

:white_check_mark: With the latest revision this PR passed the C/C++ code formatter.

github-actions[bot] avatar Jun 29 '24 14:06 github-actions[bot]

@arsenm @davemgreen Thoughts?

AZero13 avatar Jul 06 '24 22:07 AZero13

@jonathandavies-arm What are your thoughts on this?

AZero13 avatar Jul 16 '24 22:07 AZero13

@davemgreen This should be ready now!

AZero13 avatar Jul 19 '24 00:07 AZero13

I think this looks OK now, and the test all look OK. Do you mind if we wait until after the release branch to commit, in case there is something still wrong?

It would be good to have test cases for lots of various cases too, even if they don't trigger - different signed conditions, cmn on the LHS/RHS, ccmn on the LHS/RHS etc. Thanks

If something goes wrong, we can revert it. However, I've tested and tested this and found no issue.

AZero13 avatar Jul 19 '24 06:07 AZero13

I'm working on an expansion to this but that will sadly have to wait until after the release branch, but this is the part that has been thoroughly tested

AZero13 avatar Jul 19 '24 06:07 AZero13

If something goes wrong, we can revert it. However, I've tested and tested this and found no issue.

This isn't really the point of adding tests.

arsenm avatar Jul 19 '24 07:07 arsenm

OK - I was just being careful as we are so close to the release and some of the issues can be subtle. I gave it a more thorough test and everything seems OK.

By "tested this and found no issue", does that include running a bootstrap and the llvm-test-suite? If so and you add a some extra test coverage then this LGTM.

davemgreen avatar Jul 19 '24 11:07 davemgreen

OK - I was just being careful as we are so close to the release and some of the issues can be subtle. I gave it a more thorough test and everything seems OK.

By "tested this and found no issue", does that include running a bootstrap and the llvm-test-suite? If so and you add a some extra test coverage then this LGTM.

Yes I ran the entire test suite.

AZero13 avatar Jul 19 '24 13:07 AZero13

OK. If that has succeeded and a bootstrap also does OK, then if you can add some more tests for various ccmn cases we can get this in.

davemgreen avatar Jul 20 '24 12:07 davemgreen

@davemgreen I have reverted this back because otherwise I'm gonna have to do another bootstrap test, and I think that change can be accepted after this one instead of part of this PR.

So this should be good to go!

AZero13 avatar Jul 22 '24 01:07 AZero13