llvm-project icon indicating copy to clipboard operation
llvm-project copied to clipboard

[AMDGPU] In instruction selector, allow copy from physical reg to s1

Open jwanggit86 opened this issue 1 year ago • 12 comments

In planned calling convention update, i1 arguments/returns are assigned to SGPRs without being promoted to i32. We need to update the instruction selector to allow copy from physical reg to s1 destination.

jwanggit86 avatar Jun 20 '24 10:06 jwanggit86

@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-backend-amdgpu

Author: Jun Wang (jwanggit86)

Changes

In planned calling convention update, i1 arguments/returns are assigned to SGPRs without being promoted to i32. We need to update the instruction selector to allow copy from physical reg to s1 destination.


Full diff: https://github.com/llvm/llvm-project/pull/96157.diff

1 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp (+10)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp b/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
index 03e2d622dd319..66b5beb3243be 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
@@ -131,6 +131,16 @@ bool AMDGPUInstructionSelector::selectCOPY(MachineInstr &I) const {
   Register SrcReg = Src.getReg();
 
   if (isVCC(DstReg, *MRI)) {
+    // In planned update of calling convention, i1 arguments/returns are
+    // assigned to SGPRs without promoting to i32. The following if statement
+    // allows insturctions such as "%0:sreg_64_xexec(s1) = COPY $sgpr4_sgpr5"
+    // to be accepted.
+    if (SrcReg.isPhysical() && SrcReg != AMDGPU::SCC) {
+      const TargetRegisterClass *DstRC = MRI->getRegClassOrNull(DstReg);
+      if (DstRC)
+        return DstRC->contains(SrcReg);
+    }
+
     if (SrcReg == AMDGPU::SCC) {
       const TargetRegisterClass *RC
         = TRI.getConstrainedRegClassForOperand(Dst, *MRI);

llvmbot avatar Jun 20 '24 10:06 llvmbot

isVCC needs to gain a new parameter for the partnered register in the copy. There shouldn't be any need to fixup the result after, or check register class membership

isVCC is used in a number of other places, e.g., inside selectG_UADDO_USUBO_UADDE_USUBE(). In those other contexts apparently they don't consider "the other" reg. I don't know if there even is another reg involved. So, would it be better to create a new function just for this particular case, ie., copy from phy reg to s1?

jwanggit86 avatar Oct 08 '24 21:10 jwanggit86

isVCC is used in a number of other places, e.g., inside selectG_UADDO_USUBO_UADDE_USUBE(). In those other contexts apparently they don't consider "the other" reg. I don't know if there even is another reg involved. So, would it be better to create a new function just for this particular case, ie., copy from phy reg to s1?

You could either add another variant for copies, or add an optional parameter for the paired register. The other uses do not need to consider physical registers

arsenm avatar Oct 10 '24 20:10 arsenm

In the new commit I created a separate function just to handle copy from physical reg to VCC. However, I'm not sure this is an improvement over the previous version. Modifying the existing isVCC() would be more messy. One particular problem is that the check could have 3 different results: (1) it's not a copy to VCC we are looking for (2) it's a valid phy to VCC copy (3) it's not a valid phy to VCC copy.

If the main concern about the previous version is that getRegClassOrNull() is called after the reg class is already obtained in isVCC(), then maybe we can add an additional parameter to isVCC() to return the reg class.

jwanggit86 avatar Oct 11 '24 22:10 jwanggit86

In the new commit I created a separate function just to handle copy from physical reg to VCC. However, I'm not sure this is an improvement over the previous version. Modifying the existing isVCC() would be more messy.

There can be 2 variants of isVCC.

One particular problem is that the check could have 3 different results: (1) it's not a copy to VCC we are looking for (2) it's a valid phy to VCC copy (3) it's not a valid phy to VCC copy.

The results are is a boolean / VCC / wave mask copy, or is not.

If the main concern about the previous version is that getRegClassOrNull() is called after the reg class is already obtained in isVCC(), then maybe we can add an additional parameter to isVCC() to return the reg class.

You've added a completely new path, and left the old handling. The new version should rely on fixed isVCC checks using the type hint. You should not have to query whether a register belongs to a register class, only query what that register's class is.

arsenm avatar Oct 14 '24 16:10 arsenm

You should not have to query whether a register belongs to a register class, only query what that register's class is.

For the problem we are dealing with here, ie. copy from a phys reg to a VCC reg, if you don't query the membership of the phys reg, how do you allow %0:vcc(s1) = COPY $sgpr0 and disallow %0:vcc(s1) = COPY $sgpr0_sgpr1, or vice versa?

jwanggit86 avatar Oct 21 '24 21:10 jwanggit86

@arsenm ping.

jwanggit86 avatar Oct 30 '24 17:10 jwanggit86

For the problem we are dealing with here, ie. copy from a phys reg to a VCC reg, if you don't query the membership of the phys reg, how do you allow %0:vcc(s1) = COPY $sgpr0 and disallow %0:vcc(s1) = COPY $sgpr0_sgpr1, or vice versa?

This depends on the wavesize. %0:vcc(s1) = COPY $sgpr0 is only valid for wave32, and %0:vcc(s1) = COPY $sgpr0_sgpr1 for wave64. We should reject in the machine verifier cases with the vcc bank and the wrong wavesize physical register

arsenm avatar Oct 30 '24 21:10 arsenm

@arsenm ping.

jwanggit86 avatar Dec 06 '24 02:12 jwanggit86

@arsenm ping.

jwanggit86 avatar Feb 21 '25 19:02 jwanggit86

@arsenm @shiltian @Pierre-vh ping.

jwanggit86 avatar Mar 25 '25 17:03 jwanggit86

Can we get the whole patch that actually deals with i1 argument? Need to check how uniformity analysis will calculate such copies.

petar-avramovic avatar Jun 30 '25 10:06 petar-avramovic