[AMDGPU] In instruction selector, allow copy from physical reg to s1
In planned calling convention update, i1 arguments/returns are assigned to SGPRs without being promoted to i32. We need to update the instruction selector to allow copy from physical reg to s1 destination.
@llvm/pr-subscribers-llvm-globalisel
@llvm/pr-subscribers-backend-amdgpu
Author: Jun Wang (jwanggit86)
Changes
In planned calling convention update, i1 arguments/returns are assigned to SGPRs without being promoted to i32. We need to update the instruction selector to allow copy from physical reg to s1 destination.
Full diff: https://github.com/llvm/llvm-project/pull/96157.diff
1 Files Affected:
- (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp (+10)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp b/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
index 03e2d622dd319..66b5beb3243be 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
@@ -131,6 +131,16 @@ bool AMDGPUInstructionSelector::selectCOPY(MachineInstr &I) const {
Register SrcReg = Src.getReg();
if (isVCC(DstReg, *MRI)) {
+ // In planned update of calling convention, i1 arguments/returns are
+ // assigned to SGPRs without promoting to i32. The following if statement
+ // allows insturctions such as "%0:sreg_64_xexec(s1) = COPY $sgpr4_sgpr5"
+ // to be accepted.
+ if (SrcReg.isPhysical() && SrcReg != AMDGPU::SCC) {
+ const TargetRegisterClass *DstRC = MRI->getRegClassOrNull(DstReg);
+ if (DstRC)
+ return DstRC->contains(SrcReg);
+ }
+
if (SrcReg == AMDGPU::SCC) {
const TargetRegisterClass *RC
= TRI.getConstrainedRegClassForOperand(Dst, *MRI);
isVCC needs to gain a new parameter for the partnered register in the copy. There shouldn't be any need to fixup the result after, or check register class membership
isVCC is used in a number of other places, e.g., inside selectG_UADDO_USUBO_UADDE_USUBE(). In those other contexts apparently they don't consider "the other" reg. I don't know if there even is another reg involved. So, would it be better to create a new function just for this particular case, ie., copy from phy reg to s1?
isVCCis used in a number of other places, e.g., insideselectG_UADDO_USUBO_UADDE_USUBE(). In those other contexts apparently they don't consider "the other" reg. I don't know if there even is another reg involved. So, would it be better to create a new function just for this particular case, ie., copy from phy reg to s1?
You could either add another variant for copies, or add an optional parameter for the paired register. The other uses do not need to consider physical registers
In the new commit I created a separate function just to handle copy from physical reg to VCC. However, I'm not sure this is an improvement over the previous version. Modifying the existing isVCC() would be more messy. One particular problem is that the check could have 3 different results: (1) it's not a copy to VCC we are looking for (2) it's a valid phy to VCC copy (3) it's not a valid phy to VCC copy.
If the main concern about the previous version is that getRegClassOrNull() is called after the reg class is already obtained in isVCC(), then maybe we can add an additional parameter to isVCC() to return the reg class.
In the new commit I created a separate function just to handle copy from physical reg to VCC. However, I'm not sure this is an improvement over the previous version. Modifying the existing
isVCC()would be more messy.
There can be 2 variants of isVCC.
One particular problem is that the check could have 3 different results: (1) it's not a copy to VCC we are looking for (2) it's a valid phy to VCC copy (3) it's not a valid phy to VCC copy.
The results are is a boolean / VCC / wave mask copy, or is not.
If the main concern about the previous version is that
getRegClassOrNull()is called after the reg class is already obtained inisVCC(), then maybe we can add an additional parameter toisVCC()to return the reg class.
You've added a completely new path, and left the old handling. The new version should rely on fixed isVCC checks using the type hint. You should not have to query whether a register belongs to a register class, only query what that register's class is.
You should not have to query whether a register belongs to a register class, only query what that register's class is.
For the problem we are dealing with here, ie. copy from a phys reg to a VCC reg, if you don't query the membership of the phys reg, how do you allow %0:vcc(s1) = COPY $sgpr0 and disallow %0:vcc(s1) = COPY $sgpr0_sgpr1, or vice versa?
@arsenm ping.
For the problem we are dealing with here, ie. copy from a phys reg to a VCC reg, if you don't query the membership of the phys reg, how do you allow
%0:vcc(s1) = COPY $sgpr0and disallow%0:vcc(s1) = COPY $sgpr0_sgpr1, or vice versa?
This depends on the wavesize. %0:vcc(s1) = COPY $sgpr0 is only valid for wave32, and %0:vcc(s1) = COPY $sgpr0_sgpr1 for wave64. We should reject in the machine verifier cases with the vcc bank and the wrong wavesize physical register
@arsenm ping.
@arsenm ping.
@arsenm @shiltian @Pierre-vh ping.
Can we get the whole patch that actually deals with i1 argument? Need to check how uniformity analysis will calculate such copies.