
Handle more than 64 registers - Part 1

Open kunalspathak opened this issue 1 year ago • 10 comments

The general feedback for https://github.com/dotnet/runtime/pull/98258 was to come up with smaller PRs concentrated around LSRA. This is part 1 of that.

For Arm64, this PR changes `typedef unsigned __int64 regMaskTP` to:

typedef struct _regMaskTP
{
  unsigned __int64 low;
} regMaskTP;

A version of PopCount and BitOperations has been added next to the regMaskTP struct definition.
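To make the shape of this change concrete, here is a minimal sketch of a single-field mask struct with a few bit operators and a population count defined next to it. The names `RegMaskSketch` and `PopCountSketch`, the operator set, and the loop-based count are illustrative assumptions; they are not taken from the PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the struct above; not the PR's actual definition.
struct RegMaskSketch
{
    uint64_t low;
};

// Bitwise operators forward to the underlying 64-bit field, so client code
// can keep writing `a | b` and `a & b` as it did with the primitive type.
inline RegMaskSketch operator|(RegMaskSketch a, RegMaskSketch b) { return {a.low | b.low}; }
inline RegMaskSketch operator&(RegMaskSketch a, RegMaskSketch b) { return {a.low & b.low}; }
inline RegMaskSketch operator~(RegMaskSketch a) { return {~a.low}; }
inline bool operator==(RegMaskSketch a, RegMaskSketch b) { return a.low == b.low; }
inline bool operator!=(RegMaskSketch a, RegMaskSketch b) { return a.low != b.low; }

// Portable population count: each iteration clears the lowest set bit.
inline unsigned PopCountSketch(RegMaskSketch mask)
{
    unsigned count = 0;
    for (uint64_t bits = mask.low; bits != 0; bits &= bits - 1)
    {
        count++;
    }
    return count;
}
```

Keeping the operators as free functions beside the struct means a second field for registers beyond 64 could later be added in one place, which is presumably the motivation for the wrapper.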

Most of the method implementation is pulled from https://github.com/dotnet/runtime/pull/96196.

kunalspathak avatar May 06 '24 22:05 kunalspathak

@dotnet/jit-contrib

kunalspathak avatar May 06 '24 22:05 kunalspathak

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch See info in area-owners.md if you want to be subscribed.

This is expected to have zero TP diff, so I will investigate why it is causing a 2% regression. Possibly I missed updating something:

image

kunalspathak avatar May 07 '24 02:05 kunalspathak

For the windows/arm64 crossgen2 collection, here is the distribution. I will take a look.

??$select@$0A@@RegisterSelection@LinearScan@@QEAA?AU_regMaskAll@@PEAVInterval@@PEAVRefPosition@@@Z                                                           : 7371109602  : NA       : 24.12% : +4.3652%
?newRefPosition@LinearScan@@AEAAPEAVRefPosition@@PEAVInterval@@IW4RefType@@PEAUGenTree@@U_regMaskAll@@I@Z                                                    : 2298414560  : NA       : 7.52%  : +1.3611%
?processKills@LinearScan@@AEAAXPEAVRefPosition@@@Z                                                                                                           : 1228183017  : NA       : 4.02%  : +0.7273%
?BuildUse@LinearScan@@AEAAPEAVRefPosition@@PEAUGenTree@@U_regMaskAll@@H@Z                                                                                    : 976319605   : NA       : 3.19%  : +0.5782%
?mergeRegisterPreferences@Interval@@QEAAXU_regMaskAll@@@Z                                                                                                    : 732180225   : NA       : 2.40%  : +0.4336%
?BuildDef@LinearScan@@AEAAPEAVRefPosition@@PEAUGenTree@@U_regMaskAll@@H@Z                                                                                    : 712437498   : NA       : 2.33%  : +0.4219%
?freeRegisters@LinearScan@@AEAAXU_regMaskAll@@@Z                                                                                                             : 598599174   : NA       : 1.96%  : +0.3545%
?buildKillPositionsForNode@LinearScan@@AEAA_NPEAUGenTree@@IU_regMaskAll@@@Z                                                                                  : 419647064   : NA       : 1.37%  : +0.2485%
?newRefPosition@LinearScan@@AEAAPEAVRefPosition@@W4_regNumber_enum@@IW4RefType@@PEAUGenTree@@U_regMaskAll@@@Z                                                : 388348628   : NA       : 1.27%  : +0.2300%
?associateRefPosWithInterval@LinearScan@@AEAAXPEAVRefPosition@@@Z                                                                                            : 256572045   : +23.46%  : 0.84%  : +0.1519%
?gtGetRegMask@GenTree@@QEBA?AU_regMaskAll@@XZ                                                                                                                : 247446771   : NA       : 0.81%  : +0.1465%
?emitIns_Call@emitter@@QEAAXW4EmitCallType@1@PEAUCORINFO_METHOD_STRUCT_@@PEAX_JW4emitAttr@@4AEBQEA_KU_regMaskAll@@6AEBVDebugInfo@@W4_regNumber_enum@@8I3_N@Z : 229060387   : NA       : 0.75%  : +0.1357%
?addKillForRegs@LinearScan@@AEAAXU_regMaskAll@@I@Z                                                                                                           : 103418880   : NA       : 0.34%  : +0.0612%
?resolveEdge@LinearScan@@QEAAXPEAUBasicBlock@@0W4ResolveType@1@AEBQEA_KU_regMaskAll@@@Z                                                                      : 92569359    : NA       : 0.30%  : +0.0548%
?BuildDefs@LinearScan@@AEAAXPEAUGenTree@@HU_regMaskAll@@@Z                                                                                                   : 85318812    : NA       : 0.28%  : +0.0505%
?genBuildRegPairsStack@CodeGen@@KAXU_regMaskAll@@PEAV?$ArrayStack@URegPair@CodeGen@@@@@Z                                                                     : 75277373    : NA       : 0.25%  : +0.0446%
?emitEncodeCallGCregs@emitter@@CAXU_regMaskAll@@PEAUinstrDesc@1@@Z                                                                                           : 73781054    : NA       : 0.24%  : +0.0437%
?BuildOperandUses@LinearScan@@AEAAHPEAUGenTree@@U_regMaskAll@@@Z                                                                                             : 68804619    : NA       : 0.23%  : +0.0407%
?BuildNode@LinearScan@@AEAAHPEAUGenTree@@@Z                                                                                                                  : 54306503    : +4.65%   : 0.18%  : +0.0322%
?emitAddLabel@emitter@@AEAAPEAXAEBQEA_KU_regMaskAll@@1@Z                                                                                                     : 51155804    : NA       : 0.17%  : +0.0303%
?assignPhysReg@LinearScan@@AEAAXPEAVRegRecord@@PEAVInterval@@@Z                                                                                              : 50356789    : +10.96%  : 0.16%  : +0.0298%
?emitCreatePlaceholderIG@emitter@@QEAAXW4insGroupPlaceholderType@@PEAUBasicBlock@@AEBQEA_KU_regMaskAll@@3_N@Z                                                : 46678609    : NA       : 0.15%  : +0.0276%
?BuildAddrUses@LinearScan@@AEAAHPEAUGenTree@@U_regMaskAll@@@Z                                                                                                : 46376391    : NA       : 0.15%  : +0.0275%
?emitGCregDeadSet@emitter@@QEAAXW4GCtype@@U_regMaskAll@@PEAE@Z                                                                                               : 37957388    : NA       : 0.12%  : +0.0225%
?getMatchingConstants@LinearScan@@AEAA?AU_regMaskAll@@U2@PEAVInterval@@PEAVRefPosition@@@Z                                                                   : 36052458    : NA       : 0.12%  : +0.0214%
??$processBlockEndAllocation@$00@LinearScan@@AEAAXPEAUBasicBlock@@@Z                                                                                         : -31931508   : -99.99%  : 0.10%  : -0.0189%
?emitGCregDeadSet@emitter@@QEAAXW4GCtype@@_KPEAE@Z                                                                                                           : -37957388   : -100.00% : 0.12%  : -0.0225%
?getMatchingConstants@LinearScan@@AEAA_K_KPEAVInterval@@PEAVRefPosition@@@Z                                                                                  : -40340140   : -100.00% : 0.13%  : -0.0239%
?emitCreatePlaceholderIG@emitter@@QEAAXW4insGroupPlaceholderType@@PEAUBasicBlock@@AEBQEA_K_K3_N@Z                                                            : -45166929   : -100.00% : 0.15%  : -0.0267%
?BuildAddrUses@LinearScan@@AEAAHPEAUGenTree@@_K@Z                                                                                                            : -46376391   : -100.00% : 0.15%  : -0.0275%
?emitAddLabel@emitter@@AEAAPEAXAEBQEA_K_K1@Z                                                                                                                 : -51155804   : -100.00% : 0.17%  : -0.0303%
?updateAssignedInterval@LinearScan@@AEAAXPEAVRegRecord@@PEAVInterval@@@Z                                                                                     : -51531940   : -7.61%   : 0.17%  : -0.0305%
?buildUpperVectorRestoreRefPosition@LinearScan@@AEAAXPEAVInterval@@IPEAUGenTree@@_NI@Z                                                                       : -62330666   : -100.00% : 0.20%  : -0.0369%
?BuildOperandUses@LinearScan@@AEAAHPEAUGenTree@@_K@Z                                                                                                         : -68804619   : -100.00% : 0.23%  : -0.0407%
?addKillForRegs@LinearScan@@AEAAX_KI@Z                                                                                                                       : -70669568   : -100.00% : 0.23%  : -0.0419%
?emitEncodeCallGCregs@emitter@@CAX_KPEAUinstrDesc@1@@Z                                                                                                       : -74971071   : -100.00% : 0.25%  : -0.0444%
?genBuildRegPairsStack@CodeGen@@KAX_KPEAV?$ArrayStack@URegPair@CodeGen@@@@@Z                                                                                 : -75003566   : -100.00% : 0.25%  : -0.0444%
?processBlockStartLocations@LinearScan@@AEAAXPEAUBasicBlock@@@Z                                                                                              : -80628252   : -5.11%   : 0.26%  : -0.0477%
?BuildDefs@LinearScan@@AEAAXPEAUGenTree@@H_K@Z                                                                                                               : -85318812   : -100.00% : 0.28%  : -0.0505%
?resolveEdge@LinearScan@@QEAAXPEAUBasicBlock@@0W4ResolveType@1@AEBQEA_K_K@Z                                                                                  : -91554003   : -100.00% : 0.30%  : -0.0542%
?gtGetRegMask@GenTree@@QEBA_KXZ                                                                                                                              : -181640807  : -100.00% : 0.59%  : -0.1076%
?emitIns_Call@emitter@@QEAAXW4EmitCallType@1@PEAUCORINFO_METHOD_STRUCT_@@PEAX_JW4emitAttr@@4AEBQEA_K_K6AEBVDebugInfo@@W4_regNumber_enum@@8I3_N@Z             : -229060387  : -100.00% : 0.75%  : -0.1357%
?updateRegisterPreferences@Interval@@QEAAX_K@Z                                                                                                               : -240144624  : -100.00% : 0.79%  : -0.1422%
?newRefPosition@LinearScan@@AEAAPEAVRefPosition@@W4_regNumber_enum@@IW4RefType@@PEAUGenTree@@_K@Z                                                            : -374957296  : -100.00% : 1.23%  : -0.2221%
?buildKillPositionsForNode@LinearScan@@AEAA_NPEAUGenTree@@I_K@Z                                                                                              : -390519208  : -100.00% : 1.28%  : -0.2313%
?freeRegisters@LinearScan@@AEAAX_K@Z                                                                                                                         : -598599174  : -100.00% : 1.96%  : -0.3545%
?BuildDef@LinearScan@@AEAAPEAVRefPosition@@PEAUGenTree@@_KH@Z                                                                                                : -712437498  : -100.00% : 2.33%  : -0.4219%
?BuildUse@LinearScan@@AEAAPEAVRefPosition@@PEAUGenTree@@_KH@Z                                                                                                : -927590191  : -100.00% : 3.04%  : -0.5493%
??$allocateRegisters@$0A@@LinearScan@@QEAAXXZ                                                                                                                : -1375837725 : -19.43%  : 4.50%  : -0.8148%
?newRefPosition@LinearScan@@AEAAPEAVRefPosition@@PEAVInterval@@IW4RefType@@PEAUGenTree@@_KI@Z                                                                : -1693247150 : -100.00% : 5.54%  : -1.0028%
??$select@$0A@@RegisterSelection@LinearScan@@QEAA_KPEAVInterval@@PEAVRefPosition@@@Z                                                                         : -6078036133 : -100.00% : 19.89% : -3.5995%

kunalspathak avatar May 07 '24 04:05 kunalspathak

The culprit was using PopCount() in genMaxOneBit() and genExactlyOneBit(). After fixing it, the regression drops to 0.5%. The remaining regression is just scattered around because of various factors and is not related to any specific pattern.
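The comment does not show the fix itself. As a sketch of how such checks can avoid a population count entirely, the standard `value & (value - 1)` trick clears the lowest set bit, so the result is zero exactly when at most one bit was set. The function names below are hypothetical and operate on a plain 64-bit value rather than on the PR's struct.

```cpp
#include <cassert>
#include <cstdint>

// True when at most one bit is set; zero counts as "at most one".
inline bool maxOneBitSketch(uint64_t value)
{
    return (value & (value - 1)) == 0;
}

// True when exactly one bit is set: non-zero and at most one bit.
inline bool exactlyOneBitSketch(uint64_t value)
{
    return (value != 0) && maxOneBitSketch(value);
}
```

Both checks compile to a handful of ALU instructions, whereas a population count without hardware support is a loop or a multi-step bit-twiddling sequence, which would matter in hot paths such as register selection.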

Base: 168859757414, Diff: 169865447753, +0.5956%

??$select@$0A@@RegisterSelection@LinearScan@@QEAA?AUregMaskTP@@PEAVInterval@@PEAVRefPosition@@@Z                                                           : 6583187619  : NA       : 22.90% : +3.8986%
?newRefPosition@LinearScan@@AEAAPEAVRefPosition@@PEAVInterval@@IW4RefType@@PEAUGenTree@@UregMaskTP@@I@Z                                                    : 1760063728  : NA       : 6.12%  : +1.0423%
?processKills@LinearScan@@AEAAXPEAVRefPosition@@@Z                                                                                                         : 1228183017  : NA       : 4.27%  : +0.7273%
?BuildUse@LinearScan@@AEAAPEAVRefPosition@@PEAUGenTree@@UregMaskTP@@H@Z                                                                                    : 930438698   : NA       : 3.24%  : +0.5510%
?BuildDef@LinearScan@@AEAAPEAVRefPosition@@PEAUGenTree@@UregMaskTP@@H@Z                                                                                    : 712437498   : NA       : 2.48%  : +0.4219%
?mergeRegisterPreferences@Interval@@QEAAXUregMaskTP@@@Z                                                                                                    : 711153378   : NA       : 2.47%  : +0.4212%
?freeRegisters@LinearScan@@AEAAXUregMaskTP@@@Z                                                                                                             : 598599174   : NA       : 2.08%  : +0.3545%
?buildKillPositionsForNode@LinearScan@@AEAA_NPEAUGenTree@@IUregMaskTP@@@Z                                                                                  : 419548284   : NA       : 1.46%  : +0.2485%
?newRefPosition@LinearScan@@AEAAPEAVRefPosition@@W4_regNumber_enum@@IW4RefType@@PEAUGenTree@@UregMaskTP@@@Z                                                : 388348628   : NA       : 1.35%  : +0.2300%
?emitIns_Call@emitter@@QEAAXW4EmitCallType@1@PEAUCORINFO_METHOD_STRUCT_@@PEAX_JW4emitAttr@@4AEBQEA_KUregMaskTP@@6AEBVDebugInfo@@W4_regNumber_enum@@8I3_N@Z : 229060387   : NA       : 0.80%  : +0.1357%
?gtGetRegMask@GenTree@@QEBA?AUregMaskTP@@XZ                                                                                                                : 191014321   : NA       : 0.66%  : +0.1131%
?resolveEdge@LinearScan@@QEAAXPEAUBasicBlock@@0W4ResolveType@1@AEBQEA_KUregMaskTP@@@Z                                                                      : 92569359    : NA       : 0.32%  : +0.0548%
?BuildDefs@LinearScan@@AEAAXPEAUGenTree@@HUregMaskTP@@@Z                                                                                                   : 85318812    : NA       : 0.30%  : +0.0505%
?genBuildRegPairsStack@CodeGen@@KAXUregMaskTP@@PEAV?$ArrayStack@URegPair@CodeGen@@@@@Z                                                                     : 75277373    : NA       : 0.26%  : +0.0446%
?associateRefPosWithInterval@LinearScan@@AEAAXPEAVRefPosition@@@Z                                                                                          : 74404320    : +6.80%   : 0.26%  : +0.0441%
?emitEncodeCallGCregs@emitter@@CAXUregMaskTP@@PEAUinstrDesc@1@@Z                                                                                           : 73781054    : NA       : 0.26%  : +0.0437%
?addKillForRegs@LinearScan@@AEAAXUregMaskTP@@I@Z                                                                                                           : 70669568    : NA       : 0.25%  : +0.0419%
?BuildOperandUses@LinearScan@@AEAAHPEAUGenTree@@UregMaskTP@@@Z                                                                                             : 68804619    : NA       : 0.24%  : +0.0407%
?BuildNode@LinearScan@@AEAAHPEAUGenTree@@@Z                                                                                                                : 54306503    : +4.65%   : 0.19%  : +0.0322%
?emitAddLabel@emitter@@AEAAPEAXAEBQEA_KUregMaskTP@@1@Z                                                                                                     : 51155804    : NA       : 0.18%  : +0.0303%
?assignPhysReg@LinearScan@@AEAAXPEAVRegRecord@@PEAVInterval@@@Z                                                                                            : 50356789    : +10.96%  : 0.18%  : +0.0298%
?BuildAddrUses@LinearScan@@AEAAHPEAUGenTree@@UregMaskTP@@@Z                                                                                                : 46376391    : NA       : 0.16%  : +0.0275%
?emitCreatePlaceholderIG@emitter@@QEAAXW4insGroupPlaceholderType@@PEAUBasicBlock@@AEBQEA_KUregMaskTP@@3_N@Z                                                : 45166929    : NA       : 0.16%  : +0.0267%
?emitGCregDeadSet@emitter@@QEAAXW4GCtype@@UregMaskTP@@PEAE@Z                                                                                               : 37957388    : NA       : 0.13%  : +0.0225%
?getMatchingConstants@LinearScan@@AEAA?AUregMaskTP@@U2@PEAVInterval@@PEAVRefPosition@@@Z                                                                   : 36052458    : NA       : 0.13%  : +0.0214%
??$processBlockEndAllocation@$00@LinearScan@@AEAAXPEAUBasicBlock@@@Z                                                                                       : -31931508   : -99.99%  : 0.11%  : -0.0189%
?emitGCregDeadSet@emitter@@QEAAXW4GCtype@@_KPEAE@Z                                                                                                         : -37957388   : -100.00% : 0.13%  : -0.0225%
?getMatchingConstants@LinearScan@@AEAA_K_KPEAVInterval@@PEAVRefPosition@@@Z                                                                                : -40340140   : -100.00% : 0.14%  : -0.0239%
?emitCreatePlaceholderIG@emitter@@QEAAXW4insGroupPlaceholderType@@PEAUBasicBlock@@AEBQEA_K_K3_N@Z                                                          : -45166929   : -100.00% : 0.16%  : -0.0267%
?BuildAddrUses@LinearScan@@AEAAHPEAUGenTree@@_K@Z                                                                                                          : -46376391   : -100.00% : 0.16%  : -0.0275%
?emitAddLabel@emitter@@AEAAPEAXAEBQEA_K_K1@Z                                                                                                               : -51155804   : -100.00% : 0.18%  : -0.0303%
?updateAssignedInterval@LinearScan@@AEAAXPEAVRegRecord@@PEAVInterval@@@Z                                                                                   : -51531940   : -7.61%   : 0.18%  : -0.0305%
?BuildOperandUses@LinearScan@@AEAAHPEAUGenTree@@_K@Z                                                                                                       : -68804619   : -100.00% : 0.24%  : -0.0407%
?addKillForRegs@LinearScan@@AEAAX_KI@Z                                                                                                                     : -70669568   : -100.00% : 0.25%  : -0.0419%
?emitEncodeCallGCregs@emitter@@CAX_KPEAUinstrDesc@1@@Z                                                                                                     : -74971071   : -100.00% : 0.26%  : -0.0444%
?genBuildRegPairsStack@CodeGen@@KAX_KPEAV?$ArrayStack@URegPair@CodeGen@@@@@Z                                                                               : -75003566   : -100.00% : 0.26%  : -0.0444%
?processBlockStartLocations@LinearScan@@AEAAXPEAUBasicBlock@@@Z                                                                                            : -80628252   : -5.11%   : 0.28%  : -0.0477%
?BuildDefs@LinearScan@@AEAAXPEAUGenTree@@H_K@Z                                                                                                             : -85318812   : -100.00% : 0.30%  : -0.0505%
?resolveEdge@LinearScan@@QEAAXPEAUBasicBlock@@0W4ResolveType@1@AEBQEA_K_K@Z                                                                                : -91554003   : -100.00% : 0.32%  : -0.0542%
?gtGetRegMask@GenTree@@QEBA_KXZ                                                                                                                            : -181640807  : -100.00% : 0.63%  : -0.1076%
?emitIns_Call@emitter@@QEAAXW4EmitCallType@1@PEAUCORINFO_METHOD_STRUCT_@@PEAX_JW4emitAttr@@4AEBQEA_K_K6AEBVDebugInfo@@W4_regNumber_enum@@8I3_N@Z           : -229060387  : -100.00% : 0.80%  : -0.1357%
?updateRegisterPreferences@Interval@@QEAAX_K@Z                                                                                                             : -240144624  : -100.00% : 0.84%  : -0.1422%
?newRefPosition@LinearScan@@AEAAPEAVRefPosition@@W4_regNumber_enum@@IW4RefType@@PEAUGenTree@@_K@Z                                                          : -374957296  : -100.00% : 1.30%  : -0.2221%
?buildKillPositionsForNode@LinearScan@@AEAA_NPEAUGenTree@@I_K@Z                                                                                            : -390519208  : -100.00% : 1.36%  : -0.2313%
?freeRegisters@LinearScan@@AEAAX_K@Z                                                                                                                       : -598599174  : -100.00% : 2.08%  : -0.3545%
?BuildDef@LinearScan@@AEAAPEAVRefPosition@@PEAUGenTree@@_KH@Z                                                                                              : -712437498  : -100.00% : 2.48%  : -0.4219%
?BuildUse@LinearScan@@AEAAPEAVRefPosition@@PEAUGenTree@@_KH@Z                                                                                              : -927590191  : -100.00% : 3.23%  : -0.5493%
??$allocateRegisters@$0A@@LinearScan@@QEAAXXZ                                                                                                              : -1375837725 : -19.43%  : 4.79%  : -0.8148%
?newRefPosition@LinearScan@@AEAAPEAVRefPosition@@PEAVInterval@@IW4RefType@@PEAUGenTree@@_KI@Z                                                              : -1693247150 : -100.00% : 5.89%  : -1.0028%
??$select@$0A@@RegisterSelection@LinearScan@@QEAA_KPEAVInterval@@PEAVRefPosition@@@Z                                                                       : -6078036133 : -100.00% : 21.15% : -3.5995%

kunalspathak avatar May 09 '24 00:05 kunalspathak

> After fixing it, the regression drops to 0.5%. The remaining regression is just scattered around because of various factors and is not related to any specific pattern.

Yeah, seems to just be various MSVC regressions due to now having a struct instead of primitive type. Clang seems to do a little bit better. Not much we can do about that I think.

jakobbotsch avatar May 10 '24 09:05 jakobbotsch

We might consider switching regMaskTP to a struct everywhere instead of having it as a primitive, to have it unified everywhere. I would personally prefer it, even if it causes MSVC to pessimize codegen slightly. Thoughts @dotnet/jit-contrib?

Alternatively we could move extra operations to live on some RegMaskOps type that operates on regMaskTP. That would probably allow all the client code to stay unified as well, even with regMaskTP typedeffed to a primitive.
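A rough sketch of that alternative, assuming the mask stays a primitive 64-bit integer and all extra operations live on a separate helper type. Every name here (`regMaskPrimitiveSketch`, `RegMaskOpsSketch`, and its members) is hypothetical and only illustrates the idea.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical: the mask remains a plain integer typedef.
typedef uint64_t regMaskPrimitiveSketch;

// Hypothetical helper type: client code calls these instead of touching
// the representation directly, so the typedef could later become a struct
// without changing the call sites.
struct RegMaskOpsSketch
{
    static bool IsEmpty(regMaskPrimitiveSketch mask)
    {
        return mask == 0;
    }

    static bool HasReg(regMaskPrimitiveSketch mask, unsigned regNum)
    {
        return (mask & (1ULL << regNum)) != 0;
    }

    static regMaskPrimitiveSketch AddReg(regMaskPrimitiveSketch mask, unsigned regNum)
    {
        return mask | (1ULL << regNum);
    }

    static regMaskPrimitiveSketch RemoveReg(regMaskPrimitiveSketch mask, unsigned regNum)
    {
        return mask & ~(1ULL << regNum);
    }
};
```

The trade-off is that call sites become more verbose than with overloaded operators, in exchange for keeping the primitive type that MSVC optimizes well.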

jakobbotsch avatar May 10 '24 09:05 jakobbotsch

> We might consider switching regMaskTP to a struct everywhere instead of having it as a primitive, to have it unified everywhere. I would personally prefer it, even if it causes MSVC to pessimize codegen slightly. Thoughts @dotnet/jit-contrib?

I don't mind doing it. I advocated a similar idea back in https://github.com/dotnet/runtime/pull/98258, because it would also help in the future with enabling the "handling of 64 registers" when we add APX support for Intel.

Edit: With that said, given that https://github.com/dotnet/runtime/pull/98258 has already been out for a couple of months and is critical work needed to make further progress on SVE, I would like to concentrate on enabling it with as little work as possible (just for arm64), and enable it for other platforms in follow-up PRs once we complete the "predicate register" work.

kunalspathak avatar May 10 '24 14:05 kunalspathak

/azp run runtime-coreclr superpmi-diffs

kunalspathak avatar May 10 '24 20:05 kunalspathak

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar May 10 '24 20:05 azure-pipelines[bot]

image

kunalspathak avatar May 11 '24 00:05 kunalspathak

/azp run runtime-coreclr superpmi-replay

kunalspathak avatar May 11 '24 00:05 kunalspathak

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar May 11 '24 00:05 azure-pipelines[bot]