hhvm
hhvm copied to clipboard
Aarch64 - update EqDbl to use SIMD/FP instructions
This change updates the EqDbl and NeqDbl sequences to use different instructions. It reduces the number of instructions needed and removes the dependency on the status register.
Before
(29) t8:Bool = EqDbl t7:Dbl, t5:Dbl
Main:
0x51003aec d53b4210 mrs x16, nzcv
0x51003af0 1e602020 fcmp d1, d0
0x51003af4 da9f13f2 csetm x18, eq
0x51003af8 9e670240 fmov d0, x18
0x51003afc d51b4210 msr nzcv, x16
0x51003b00 9e660000 fmov x0, d0
0x51003b04 53001c00 uxtb w0, w0
0x51003b08 12000000 and w0, w0, #0x1
After
(29) t8:Bool = EqDbl t7:Dbl, t5:Dbl
Main:
0x1460407c 5e60e420 fcmeq d0, d1, d0
0x14604080 9e660000 fmov x0, d0
0x14604084 12000000 and w0, w0, #0x1
This also updates the vixl disassembler to recognize two new instruction formats: Advanced SIMD Scalar Three Same, and Advanced SIMD Scalar two-register Misc. Whenever the enumeration clashed with an existing definition a prefix was added. See STS, and STM respectively.
The regression suite was run with 6 option sets. No additional failures were observed.
@mxw - Hi Max, Can you take a look at this one? Thanks.
@mxw has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Hi @swalk-cavium, This patch doesn't create any more unit test failures, and it passed all of OSS performance test suite.
@swalk-cavium updated the pull request - view changes - changes since last import
@mxw - MOP results the same with only 1 copy, so updated pull request.
@mxw has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Before I accept this, I'd like someone else to give a second opinion on the change (not just on test results) to confirm it doesn't affect the behavior of the instruction in any way.
@mxw - Hi Max, Here's how I tested the sequences before incorporating in hhvm https://pastebin.com/PC6Ym7ZY
@mxw - Here's the table I used for the unordered comparison case. https://pastebin.com/psgHNebB
SNaN - signalling NaN, QNaN - quiet NaN
@mxw - I think I might have to update this to make it check the hardware capabilities. I just noticed at least one of the forms says, ARMv8.2. Something similar to what we did for the LSE Atomics.
@mxw - Um, disregard the previous comment. SIMD instructions are required in the ARM SBSA (Server Base System Architecture).
@swalk-cavium updated the pull request - view changes - changes since last import
@mxw has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@swalk-cavium—Cool, thanks for this change. I'll land it as soon as internal tests pass.
@mxw has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@mxw - Hi Max, Any word on your testing infrastructure issue? Can this one land?