hhvm
Add missing vzeroupper to memset-avx2
This diff ensures vzeroupper is executed when the memset count is >64 and <128 bytes.
The guidance from Intel is to use vzeroupper when transitioning from AVX to SSE code in order to avoid transition penalties:
"When the upper 128 bits of the YMM registers are set to zero by the vzeroupper instruction, the hardware does not need to save those values, so the hardware assists do not occur. The vzeroupper instruction must be used after 256-bit Intel AVX code and before Intel SSE code, which will remove both the save and the restore operations. Zeroing out the YMM registers with other methods, such as with XORs, will not prevent AVX-SSE transition penalties." https://software.intel.com/en-us/articles/avoiding-avx-sse-transition-penalties
No expensive transitions were detected when using the Intel SDE AVX/SSE transition checker.
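
For illustration, the pattern Intel's guidance describes can be sketched in C with AVX intrinsics. This is not the code touched by this diff: the helper names `fill_65_to_127_avx` and `fill_mid` are hypothetical, the four-overlapping-stores strategy is an assumption made for the example, and the `target` attribute and `__builtin_cpu_supports` are GCC/Clang extensions. The point is only that the 256-bit path ends with vzeroupper (`_mm256_zeroupper()`) before returning to code that may use SSE.

```c
#include <stddef.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* Hypothetical helper (not HHVM's memset-avx2): fill n bytes, 64 < n < 128. */
__attribute__((target("avx")))
static void fill_65_to_127_avx(unsigned char *dst, int c, size_t n) {
    __m256i v = _mm256_set1_epi8((char)c);
    /* Two 32-byte stores from the start and two ending at dst + n.
       Because n < 128, the two pairs overlap and cover every byte. */
    _mm256_storeu_si256((__m256i *)dst, v);
    _mm256_storeu_si256((__m256i *)(dst + 32), v);
    _mm256_storeu_si256((__m256i *)(dst + n - 64), v);
    _mm256_storeu_si256((__m256i *)(dst + n - 32), v);
    /* Zero the upper halves of the YMM registers before returning,
       so later SSE code does not pay an AVX-SSE transition penalty. */
    _mm256_zeroupper();
}
#endif

/* Hypothetical wrapper: uses the AVX path only for the size range named
   in this diff and only when the CPU reports AVX support. */
void *fill_mid(void *dst, int c, size_t n) {
#if defined(__x86_64__) || defined(__i386__)
    if (n > 64 && n < 128 && __builtin_cpu_supports("avx")) {
        fill_65_to_127_avx((unsigned char *)dst, c, n);
        return dst;
    }
#endif
    return memset(dst, c, n);
}
```

Compilers usually insert vzeroupper on their own at the end of functions built with AVX enabled; hand-written assembly has no such safety net, so every exit path of an AVX routine has to issue it explicitly.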