
Add missing vzeroupper to memset-avx2

Open octmoraru opened this issue 7 years ago • 0 comments

This diff ensures vzeroupper is executed when the memset count is greater than 64 and less than 128 bytes.

The guidance from Intel is to use vzeroupper when transitioning from AVX to SSE code in order to avoid transition penalties:

"When the upper 128 bits of the YMM registers are set to zero by the vzeroupper instruction, the hardware does not need to save those values, so the hardware assists do not occur. The vzeroupper instruction must be used after 256-bit Intel AVX code and before Intel SSE code, which will remove both the save and the restore operations. Zeroing out the YMM registers with other methods, such as with XORs, will not prevent AVX-SSE transition penalties." https://software.intel.com/en-us/articles/avoiding-avx-sse-transition-penalties

No expensive transitions were detected when using the Intel SDE AVX/SSE transition checker.
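As a rough illustration of the pattern described above, here is a minimal C sketch using AVX intrinsics. It is not HHVM's memset-avx2 code: the names `fill_65_to_127` and `sketch_memset` are hypothetical, the overlapping-store layout is an assumption made for the example, and the `target` attribute and `__builtin_cpu_supports` assume GCC or Clang on x86. The point it shows is that the path using 256-bit stores ends with `_mm256_zeroupper()` (which emits vzeroupper) before returning to code that may use SSE.

```c
#include <stddef.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* Hypothetical helper, not HHVM's routine: fills n bytes for 64 <= n <= 128
 * with four 32-byte stores; the last two overlap the first two as needed. */
__attribute__((target("avx")))
static void fill_65_to_127(void *dst, int c, size_t n) {
    unsigned char *p = (unsigned char *)dst;
    __m256i v = _mm256_set1_epi8((char)c);           /* broadcast the fill byte */

    _mm256_storeu_si256((__m256i *)p, v);            /* bytes [0, 32)    */
    _mm256_storeu_si256((__m256i *)(p + 32), v);     /* bytes [32, 64)   */
    _mm256_storeu_si256((__m256i *)(p + n - 64), v); /* bytes [n-64, n-32) */
    _mm256_storeu_si256((__m256i *)(p + n - 32), v); /* bytes [n-32, n)  */

    /* Clear the upper halves of the YMM registers before returning, so that
     * SSE code executed afterwards does not pay an AVX-SSE transition penalty. */
    _mm256_zeroupper();
}
#endif

/* Hypothetical dispatcher: only the >64 and <128 byte range takes the AVX path. */
void *sketch_memset(void *dst, int c, size_t n) {
#if defined(__x86_64__) || defined(__i386__)
    if (n > 64 && n < 128 && __builtin_cpu_supports("avx")) {
        fill_65_to_127(dst, c, n);
        return dst;
    }
#endif
    return memset(dst, c, n);                        /* fallback for other sizes */
}
```

Calling `sketch_memset(buf, 0, 100)` would take the AVX path on a CPU with AVX support and fall back to the C library `memset` otherwise. Compilers may insert vzeroupper automatically in some configurations; the explicit call keeps the intent visible, much as the hand-written assembly in this diff does.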

octmoraru, Apr 07 '18 01:04