
simulation inaccuracy: missed dep-breaking of pcmpeq

Open amonakov opened this issue 2 years ago • 0 comments

Integer pcmpeq* with source=dest sets the destination to all-ones without a dependency on the source (but still occupies an execution unit). For example, the following loop runs at one cycle per iteration on Skylake, while uiCA predicts two:

    loop:
        vpcmpeqd xmm0, xmm0, xmm0
        vpor xmm0, xmm0, xmm0
        dec ecx
        jnz loop
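
To illustrate where the two predictions come from, here is a toy loop-carried dependency model in Python. It is only a sketch and not how uiCA works internally: it assumes a latency of 1 cycle for every instruction, ignores port contention and the front end, and the names `LOOP` and `cycles_per_iteration` are made up for this example. With the same-register compare treated as an ordinary instruction, the `xmm0` chain costs two cycles per iteration; once it is treated as dependency-breaking, only the `dec ecx` chain remains.

```python
# Toy dependency-chain model (hypothetical, NOT uiCA's algorithm).
# Each entry: (text, mnemonic, destination regs, source regs, assumed latency).
LOOP = [
    ("vpcmpeqd xmm0, xmm0, xmm0", "vpcmpeqd", ["xmm0"], ["xmm0", "xmm0"], 1),
    ("vpor xmm0, xmm0, xmm0", "vpor", ["xmm0"], ["xmm0", "xmm0"], 1),
    ("dec ecx", "dec", ["ecx", "flags"], ["ecx"], 1),
    ("jnz loop", "jnz", [], ["flags"], 1),
]


def cycles_per_iteration(loop, dep_breaking, iterations=1000):
    """Estimate cycles/iteration from register dependencies alone."""
    ready = {}  # register -> cycle at which its latest value is available
    for _ in range(iterations):
        for _text, mnemonic, dests, srcs, latency in loop:
            same_reg = len(set(srcs)) == 1
            if dep_breaking and mnemonic.startswith("vpcmpeq") and same_reg:
                srcs = []  # result is all-ones regardless of the input value
            start = max((ready.get(r, 0) for r in srcs), default=0)
            for r in dests:
                ready[r] = start + latency
    return max(ready.values()) / iterations


print(cycles_per_iteration(LOOP, dep_breaking=False))  # 2.0
print(cycles_per_iteration(LOOP, dep_breaking=True))   # 1.0
```

The instruction is still executed in the second case (it keeps its latency and would occupy a port in a fuller model); only its input dependency is dropped.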

amonakov, Jun 27 '23 16:06