
SIMD traces for mulhi usage

Open jfbastien opened this issue 8 years ago • 4 comments

Action item: Intel folks to see in their traces how the instructions are used (variable or constants as inputs).

jfbastien, Nov 03 '17 21:11

As I recall, the issue was whether to restrict just one of the source operands to be constant for this instruction:

  __m128i _mm_mulhi_epi16 (__m128i a, __m128i b);  (PMULHW)
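
For reference, a minimal scalar sketch of what PMULHW computes in each 16-bit lane: a signed 16x16 multiply that keeps the upper 16 bits of the 32-bit product. The helper name `mulhi_i16` is hypothetical (it is not part of any intrinsics header or of libwebp), and the code assumes the compiler implements `>>` on negative values as an arithmetic shift, which mainstream compilers do.

```c
#include <stdint.h>

/* Hypothetical scalar model of one lane of _mm_mulhi_epi16 (PMULHW).
   Assumes >> on a negative int32_t is an arithmetic shift. */
static int16_t mulhi_i16(int16_t a, int16_t b) {
    /* Widen first so the full 32-bit signed product is available. */
    int32_t product = (int32_t)a * (int32_t)b;
    /* Keep bits 31..16, i.e. floor(product / 65536). */
    return (int16_t)(product >> 16);
}
```

Whether `b` is a compile-time constant or a variable makes no difference to this computation; the question here is only which form a WASM instruction should allow.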

It would be helpful if someone (James @jzern?) could point out where in the WebP benchmark this instruction is used.

EDIT: I did a search for mulhi in the https://github.com/webmproject/libwebp repo and got a bunch of hits in the dsp directory. Are those the right ones to look at?

PeterJensen, Nov 03 '17 22:11

On the portable-intrinsics branch there are examples for NEON, SSE2 and portable intrinsics; the second value in all calls is a constant. The NEON half of the portable intrinsics could be refined like dec_neon.c; it uses the same constant values as SSE2 for convenience in the implementation.

https://chromium.googlesource.com/webm/libwebp/+/0af22e17d67e6b81fee6d42a53ce6f40aad416e1/src/dsp/dec_wasm.c#115
https://chromium.googlesource.com/webm/libwebp/+/0af22e17d67e6b81fee6d42a53ce6f40aad416e1/src/dsp/dec_neon.c#975
https://chromium.googlesource.com/webm/libwebp/+/0af22e17d67e6b81fee6d42a53ce6f40aad416e1/src/dsp/dec_sse2.c#88

jzern, Nov 03 '17 23:11

Thanks @jzern!

I was looking at the ARM NEON instruction manual for the VQDMULH instruction and didn't see that it requires one of the source operands to be constant. If both SSE and NEON support both operands being non-constant, a potential WASM instruction for mulhi might as well do that too, right? Maybe I didn't read the NEON documentation right. Here's the info I'm looking at:

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489g/CJAJIIGG.html

I couldn't find a permalink to the actual instruction, so you have to search for it :(

PeterJensen, Nov 03 '17 23:11

> I was looking at the ARM NEON instruction manual for the VQDMULH instruction and didn't see that it requires one of the source operands to be constant. If both SSE and NEON support both operands being non-constant, a potential WASM instruction for mulhi might as well do that too, right?

You're right, NEON doesn't. The intrinsics do offer a scalar variant, though. So two non-constants is an option; one thing that needs to be considered is the range, however. With the doubling that NEON does, it forces one vector to 15 bits.
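
One way to read the range remark is through a scalar model of the two instructions. The sketch below is illustrative only: `pmulhw_lane` and `vqdmulh_lane` are hypothetical names, the code assumes `>>` on negative values is an arithmetic shift, and it models a single 16-bit lane rather than a full vector.

```c
#include <stdint.h>

/* Hypothetical model of one PMULHW lane: high 16 bits of a * b. */
static int16_t pmulhw_lane(int16_t a, int16_t b) {
    int32_t product = (int32_t)a * (int32_t)b;
    return (int16_t)(product >> 16);
}

/* Hypothetical model of one VQDMULH.S16 lane: high 16 bits of the
   saturated doubled product 2 * a * b. */
static int16_t vqdmulh_lane(int16_t a, int16_t b) {
    /* Compute in 64 bits so the doubling itself cannot overflow. */
    int64_t doubled = 2 * (int64_t)a * (int64_t)b;
    /* Only a == b == INT16_MIN exceeds the signed 32-bit range. */
    if (doubled > INT32_MAX) doubled = INT32_MAX;
    return (int16_t)(doubled >> 16);
}
```

In this model, `vqdmulh_lane(a, k / 2)` equals `pmulhw_lane(a, k)` for any even `k`, so matching the SSE2 result means halving one operand, which leaves that operand one fewer bit of usable range.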

jzern, Nov 04 '17 01:11

SIMD proposal merged, closing as no longer relevant.

dtig, Oct 04 '22 23:10