highway icon indicating copy to clipboard operation
highway copied to clipboard

Missing unsigned integer => floating-point conversions

Open nielskm opened this issue 2 years ago • 1 comments

The current implementation of ConvertTo does not have an overload for unsigned integer to floating-point conversion. For my specific application I need to convert uint32 => float32 and uint64 => float64.

I'm willing to try implementing it myself, but suggestions are welcome as I am not (yet) an expert on CPU intrinsics.

nielskm avatar Aug 10 '22 07:08 nielskm

Hi @nielskm, sounds good, we'd welcome such a patch. For the general approach, SO has some discussion (see also links): https://stackoverflow.com/questions/34066228/how-to-perform-uint32-float-conversion-with-sse To understand what each intrinsic is doing, we can refer to the Intel guide. To translate to Highway, one approach is to search x86_128-inl.h for the intrinsic name and then see the corresponding Highway op name.

BTW if you go with the 2x cvtepi32 approach (instead of clearing the upper bit, cvtepi32, conditionally adding 2^31 as a float), you'd probably want MulAdd instead of mul_ps+add_ps.

Also, AVX-512 has _mm512_cvtepu32_ps for this, which is very helpful. An example of how to use this for 256-bit vectors when available:

HWY_API Vec256<double> ConvertTo(Full256<double> dd, const Vec256<int64_t> v) {
#if HWY_TARGET <= HWY_AVX3
  (void)dd;
  return Vec256<double>{_mm256_cvtepi64_pd(v.raw)};
#else

Hope this helps, happy to advise in more detail if you get stuck.

jan-wassenberg avatar Aug 10 '22 08:08 jan-wassenberg

Hi again @jan-wassenberg I have now made a patch which passes the checks. I have tested the x86 targets locally, but not the arm, rvv, and wasm. Are those tested in your pipelines?

nielskm avatar Aug 12 '22 07:08 nielskm

@nielskm wow, congratulations, this is impressive work - awesome that you've also managed to update RVV/SVE! I've left a few comments which are mostly minor style/naming. Yes, our internal pipeline will check all targets. Let's run it once you've updated the PR?

jan-wassenberg avatar Aug 15 '22 09:08 jan-wassenberg

@nielskm Nice, tests pass. Getting cross-platform SIMD in just one test-edit iteration is solid work :) Just waiting for an additional internal review now, then this will go in.

jan-wassenberg avatar Aug 16 '22 08:08 jan-wassenberg

@jan-wassenberg Great! Thanks for all the helpful comments :-)

nielskm avatar Aug 16 '22 16:08 nielskm