float conversion emulation routines

Open · sjoerdmeijer opened this issue 1 year ago · 1 comment

I see several floating-point conversion routines, for example this float32 to float16 helper function:

https://github.com/pytorch/FBGEMM/blob/3070f88d0dce506f2cba7f2019ea8dfc491e5c3b/include/fbgemm/Types.h#L77

But most modern AArch64 CPUs (Armv8.2-A and up), and I believe x86 CPUs too, have native FP16 support, with separate instructions for up- and down-conversion. I believe that whole function can be replaced with a single FCVT instruction, and the different rounding modes should be supported too.

sjoerdmeijer · Aug 14 '24 17:08

I think the cpu_float2half_rn function is a reference implementation that intentionally implements the algorithm manually. Currently we rely on the compiler to generate the optimized float conversion (see lines 222 and 232) when the compiler has the fp16 data type extension and the CPU supports native fp16 conversion.

excelle08 · Aug 22 '24 22:08