vrnd kernels for rvv are incorrect
The vrnd kernels for rvv work by converting a float to an int and back. This doesn't work in a few cases:
- For floats that are outside the range of an int32
- It loses infinity/NaN
I tried to partially fix the latter in issue in #7971.
@KaustubhIMG @ken-unger
Thanks for the note.
re: // TODO: We should also preserve NaN
FWIW this is something that Eigen also needs to do, and there is an ongoing PR where min/max with NaN propagation could perhaps be used for reference.
https://gitlab.com/libeigen/eigen/-/blob/8ac76af2902bedd9a075c32a67ec9f55d2e6b325/Eigen/src/Core/arch/RVV10/PacketMath.h#L1223
I see this same behavior (cast to int) in sse2, 0armv7, wasmsimd, and hexagon. Also hexagon qfloats do not have a formal NaN or inf. You can find implementations for arm and intel in wasmsimd. The basic idea is compare if the exp is large enough that the value is already an integer. (exp >= 23?) This includes exp = 255 for NaN and inf. Meanwhile proceed with casting to int (with rounding modes for ties). The basic one is away from zero for ties. 2.5 rounds to 3.0 -2.5 rounds to -3.0 3.5 rounds to 4.0 -3.5 rounds to -4.0 And convert back to float Then use the comparison of exponent to select the original float or the float that was convert to/from int. Most cpus have a way to select between 2 vectors using a mask. On ARM Neon bitselect On SSE4 vblend_ps
See Also |https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/main/doc/rvv-intrinsic-spec.adoc#explicit-fp-rounding-mode-intrinsics
For Hexagon I implemented f32-vrndne using the Neon method, which is roundf() style. Its apparently not the intended rounding.. it should be nearbyintf(), but the method would be similar for RVV.
- Compare to 8 million and generate a predicate, representing values that need rounding.
- Cast to int and back to float with applicable rounding.
- Use predicate to keep the original value if it is more than 8 million (already an integer)
The simplier, more efficient way, depends on IEEE math implementing 'even' rounding. Simply add 8 million and subtract it. vrndne could use that. vrndu, vrndd, vrndz could use the neon/hexagon method.