Improved the precision (and probably speed) of cbrtf.

Open ZERICO2005 opened this issue 1 year ago • 4 comments

Minimum precision: 21.415037 bits at both +4.251052856e+02 and +7.969426579e+17 for cbrtf.

Precision is calculated as ilogb(fabs( true_cbrt(x) )) - log2(fabs( true_cbrt(x) - approx_cbrt(x) )), so a larger value means a more accurate result.

The previous method of powf(x, 1.0f / 3.0f) had a minimum precision of 21.0 bits at +4.037431946e+02 and of 19.299560 bits at +1.187444200e+07, slowly losing precision at larger magnitudes.

Precision was tested with 32-bit floats on x86_64.

ZERICO2005 avatar Oct 10 '24 23:10 ZERICO2005

I confirmed using the ez80sf tester that this does increase the number of passed tests to about 42% compared to the old function's 11% (when given non-negative finite inputs, since that's all the old function supported). It's not perfect, but it only claims to be 21.0 bits minimum precision so that's to be expected. I did notice for some huge inputs it returns NaN, though.

calc84maniac avatar Oct 11 '24 02:10 calc84maniac

powf(x, 1/3.f); is smaller tho

mateoconlechuga avatar Oct 11 '24 23:10 mateoconlechuga

> powf(x, 1/3.f); is smaller tho

It's also semantically different (especially for negative inputs), so if someone wants to save space (assuming they're already using powf elsewhere) they can call powf like you just specified.

calc84maniac avatar Oct 11 '24 23:10 calc84maniac

> powf(x, 1/3.f); is smaller tho

True, although it should probably be changed to this

if ( x == 0.0f || !isfinite(x) ) {
	return x;
}
return copysignf(powf(fabsf(x), 1/3.f), x);

ZERICO2005 avatar Oct 11 '24 23:10 ZERICO2005

I will close this pull request as calc84maniac has written an assembly version of this function.

ZERICO2005 avatar Nov 21 '24 22:11 ZERICO2005