Oscar Smith

Results: 353 comments of Oscar Smith

The `BFloat16(::BigFloat)` version of this will suffer from double rounding. I suggest looking at how the `Float16` conversion works in Julia.
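Double rounding can be demonstrated with the analogous built-in chain `Float64 → Float32 → Float16` (a hypothetical illustration, not code from the PR; the value is constructed to sit just above a rounding midpoint):

```julia
# A Float64 value just above the midpoint between two Float16 neighbors
# of 1.0: the tiny 2^-30 term is lost when rounding to Float32 first.
x = 1 + 2.0^-11 + 2.0^-30

direct  = Float16(x)           # Julia's direct conversion rounds correctly:
                               # x is above the midpoint, so 1 + 2^-10
chained = Float16(Float32(x))  # Float32(x) == 1 + 2^-11 exactly, an exact
                               # tie, which then rounds-to-even down to 1.0
```

The two results differ by one ulp; the chained version loses the sticky information that `x` was strictly above the tie.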

Yes, but getting the elementary functions correctly rounded isn't a requirement (or easy); correct rounding for arithmetic and conversion, on the other hand, is relatively easy.
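For arithmetic, the standard easy route is to compute in a sufficiently wide format and round once. A sketch using `Float16` in place of `BFloat16` (since the former is built in; `add16` is a hypothetical helper name):

```julia
# Correctly rounded Float16 addition: the Float64 sum of two Float16
# values is always exact (11-bit significands over Float16's exponent
# range fit comfortably in 53 bits), so the single final rounding back
# to Float16 is, by construction, a correct rounding of the true sum.
add16(a::Float16, b::Float16) = Float16(Float64(a) + Float64(b))
```

For example, `add16(Float16(1.0), Float16(2.0^-11))` hits an exact tie and correctly rounds to even, giving `Float16(1.0)`.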

As previously mentioned, this version has double rounding, which can be avoided by using the same algorithm the `Float16` conversion uses; still, this is better than not having the capability at all.
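Julia's own `Float16` conversion uses a table-driven method, but a simpler technique that also makes a two-step conversion safe is round-to-odd on the intermediate step. A sketch with `BigFloat → Float32 → Float16` standing in for the `BigFloat → BFloat16` case (`round_to_odd_f32` is a hypothetical helper, not from the PR):

```julia
# Round a BigFloat to Float32 with "round to odd": truncate toward zero,
# and if that was inexact, force the significand's last bit to 1. An odd
# intermediate significand can never sit exactly on a tie of the final,
# coarser rounding, so Float16(round_to_odd_f32(x)) rounds correctly.
function round_to_odd_f32(x::BigFloat)
    f = Float32(x, RoundToZero)
    if f != x  # inexact: make the significand odd (OR-ing 1 cannot carry)
        f = reinterpret(Float32, reinterpret(UInt32, f) | 0x00000001)
    end
    return f
end
```

On the double-rounding example `x = 1 + 2^-11 + 2^-30`, the odd intermediate keeps the final rounding above the tie, recovering the correct `Float16` result.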

Isn't this just 16 pairs of `evalpoly`s of degrees 1 to 16?

Yeah, my suggestion was a hybrid of this looped version with your unrolled version (where you precompute `cst` but otherwise leave it looped). I think that should vectorize well and...

In terms of speed, the generated function + `@nexprs` approach will be really good, and I don't think the cache pressure/load times will be much worse than the vectorized `evalpoly`s...
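A minimal sketch of the generated function + `@nexprs` pattern (the name and signature here are illustrative, not from the PR): the coefficient loop is unrolled at compile time, once per tuple length `N`.

```julia
using Base.Cartesian: @nexprs

# Horner evaluation with the loop fully unrolled by @nexprs. The
# generator runs once per N, splicing in a literal trip count, so the
# compiled method body is a straight chain of muladds with no loop.
@generated function horner_unrolled(x, c::NTuple{N,T}) where {N,T}
    quote
        r = c[$N]
        @nexprs $(N - 1) i -> (r = muladd(x, r, c[$N - i]))
        return r
    end
end
```

For `c = (1.0, 2.0, 3.0)` this expands to two `muladd`s and matches `evalpoly(x, c)` exactly.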

Can the Implicit Function Theorem be used here?

Is it also slower in the range where the old one was accurate?

Should this be merged, or is it outdated?

One thing worth noting is that although these techniques won't work for `Float64`, for `Float32` a lot of options open up, since you can do the internals in `Float64`...
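A hedged illustration of the wider-internals idea (a generic sketch, not the implementation under discussion; `sin32` is a hypothetical name): evaluate the whole function in `Float64` and round once at the end.

```julia
# Float32 sin via Float64 internals: Base's Float64 sin is accurate to
# well under 1 ulp in the wide format, so after the single rounding back
# to Float32 the result is faithfully rounded, and in practice correctly
# rounded for all but a tiny fraction of inputs.
sin32(x::Float32) = Float32(sin(Float64(x)))
```

The same trick works for any `Float32` elementary function with an accurate `Float64` counterpart; no analogous cheap wider format exists for `Float64` itself, which is why these techniques stop there.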