swift-numerics icon indicating copy to clipboard operation
swift-numerics copied to clipboard

`root` and `pow` don't handle `Int` arguments quite right

Open stephentyrone opened this issue 5 years ago • 5 comments

At present, the implementations of these operations convert their Int arguments to the floating-point type. This works fine for small integers (the common case), but isn't quite right for large integers (which will round on conversion). This is especially problematic because it can convert odd integers to even integers, which will change the sign of results when the floating-point argument is negative.

stephentyrone avatar Jun 25 '19 18:06 stephentyrone

Here’s one possible approach for pow:

static func pow(_ x: Float, _ n: Int) -> Float {
  if let p = Float(exactly: n) {
    return pow(x, p)
  }
  
  let p = (n < 0) ? Float(n).nextUp : Float(n).nextDown
  return pow(x, p) * pow(x, n - Int(p))
}

NevinBR avatar Nov 24 '19 01:11 NevinBR

…just dropping that implementation in, however, gives:

error: inlining 'transparent' functions forms circular loop

The last call, pow(x, n - Int(p)), is indeed recursive, so it would either need to be re-written as a loop, or the function would have to be made non-@_transparent, or the recursive part could be moved into a separate helper function.

NevinBR avatar Nov 24 '19 20:11 NevinBR

pow fixed here: https://github.com/apple/swift-numerics/pull/83

Keeping this open for root.

stephentyrone avatar Nov 30 '19 04:11 stephentyrone

Some ideas for root:

1. Use a higher-precision type for the calculation, then convert back to Float at the end.

2. Take the significand bits of r = 1/Float(n) as an integer, and full-width multiply by n. Subtracting this from the appropriate power of 2 gives the error of r in nths of an ulp. It should be sufficient to convert both numerator and denominator of that error to Float and divide to get an approximation of the actual error δ. Now x1/n = xr+δ = xrxδ = pow(x, r) * pow(x, δ).

3. Similar to 2, but recognizing δ will often be so small that xδ is nearly 1, hence representing it as Float will lose most of the precision. It would be better to calculate ε = xδ-1 and use that to find xrxδ = xr(1 + ε) = xr + xrε. One way to do so is by expm1(log(x)*δ).

NevinBR avatar Dec 02 '19 03:12 NevinBR

root doesn't have issues the same way that pow did; it can be a few ulps off, but will never be catastrophically wrong.

I would like to make it sub-ulp accurate in the future (in particular, this means that exact powers get exact answers), but that can't be done without building our own implementation of exp and log (some host libc's don't provide that level of accuracy; even Darwin doesn't for Float80). Without having the necessary extra-precise building blocks in place, using xʳ + xʳε doesn't let you get a sub-ulp result anyway.

stephentyrone avatar Dec 02 '19 14:12 stephentyrone