Specify correct rounding for sqrt
In accuracy.rst, correct rounding is required for add, sub, mul, and divide. I propose adding sqrt to this list. Note that sqrt is not listed later among the mathematical functions whose accuracy is not precisely defined.
Note that the IEEE Std 754-2019 for Floating-Point Arithmetic mandates correct rounding for squareRoot.
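To make "correctly rounded" concrete, here is a minimal sketch of a check for round-to-nearest square root using exact rational arithmetic. The helper name `is_correctly_rounded_sqrt` is hypothetical and not part of any specification or library; the sketch assumes Python 3.9+ (for `math.nextafter`) and a positive, finite input.

```python
import math
from fractions import Fraction


def is_correctly_rounded_sqrt(x: float, r: float) -> bool:
    """Return True if r is the float nearest to the exact sqrt(x).

    Hypothetical helper for illustration; assumes x > 0, r > 0, both finite.
    r is the nearest float exactly when sqrt(x) lies between the midpoints
    separating r from its two neighbouring floats. Squaring the midpoints
    lets us compare against x exactly, with no rounding involved.
    """
    below = Fraction(math.nextafter(r, 0.0))
    above = Fraction(math.nextafter(r, math.inf))
    lo = (below + Fraction(r)) / 2  # midpoint towards zero
    hi = (Fraction(r) + above) / 2  # midpoint towards infinity
    return lo * lo <= Fraction(x) <= hi * hi


if __name__ == "__main__":
    for x in (2.0, 3.0, 0.1, 1e-300, 1e300):
        print(x, is_correctly_rounded_sqrt(x, math.sqrt(x)))
```

On a platform whose `math.sqrt` maps to an IEEE 754 squareRoot, every line should report `True`; perturbing the result by one ulp makes the check fail.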
Is this something that's actually the case for existing libraries/devices? If so, there shouldn't be an issue with adding it.
By the way, is your proposal for this to apply to both real and complex inputs or just to real inputs?
Excellent point! My proposal applies to real inputs only. Given industry adoption of IEEE Std 754-2019, existing libraries and devices do provide a correctly rounded square root in practice.
Sometimes there is an option for less accuracy (and presumably higher performance). For example, in single precision CUDA gives a correctly rounded square root only when compiled with -prec-sqrt=true, and correctly rounded division only with -prec-div=true. Since this document requires correctly rounded division, it ought equally to require correctly rounded square root. [Double precision CUDA divide and square root are always correctly rounded. See the "Mathematical Functions" section of the "CUDA C++ Programming Guide".]
I believe the existing accuracy requirements for element-wise arithmetic operations are intended to apply only to real inputs. For example, the usual 6-flop implementation of complex multiplication
(a + bi)(c + di) = (ac - bd) + (ad + bc)i
does not result in correctly rounded real and imaginary components. I do not believe this array standard means to prohibit this implementation, but this should be clarified.
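As a small illustration of the point about complex multiplication, the sketch below compares the 6-flop formula, evaluated in double precision, with the exact product rounded once per component. The function names are hypothetical, and the inputs are chosen by hand so that cancellation in ac - bd exposes the difference.

```python
from fractions import Fraction


def naive_complex_mul(a, b, c, d):
    """6-flop product (a + bi)(c + di): four multiplies, two add/subtracts."""
    return (a * c - b * d, a * d + b * c)


def rounded_exact_complex_mul(a, b, c, d):
    """Compute each component exactly as a rational, then round once to float."""
    A, B, C, D = (Fraction(v) for v in (a, b, c, d))
    return (float(A * C - B * D), float(A * D + B * C))


# Hand-picked inputs: a*c = 1 + 2**-26 + 2**-54 exactly, but the 2**-54 term
# is lost when a*c is rounded to double, before b*d = 1 is subtracted.
a = c = 1.0 + 2.0**-27
b = d = 1.0

naive = naive_complex_mul(a, b, c, d)
exact = rounded_exact_complex_mul(a, b, c, d)

if __name__ == "__main__":
    print("naive real part:", naive[0].hex())
    print("exact real part:", exact[0].hex())
    print("real parts agree:", naive[0] == exact[0])
```

For these inputs the naive real part is 2**-26, while the correctly rounded real part is 2**-26 + 2**-54; the imaginary parts agree.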
> I do not believe this array standard means to prohibit this implementation
Yes, I think this is correct. As the specification advocates for IEEE 754 compliance (with some caveats; e.g., subnormals) and IEEE 754 only applies to reals, we don't have an equivalent mandate for complex number operations. Accordingly, we should make this distinction explicit.
Re: sqrt. Yes, I think this is a reasonable recommendation. The main point for the arithmetic operations being correctly rounded is to limit error accumulation. As sqrt is a fundamental operation--and more fundamental than the various transcendentals--I agree that requiring correctly rounded behavior is reasonable and should be added.
For those interested in the accuracy of complex operators/functions, my colleagues and I have done a comparison with different compilers/libraries: complex.pdf
PR: https://github.com/data-apis/array-api/pull/882
By default, the CUDA compiler sets -prec-div=true, -prec-sqrt=true, and -ftz=false.
https://docs.nvidia.com/cuda/floating-point/index.html#compiler-flags
The CuPy library is compiled with -ftz=true (overriding the default for this particular flag).
Thanks, @leofang, for this info.
Note that an IEEE square root can never produce a subnormal result: even the square root of the smallest positive subnormal lies well within the normal range. So a hardware mode that flushes subnormal results to zero is irrelevant to sqrt.
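A quick sanity check of the subnormal claim in double precision, as a sketch using only the Python standard library. The smallest positive subnormal double is 2**-1074, so its square root is 2**-537, far above the smallest normal double 2**-1022.

```python
import math
import sys

smallest_subnormal = 5e-324             # 2**-1074, smallest positive double
smallest_normal = sys.float_info.min    # 2**-1022, smallest positive normal double

root = math.sqrt(smallest_subnormal)

if __name__ == "__main__":
    print("sqrt of smallest subnormal:", root.hex())
    print("result is normal:", root >= smallest_normal)
```

Since sqrt is monotonic, every larger positive input also yields a normal result. This only concerns subnormal results; it says nothing about modes that treat subnormal inputs as zero.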