awkward
awkward copied to clipboard
np.real, np.imag, np.round, np.around can't take an Awkward Array as input
Version of Awkward Array
HEAD
Description and code to reproduce
This is more like a strange "gotcha" of NumPy, but np.real
and np.imag
are not ufuncs, unlike, for example, np.conjugate.
>>> np.real
<function real at 0x7e73131d8930>
>>> np.imag
<function imag at 0x7e73131d8b30>
>>> np.conjugate
<ufunc 'conjugate'>
Therefore, it would require some special handling to make it work in Awkward, but not much. Here's what happens now:
>>> np.real(ak.Array([[1+0.1j, 2+0.2j, 3+0.3j], [], [4+0.4j, 5+0.5j]]))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/jpivarski/irishep/awkward/src/awkward/highlevel.py", line 1527, in __array_function__
return ak._connect.numpy.array_function(
File "/home/jpivarski/irishep/awkward/src/awkward/_connect/numpy.py", line 109, in array_function
rectilinear_args = tuple(_to_rectilinear(x, backend) for x in args)
File "/home/jpivarski/irishep/awkward/src/awkward/_connect/numpy.py", line 109, in <genexpr>
rectilinear_args = tuple(_to_rectilinear(x, backend) for x in args)
File "/home/jpivarski/irishep/awkward/src/awkward/_connect/numpy.py", line 78, in _to_rectilinear
return layout.to_backend(backend).to_backend_array(allow_missing=True)
File "/home/jpivarski/irishep/awkward/src/awkward/contents/content.py", line 1021, in to_backend_array
return self._to_backend_array(allow_missing, backend)
File "/home/jpivarski/irishep/awkward/src/awkward/contents/listoffsetarray.py", line 2078, in _to_backend_array
return self.to_RegularArray()._to_backend_array(allow_missing, backend)
File "/home/jpivarski/irishep/awkward/src/awkward/contents/listoffsetarray.py", line 288, in to_RegularArray
self._backend.maybe_kernel_error(
File "/home/jpivarski/irishep/awkward/src/awkward/_backends/backend.py", line 67, in maybe_kernel_error
raise ValueError(self.format_kernel_error(error))
ValueError: cannot convert to RegularArray because subarray lengths are not regular (in compiled code: https://github.com/scikit-hep/awkward/blob/awkward-cpp-26/awkward-cpp/src/cpu-kernels/awkward_ListOffsetArray_toRegularArray.cpp#L22)
It's trying to auto-convert the Awkward Array into a NumPy array to use the function, and it can't because this one is ragged.
A workaround for np.real(a)
:
(a + np.conjugate(a)) / 2
and a workaround for np.imag(a)
:
(a - np.conjugate(a)) / 2j
The Array API spec says that these should have floating point type, which can be determined from the complex a.dtype
using
.astype(f"f{a.dtype.itemsize // 2}")
if it weren't for the
ComplexWarning: Casting complex values to real discards the imaginary part
That's not something that can be fixed with np.errstate...
(It's not necessary to self-assign! I'm just logging these as I go. :)
I found another one: np.round
is not a ufunc, but you'd think it is.
>>> np.round
<function round at 0x7e6a458334b0>
I'll have a work-around for this in just a moment.
Given a
,
a = np.array([1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, 8.8, 9.9])
use np.modf, np.greater_equal, and np.add:
frac, whole = np.modf(a)
whole + (frac >= 0.5)
The Array API wants the output dtype to be the same as the input dtype, so it's fine that the result is floating-point. The new API also doesn't have a decimals
argument (which might be why it's not a ufunc).
This function needs special handling if the inputs are complex—it must be split into real and imaginary parts, the above procedure must be applied to those parts, and then the result is recombined into complex numbers.
NumPy's old API and its implementation of the new API both seem to be making a mistake for complex numbers:
>>> import numpy as np
>>> import numpy.array_api as xp
>>> np.round(np.asarray([1.1+0.1j, 2.2+0.2j, 5.5+0.5j, 6.6+0.6j]))
array([1.+0.j, 2.+0.j, 6.+0.j, 7.+1.j])
>>> xp.round(xp.asarray([1.1+0.1j, 2.2+0.2j, 5.5+0.5j, 6.6+0.6j]))
Array([1.+0.j, 2.+0.j, 6.+0.j, 7.+1.j], dtype=complex128)
I think that 5.5+0.5j
should round to 6+1j
, not 6+0j
. I'll report it.
Nope:
For values exactly halfway between rounded decimal values, NumPy rounds to the nearest even value. Thus 1.5 and 2.5 round to 2.0, -0.5 and 0.5 round to 0.0, etc.
(The Array API documentation should specify things like this!)
My original formulation is also wrong for negative numbers.
Got it: given a
>>> a = np.array([-2.5, -1.5, -0.6, -0.5, -0.4, 0.4, 0.5, 0.6, 1.5, 2.5])
>>> np.round(a)
array([-2., -2., -1., -0., -0., 0., 0., 1., 2., 2.])
>>> frac, whole = np.modf(a)
>>> abs_frac = np.absolute(frac)
>>> whole + ((abs_frac == 0.5) * (whole % 2 != 0) + (abs_frac > 0.5)) * np.sign(frac)
array([-2., -2., -1., -0., -0., 0., 0., 1., 2., 2.])
And then complex numbers are computed by applying this independently to the real and imaginary parts.
Edit: made another correction, so that this works for absolute-value fractions equal to 0.5 and not equal to 0.5.
https://github.com/data-apis/array-api/issues/726
For round, you can use np.around
Although np.around is not np.round
, (despite the np.around documentation saying that it's an alias), it's also not a ufunc:
>>> np.around
<function around at 0x763385fef6b0>
>>> isinstance(np.round, np.ufunc)
False
>>> isinstance(np.around, np.ufunc)
False
and so Awkward would need the same NEP-18 overload for np.around
as it does for np.round
. I'll add that to the title.
Ah now I remember - IIRC at one point I patched awkward to implement the round ufunc, but you could only overload around
, not round
. I don't know if that's still the case!
That's strange, since they're both not ufuncs. I would have thought that both of them would need to have a NEP-18 style overload (__array_function__
).
I added support for angle
along with real
and imag
in the branch 2917-npreal-npimag-npround-nparound-cant-take-an-awkward-array-as-input. I have not implemented real_if_close
yet, which has the same issues.
Looking into round
and around
. Indeed the documentation claims that round and around are aliases.
What is our policy for supporting numpy functions? real
and imag
(and abs
which is a ufunc) are part of the array API standard, while angle
is not. round
is, around
is not.
round
is related to ceil, fix, floor, rint, and trunc (according to the docstring for round). Of those, only round and fix are not ufuncs.
Should we support round
, but not around
or fix
? Do we draw a line somewhere between numpy operations that we do and do not intend to support? IMO, of these functions, floor and ceil are generally useful while the others are probably quite niche.
My view is that we can do this iteratively. It's definitely useful to have real
, imag
, and round
. With ceil
and floor
, we then have a fairly comprehensive set of float manipulation routines. I'd be happy to stop there, and revisit if we get a feature report.