array-api Mechanism to find complex dtype info

When introducing complex dtypes and updating/introducing related functions (#373), I wonder if we'd want to introduce a cinfo() function à la iinfo()/finfo(). What would its resulting info object look like?

I don't see any array/tensor library which implements exactly cinfo(), although there might be an equivalent I'm missing. For NumPy, say, was there just no demand for such a function, and/or was the need somewhat served by finfo()?

>>> np.finfo(np.complex64)
finfo(resolution=1e-06, min=-3.4028235e+38, max=3.4028235e+38, dtype=float32)
>>> assert np.finfo(np.complex64) == np.finfo(np.float32)

May 16 '22 18:05 honno

Sorry if this is a bit off topic, but I wonder if instead of having different functions like this, we could have just one function that was type agnostic like tinfo or similar. This would save needing to check type and call the right one when determining things like the range of values or similar.

May 16 '22 19:05 jakirkham

> Sorry if this is a bit off topic, but I wonder if instead of having different functions like this, we could have just one function that was type agnostic like tinfo or similar. This would save needing to check type and call the right one when determining things like the range of values or similar.

My impression is this would be awkward, as float and complex dtypes carry information that ints don't have. So if you don't want polymorphic returns from a universal function (the earlier unique() discussions suggest a hard no), you might want a function (or separate functions) dedicated to bounds and bit size. Personally, whenever I've wanted to know bounds/size for code that deals with both ints and floats, I'm often writing additional logic for float scenarios (NaN branches 🙃), so I would be checking the dtype family anyway.

May 17 '22 17:05 honno
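
For illustration only (not code from the thread): a minimal sketch of the "check the dtype family, then call the right function" pattern described above, written against NumPy. The helper name `dtype_bounds` is hypothetical and is not part of NumPy or the array API standard.

```python
import numpy as np

def dtype_bounds(dtype):
    """Return (min, max) for an integer or real floating-point dtype.

    Hypothetical helper: it only wraps np.iinfo() and np.finfo().
    """
    dtype = np.dtype(dtype)
    if np.issubdtype(dtype, np.integer):
        info = np.iinfo(dtype)
    elif np.issubdtype(dtype, np.floating):
        info = np.finfo(dtype)
    else:
        # bool and complex dtypes have no ordered bounds in this sketch
        raise TypeError(f"no bounds defined for dtype {dtype}")
    return info.min, info.max

print(dtype_bounds(np.int32))
print(dtype_bounds(np.float32))
```

A bounds-only helper like this sidesteps the polymorphic-return concern because every supported dtype yields the same two fields.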

Here are the results for an integer and a floating-point type from NumPy (line 1 was the import):

In [2]: np.iinfo(np.int32)
Out[2]: iinfo(min=-2147483648, max=2147483647, dtype=int32)

In [3]: np.finfo(np.float32)
Out[3]: finfo(resolution=1e-06, min=-3.4028235e+38, max=3.4028235e+38, dtype=float32)

Nearly all of the information is shared other than resolution. In fact resolution could even be specified for integers for simplicity (as it would just be 1).

Anyway, it's unclear to me why having one function would be challenging, though please feel free to clarify.

May 17 '22 17:05 jakirkham

> Nearly all of the information is shared other than resolution. In fact resolution could even be specified for integers for simplicity (as it would just be 1).

Ah resolution=1 seems like a clean solution, although with the spec it's called eps (still feels correct, just not as semantically nice).

Note the np.finfo() repr hides a lot of additional info NumPy finds, although most of that we can ignore. smallest_normal, however, is an attribute that both np.finfo() and xp.finfo() share. Is there a nice solution for smallest_normal here?

May 17 '22 17:05 honno

It's confusing, John; they have quite different attributes:
https://data-apis.org/array-api/latest/API_specification/generated/signatures.data_type_functions.iinfo.html
https://data-apis.org/array-api/latest/API_specification/generated/signatures.data_type_functions.finfo.html

May 17 '22 17:05 leofang

smallest_normal could also be 1 for integers

We could also make these things None if we prefer that

May 17 '22 18:05 jakirkham
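
For illustration only (not code from the thread): a sketch of what a single type-agnostic function could look like if float-only attributes were set to None for integer dtypes, as suggested above. `TypeInfo` and `tinfo` are hypothetical names, and the field names are borrowed from the iinfo()/finfo() attributes mentioned in this discussion. It assumes a NumPy version whose finfo exposes `smallest_normal`.

```python
from typing import NamedTuple, Optional, Union

import numpy as np

class TypeInfo(NamedTuple):
    """Hypothetical unified info object; not part of any library."""
    bits: int
    min: Union[int, float]
    max: Union[int, float]
    eps: Optional[float]              # None for integer dtypes
    smallest_normal: Optional[float]  # None for integer dtypes

def tinfo(dtype) -> TypeInfo:
    dtype = np.dtype(dtype)
    if np.issubdtype(dtype, np.integer):
        i = np.iinfo(dtype)
        return TypeInfo(i.bits, i.min, i.max, None, None)
    if np.issubdtype(dtype, np.floating):
        f = np.finfo(dtype)
        return TypeInfo(f.bits, float(f.min), float(f.max),
                        float(f.eps), float(f.smallest_normal))
    raise TypeError(f"unsupported dtype {dtype}")

print(tinfo(np.int8))
print(tinfo(np.float64))
```

Callers still have to handle None for the float-only fields, which is the trade-off against keeping two separate functions.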

Leo, those show the same attributes as the example above. All they add is bits (which they both have) and smallest_normal (which we are now discussing).

May 17 '22 18:05 jakirkham

OK, perhaps I should've linked to the corresponding NumPy pages? My point is that the returned attributes could grow and further diverge in the future, just like the status quo in NumPy. I feel that unifying them would make our lives harder when it comes to extensibility in the future.

btw, to clarify:

> Ah resolution=1 seems like a clean solution, although with the spec it's called eps (still feels correct, just not as semantically nice).

eps and resolution have different meanings in NumPy.

May 18 '22 03:05 leofang
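
For illustration only (not code from the thread): a quick check in NumPy of the difference between eps and resolution for float32. Here eps is the gap between 1.0 and the next representable value, while resolution is the approximate decimal resolution, 10**-precision.

```python
import numpy as np

info = np.finfo(np.float32)

# eps: spacing between 1.0 and the next larger float32, i.e. 2**-23
print(info.eps)

# resolution: 10**-precision, where precision is the approximate number
# of decimal digits the type can represent reliably
print(info.precision, info.resolution)
```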

Not clear to me what info would belong on cinfo. For example, what should eps be for complex numbers? What would min and max be given that complex numbers don't have a natural ordering? Same for smallest_normal?

Jun 02 '22 16:06 kgryte

I'd keep at least bits and eps for cinfo, but definitely exclude min and max, because we don't want to follow NumPy in defining awkward comparison/sort capabilities over complex-valued arrays (IIRC this was discussed a long time ago in one of the threads).

Jun 06 '22 20:06 leofang

How about we have bound-y attributes per component, i.e. {real/imag}_{min/max}, {real/imag}_eps and {real/imag}_smallest_normal? Or just attributes for both the real and imag components (which should be the same), i.e. component_{min/max}, component_eps and component_smallest_normal.

Both of those options do feel weird, but having an array library tell you the bounds can be pretty useful.

Jun 23 '22 17:06 honno
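
For illustration only (not code from the thread): a sketch of the second option above, a single set of component_* attributes shared by the real and imaginary parts. `ComplexInfo` and `cinfo` are hypothetical names. The sketch relies on np.finfo() accepting a complex dtype and reporting on its component float type, as in the np.complex64 example at the top of this thread. Treating bits as the size of the whole complex value is an assumption.

```python
from dataclasses import dataclass

import numpy as np

@dataclass(frozen=True)
class ComplexInfo:
    """Hypothetical per-component info for a complex dtype."""
    bits: int                  # assumption: size of the full complex value
    component_dtype: np.dtype  # float dtype of the real/imag parts
    component_min: float
    component_max: float
    component_eps: float
    component_smallest_normal: float

def cinfo(dtype) -> ComplexInfo:
    dtype = np.dtype(dtype)
    if not np.issubdtype(dtype, np.complexfloating):
        raise TypeError(f"{dtype} is not a complex dtype")
    comp = np.finfo(dtype)  # info for the component float type
    return ComplexInfo(
        bits=dtype.itemsize * 8,
        component_dtype=comp.dtype,
        component_min=float(comp.min),
        component_max=float(comp.max),
        component_eps=float(comp.eps),
        component_smallest_normal=float(comp.smallest_normal),
    )

print(cinfo(np.complex64))
```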

Based on feedback in the most recent consortium meeting (2022-06-23), the plan is to open an issue on the NumPy issue tracker to gauge appetite there before moving forward with adding cinfo to the specification.

The main argument for adding cinfo to the standard is consistency and thoroughness. However, it is hard to justify adding it to the spec at the moment without any real-world array library implementations. We'd like to see an array library add a cinfo object first, hence opening an issue on NumPy proposing the addition.

Jun 23 '22 18:06 kgryte

Think we forgot to delegate someone to do this, so just now wrote an issue at https://github.com/numpy/numpy/issues/22260

Sep 14 '22 17:09 honno

How do folks feel about changing xp.finfo() so it supports complex dtypes? I think https://github.com/numpy/numpy/issues/22260#issuecomment-1247387732 makes a sound argument:

> cinfo() does not sound very useful to me. Complex types are a composite type, composed of two instances of a real type. When I try to think of actual use-cases for cinfo(), I end up wanting the information of the underlying real type. So I think the current behavior of finfo() is reasonable, and makes cinfo() unnecessary. Of course, this behavior of finfo() should be documented.

https://github.com/numpy/numpy/pull/22263 has indeed now added documented complex support in numpy.finfo().

Sep 19 '22 05:09 honno
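
For illustration only (not code from the thread): a sketch of the kind of use case the quoted comment describes, where code handling both real and complex floating-point arrays only needs information about the underlying real type. `component_eps` is a hypothetical helper name. It relies on np.finfo() accepting complex dtypes.

```python
import numpy as np

def component_eps(x):
    """Machine epsilon of the real component type of a real or complex float array."""
    return float(np.finfo(x.dtype).eps)

z = np.array([1 + 2j], dtype=np.complex64)
r = np.array([1.0], dtype=np.float32)

# complex64 is built from two float32 values, so both report the same eps
print(component_eps(z) == component_eps(r))
```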

@honno I suggest adding your proposal to the next Array API meeting agenda.

Sep 19 '22 05:09 kgryte

@honno is there anything left to do here after gh-484?

Oct 07 '22 10:10 rgommers

> @honno is there anything left to do here after gh-484?

Nope, I forgot to link this issue to that merged PR, so I'll close this.

Oct 07 '22 10:10 honno
