
Mechanism to find complex dtype info

Open honno opened this issue 2 years ago • 15 comments

When introducing complex dtypes and updating/introducing related functions (#373), I wonder if we'd want to introduce a cinfo() function à la iinfo()/finfo(). What would its resulting info object look like?

I don't see any array/tensor library that implements exactly cinfo(), although there might be an equivalent I'm missing. For NumPy, say, was there just no demand for such a function, and/or was the need partly served by finfo()?

>>> np.finfo(np.complex64)
finfo(resolution=1e-06, min=-3.4028235e+38, max=3.4028235e+38, dtype=float32)
>>> assert np.finfo(np.complex64) == np.finfo(np.float32)

honno avatar May 16 '22 18:05 honno

Sorry if this is a bit off topic, but I wonder if instead of having different functions like this, we could have just one function that was type agnostic like tinfo or similar. This would save needing to check type and call the right one when determining things like the range of values or similar.

jakirkham avatar May 16 '22 19:05 jakirkham

> Sorry if this is a bit off topic, but I wonder if instead of having different functions like this, we could have just one function that was type agnostic like tinfo or similar. This would save needing to check type and call the right one when determining things like the range of values or similar.

My impression is this would be awkward, as float and complex dtypes carry additional information compared to ints. So if you don't want polymorphic returns from a universal function (previous discussions about unique() suggest a hard no), you might want a function (or separate functions) dedicated to bounds and bit-size. Personally, whenever I've wanted to know bounds/size for code that deals with both ints and floats, I'm often writing additional logic for float scenarios (NaN branches 🙃), so I would be checking the dtype family anyway.
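To illustrate the dtype-family check described above, here is a minimal sketch assuming NumPy. The helper name `value_bounds` is hypothetical and not part of NumPy or the spec.

```python
import numpy as np

def value_bounds(dtype):
    """Return (min, max) for an integer or real floating-point dtype.

    Hypothetical helper: it has to branch on the dtype family because
    NumPy exposes bounds through two functions, iinfo() and finfo().
    """
    dtype = np.dtype(dtype)
    if np.issubdtype(dtype, np.integer):
        info = np.iinfo(dtype)
    elif np.issubdtype(dtype, np.floating):
        info = np.finfo(dtype)
    else:
        # bool and complex dtypes land here: no ordering-based bounds
        raise TypeError(f"no bounds defined for dtype {dtype}")
    return info.min, info.max

int_lo, int_hi = value_bounds(np.int32)
float_lo, float_hi = value_bounds(np.float32)
```

Callers that also need NaN handling for floats would still branch on the dtype family themselves, which is the point made above.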

honno avatar May 17 '22 17:05 honno

Here are the results for an integer and a floating-point type from NumPy (line 1 was the import):

In [2]: np.iinfo(np.int32)
Out[2]: iinfo(min=-2147483648, max=2147483647, dtype=int32)

In [3]: np.finfo(np.float32)
Out[3]: finfo(resolution=1e-06, min=-3.4028235e+38, max=3.4028235e+38, dtype=float32)

Nearly all of the information is shared other than resolution. In fact resolution could even be specified for integers for simplicity (as it would just be 1).

Anyway, it's unclear to me why having one function would be challenging, though please feel free to clarify.
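A rough sketch of what such a unified function could look like, assuming NumPy. Both `tinfo` and `TypeInfo` are hypothetical names used only for illustration; nothing like this exists in NumPy or the spec.

```python
from dataclasses import dataclass
from typing import Union

import numpy as np

Number = Union[int, float, np.generic]

@dataclass(frozen=True)
class TypeInfo:
    """Hypothetical type-agnostic info object."""
    bits: int
    min: Number
    max: Number
    resolution: Number
    dtype: np.dtype

def tinfo(dtype):
    """Hypothetical single entry point wrapping np.iinfo() and np.finfo()."""
    dtype = np.dtype(dtype)
    if np.issubdtype(dtype, np.integer):
        info = np.iinfo(dtype)
        # Assumption from the comment above: integers get resolution 1
        return TypeInfo(info.bits, info.min, info.max, 1, dtype)
    if np.issubdtype(dtype, np.floating):
        info = np.finfo(dtype)
        return TypeInfo(info.bits, info.min, info.max, info.resolution, dtype)
    raise TypeError(f"unsupported dtype {dtype}")

int_info = tinfo(np.int32)
float_info = tinfo(np.float32)
```

The sketch covers only the attributes shown in the reprs above; any float-only attribute would need a placeholder value for integers.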

jakirkham avatar May 17 '22 17:05 jakirkham

> Nearly all of the information is shared other than resolution. In fact resolution could even be specified for integers for simplicity (as it would just be 1).

Ah resolution=1 seems like a clean solution, although with the spec it's called eps (still feels correct, just not as semantically nice).

Note the np.finfo() repr hides a lot of additional info NumPy finds, although most of that we can ignore. smallest_normal however is an attribute that both np.finfo() and xp.finfo() share—is there a nice solution for smallest_normal here?

honno avatar May 17 '22 17:05 honno

It's confusing, John, they have quite different attributes: https://data-apis.org/array-api/latest/API_specification/generated/signatures.data_type_functions.iinfo.html https://data-apis.org/array-api/latest/API_specification/generated/signatures.data_type_functions.finfo.html

leofang avatar May 17 '22 17:05 leofang

smallest_normal could also be 1 for integers

We could also make these things None if we prefer that

jakirkham avatar May 17 '22 18:05 jakirkham

Leo, those show the same attributes as the example above. All they add is bits (which they both have) and smallest_normal (which we are now discussing).

jakirkham avatar May 17 '22 18:05 jakirkham

OK perhaps I should've linked to the corresponding NumPy pages? My point is the returned attributes could increase and further diverge in the future, just like the status quo in NumPy. I feel this unifying discussion would make our life harder when it comes to extensibility in the future.

btw, to clarify:

> Ah resolution=1 seems like a clean solution, although with the spec it's called eps (still feels correct, just not as semantically nice).

eps and resolution have different meanings in NumPy.
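A short check of the difference, assuming NumPy: `eps` is the gap between 1.0 and the next representable value, while `resolution` is the approximate decimal resolution, `10**-precision`.

```python
import numpy as np

info = np.finfo(np.float32)

eps = info.eps                # 2**-23 for float32, about 1.19e-07
resolution = info.resolution  # 10**-info.precision, about 1e-06 for float32

# Adding eps to 1.0 gives the next representable float32 above 1.0
next_after_one = np.float32(1.0) + eps

print(eps, resolution, info.precision)
```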

leofang avatar May 18 '22 03:05 leofang

Not clear to me what info would belong on cinfo. For example, what should eps be for complex numbers? What would min and max be given that complex numbers don't have a natural ordering? Same for smallest_normal?

kgryte avatar Jun 02 '22 16:06 kgryte

I'd keep at least bits and eps for cinfo, but definitely exclude min and max, because we don't want to follow NumPy in defining awkward comparison/sort capabilities over complex-valued arrays (IIRC this was discussed a long time ago in one of the threads).

leofang avatar Jun 06 '22 20:06 leofang

How about we have bound-y attributes per component, i.e. {real/imag}_{min/max}, {real/imag}_eps and {real/imag}_smallest_normal? Or just attributes for both the real and imag components (which should be the same), i.e. component_{min/max}, component_eps and component_smallest_normal.

Both of those options do feel weird, but having an array library tell you the bounds can be pretty useful.
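A minimal sketch of the second option (shared `component_*` attributes), assuming NumPy. `cinfo` and `ComplexInfo` are hypothetical; no array library discussed here implements them. The sketch relies on `np.finfo()` returning the real component's info for complex dtypes, as in the first post.

```python
from dataclasses import dataclass

import numpy as np

@dataclass(frozen=True)
class ComplexInfo:
    """Hypothetical info object with per-component attributes."""
    bits: int                              # size of the whole complex number
    component_min: np.floating
    component_max: np.floating
    component_eps: np.floating
    component_smallest_normal: np.floating
    dtype: np.dtype

def cinfo(dtype):
    """Hypothetical cinfo() built on top of np.finfo()."""
    dtype = np.dtype(dtype)
    if not np.issubdtype(dtype, np.complexfloating):
        raise TypeError(f"{dtype} is not a complex dtype")
    f = np.finfo(dtype)  # info for the underlying real type
    return ComplexInfo(
        bits=dtype.itemsize * 8,
        component_min=f.min,
        component_max=f.max,
        component_eps=f.eps,
        component_smallest_normal=f.tiny,  # tiny is an alias of smallest_normal
        dtype=dtype,
    )

c64 = cinfo(np.complex64)
```

No `min`/`max` for the complex value itself appears here, since complex numbers have no natural ordering.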

honno avatar Jun 23 '22 17:06 honno

Based on feedback in the most recent consortium meeting (2022-06-23), the plan is to open an issue on the NumPy issue tracker to gauge appetite there before moving forward adding cinfo to the specification.

The main argument for adding cinfo to the standard is consistency and thoroughness. However, it's hard to justify adding it to the spec at the moment without any real-world array library implementations. We'd like to see an array library add a cinfo object first, hence opening an issue on NumPy proposing the addition.

kgryte avatar Jun 23 '22 18:06 kgryte

I think we forgot to delegate someone to do this, so I've just written up an issue at https://github.com/numpy/numpy/issues/22260

honno avatar Sep 14 '22 17:09 honno

How do folks feel about changing xp.finfo() so it supports complex dtypes? I think https://github.com/numpy/numpy/issues/22260#issuecomment-1247387732 makes a sound argument:

> cinfo() does not sound very useful to me. Complex types are a composite type, composed of two instances of a real type. When I try to think of actual use-cases for cinfo(), I end up wanting the information of the underlying real type. So I think the current behavior of finfo() is reasonable, and makes cinfo() unnecessary. Of course, this behavior of finfo() should be documented.

https://github.com/numpy/numpy/pull/22263 has indeed now added documented complex support in numpy.finfo().
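As a small illustration of that behavior, assuming NumPy: code that needs the precision of a complex array's components can pass the complex dtype straight to `finfo()`. The helper name `component_eps` is hypothetical.

```python
import numpy as np

def component_eps(x):
    """Machine epsilon of the real type underlying x's dtype.

    Hypothetical helper: works for real and complex floating-point
    arrays alike, because np.finfo() accepts both.
    """
    return np.finfo(x.dtype).eps

z = np.array([1 + 2j, 3 - 4j], dtype=np.complex64)
eps = component_eps(z)

# The returned info object describes the component type, not the complex type
info_dtype = np.finfo(np.complex128).dtype
```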

honno avatar Sep 19 '22 05:09 honno

@honno I suggest adding your proposal to the next Array API meeting agenda.

kgryte avatar Sep 19 '22 05:09 kgryte

@honno is there anything left to do here after gh-484?

rgommers avatar Oct 07 '22 10:10 rgommers

> @honno is there anything left to do here after gh-484?

Nope—forgot to link this issue to that merged PR, so I'll close this.

honno avatar Oct 07 '22 10:10 honno