Allow libraries to negotiate that they understand e.g. NumPy arrays?
Today dask came up a few times, and also the possibility of a fallback to `__array_function__`, at least for transitioning.
This must have come up before, but did we ever discuss an API like:
```python
class myarray:
    def __get_namespace__(self, types):
        for t in types:
            if t is myarray:
                continue
            if t is np.ndarray:  # we know about ndarray!
                continue
            return NotImplemented
        return mynamespace
```
This is not really important if you think about NumPy, JAX, torch, TensorFlow: I don't think you should mix those! But it seems it becomes more interesting once we start talking about dask or quantities, which are designed as wrappers?
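To make the idea concrete, here is a minimal, runnable sketch of how a consumer-side helper could resolve a common namespace from mixed inputs via the `__get_namespace__` protocol above. All names here (`get_namespace`, the protocol itself) are assumptions from this proposal, not an adopted API, and `mynamespace` is stood in by `numpy` so the example runs:

```python
import numpy as np


class myarray:
    def __get_namespace__(self, types):
        # Accept only types we know how to interoperate with.
        for t in types:
            if t is myarray:
                continue
            if t is np.ndarray:  # we know about ndarray!
                continue
            return NotImplemented
        return np  # stand-in for "mynamespace"


def get_namespace(*arrays):
    """Hypothetical helper: ask each input, left to right, for a
    namespace covering all the input types."""
    types = tuple({type(a) for a in arrays})
    for a in arrays:
        hook = getattr(type(a), "__get_namespace__", None)
        if hook is None:
            continue  # e.g. plain ndarray defines no such hook here
        result = hook(a, types)
        if result is not NotImplemented:
            return result
    raise TypeError(f"no common namespace for {types!r}")


ns = get_namespace(myarray(), np.arange(3))
print(ns is np)  # True
```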
The answer can be that this is not our problem and users have to coerce manually. In the NumPy world, we could actually just write `get_namespace` as:
```python
@array_function_dispatched(lambda *args: args)
def get_namespace(*args):
    return numpy
```
(and assume others will return their namespace, since in practice everyone has it de-facto)
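A hand-rolled sketch of what that dispatch would do may help: the decorated function is the NumPy-side default, and any argument defining `__array_function__` (NEP 18) gets a chance to take over and return its own namespace. Note `array_function_dispatched` as written above is a name from this post; NumPy's actual decorator is `array_function_dispatch`, and the mimic below is a simplified assumption, not the real machinery:

```python
import numpy as np


def dispatched_get_namespace(*args):
    # Simplified mimic of NEP 18 dispatch for one function. Plain
    # ndarrays are treated as the default implementation, so we only
    # dispatch to "foreign" types that define __array_function__.
    relevant = tuple(
        type(a) for a in args
        if hasattr(type(a), "__array_function__") and type(a) is not np.ndarray
    )
    for a in args:
        if type(a) in relevant:
            result = type(a).__array_function__(
                a, dispatched_get_namespace, relevant, args, {}
            )
            if result is not NotImplemented:
                return result
    return np  # default: everything was a plain ndarray


class duckarray:
    def __array_function__(self, func, types, args, kwargs):
        if func is dispatched_get_namespace:
            return "duck-namespace"  # a wrapper returns its own namespace
        return NotImplemented
```

With this, `dispatched_get_namespace(np.arange(3))` returns `numpy`, while `dispatched_get_namespace(duckarray(), np.arange(3))` returns whatever the wrapper's handler chose.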
Doesn't this require n-way multiple dispatch in general? How does array_function_dispatched handle that?
(As a correction from the above, accepting NumPy and maybe CuPy arrays can be interesting also for JAX/torch/...)
It came up again in SciPy now, so going to list some references:
- Warren asks about it here: https://github.com/scipy/scipy/issues/18286#issuecomment-1506015370
- This case study mentions it: https://github.com/data-apis/array-api/issues/403
- Here (and maybe other posts) Stefan asks about it: https://github.com/numpy/numpy/issues/21135#issuecomment-1068491952
- This one also has it: https://autoray.readthedocs.io/en/latest/autoapi/autoray/index.html#autoray.infer_backend_multi (IMO the particular choice there is dubious, or at least cannot be generalized. But the point is that it was likely use-case driven; something that is still lacking here, since we have no adoption as of now.)
- Of course `__array_function__`, NEP 37, ... all provision for it (maybe `like=` should also accept a tuple and be less strict in general, but that would be easy to add).
- (I am sure there are more. I have not found a discussion about the current choice.)
IMO it is a mistake not to provision for this, and there is a reason everyone seems to have some (basic) solution. Yes, I also think that this should be very restrictive/conservative (but that is mainly a documentation thing in most schemes).
Of course, what is actually important is the downstream need/ask, because array-api should adapt to those needs. From my side, I think there is enough evidence. We can say it's not yet enough, but that can clearly change at this point in time.
There are several questions one could ask here:
- Is it important for adoption/usability?
- Should mixing array libraries, some kind of hierarchy or dispatching be supported by the standard itself?
- Or should it be supported in `array_api_compat`?
- Or is it numpy-specific, and should it therefore be defined by `numpy` only?
> Doesn't this require n-way multiple dispatch in general? How does `array_function_dispatched` handle that?
In general, yes. It's pretty well explained in this section of NEP 18. Basically "subclasses before superclasses, and otherwise left to right", and then let each handler decide whether it can deal with foreign array objects or whether it wants to return `NotImplemented`.
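The "subclasses before superclasses, and otherwise left to right" ordering can be sketched in a few lines. This is a simplified illustration of the rule as stated, not NumPy's actual implementation:

```python
def handler_order(arg_types):
    """Return types in NEP 18 dispatch order: left to right, except
    that a subclass is moved ahead of any of its superclasses that
    already appear in the list."""
    ordered = []
    for t in arg_types:
        if t in ordered:
            continue
        # Insert before the first superclass of t already present.
        for i, existing in enumerate(ordered):
            if issubclass(t, existing):
                ordered.insert(i, t)
                break
        else:
            ordered.append(t)
    return ordered


class Base: ...
class Sub(Base): ...

print(handler_order([Base, Sub]))  # [Sub, Base]
print(handler_order([Sub, Base]))  # [Sub, Base]
```

Each type in that order then gets a chance to handle the call; the first one that does not return `NotImplemented` wins.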
I thought we decided that this kind of usage is UB (https://github.com/data-apis/array-api/issues/399) but it might actually be good to at least revisit and reconfirm this. Another use case is pint+dask+cupy+xarray: https://github.com/pydata/xarray/issues/7721