RFC: allow scalars and 0D arrays in `concat`

Open ev-br opened this issue 7 months ago • 11 comments

The following patterns are quite common [1]: x = np.r_[x[0], x, x[-1]] and x = np.r_[0, x]. Neither of these can be directly replaced by xp.concat, because the latter requires that "The arrays must have the same shape, except in the dimension specified by axis." The most common case IME is that x is a 1D array which gets a scalar appended or prepended.

An Array API replacement is something along the lines of

import array_api_extra as xpx

def npr(xp, *arys):
    # convert Python scalars to arrays, then promote 0-D arrays to 1-D
    arys = [xp.asarray(a) for a in arys]
    arys = [xpx.atleast_nd(a, ndim=1, xp=xp) for a in arys]
    return xp.concat(arys)

which requires array_api_extra and is generally a bit clunky. There was at least one case where a scipy change which was missing atleast_1d broke jax.scipy.

Allowing 0D arrays and python scalars in concat would obviate the need for these sorts of helpers.

[1] At least in scipy,

$ git grep "np.r_"  |wc -l
169

ev-br avatar May 18 '25 21:05 ev-br

Why does this function not allow broadcasting? That seems like a more general solution, doesn't it?

vnmabus avatar May 19 '25 07:05 vnmabus

The current guidance originates from NumPy (see https://numpy.org/doc/1.26/reference/generated/numpy.concatenate.html).

> The arrays must have the same shape, except in the dimension corresponding to axis (the first, by default).

kgryte avatar May 19 '25 09:05 kgryte

Yes, numpy's concatenate is limited by what is in this guidance. Numpy, however, has np.r_, which (with all its sins) allows extending arrays with scalars or 0D arrays. And this is what's missing in array API land.

ev-br avatar May 19 '25 10:05 ev-br

I think np.concat is not limited by this guidance, it was always limited (check np.concatenate in older versions). For NumPy, it's hstack that does the right thing here (perhaps by accident).

>>> import numpy as np
>>> x = np.arange(5)
>>> np.hstack((x[0], x, x[-1]))
array([0, 0, 1, 2, 3, 4, 4])

>>> # concat is more fiddly:
>>> np.concat((x[0], x, x[-1]))
...
ValueError: zero-dimensional arrays cannot be concatenated

>>> np.concat((np.expand_dims(x[0], 0), x, x[-1]))
...
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 1 dimension(s) and the array at index 2 has 0 dimension(s)

>>> np.concat((np.expand_dims(x[0], 0), x, np.expand_dims(x[-1], 0)))
array([0, 0, 1, 2, 3, 4, 4])

I'd say just go with the last line here. More verbose, but definitely an improvement over r_ (all the *_ constructors are unreadable).


This led me to rediscover gh-494, may be worth revisiting perhaps.

rgommers avatar May 19 '25 10:05 rgommers

> np.concat is not limited by this guidance, it was always limited (check np.concatenate in older versions)

This is exactly what I am saying: np.concatenate was always limited, and np.r_ was a way around the limitation.

> definitely an improvement over r_ (all the *_ constructors are unreadable).

TBH, I fail to see how xp.concat((xp.expand_dims(x[0], 0), x)) is more readable than np.r_[x[0], x]. And of course it does not work for np.r_[0, x], which needs something like xp.concat((xp.zeros(1, dtype=x.dtype), x)) instead.

> gh-494, may be worth revisiting perhaps.

This is almost xpx.atleast_nd.

ev-br avatar May 19 '25 11:05 ev-br

> TBH, I fail to see how xp.concat((xp.expand_dims(x[0], 0), x)) is more readable

Perhaps write a little helper function for SciPy then?

def concat_1d(*arrays, xp):
    """Like `concat`, except (a) for 1-D only, and (b) also accepts scalars and 0-D arrays"""

Then you can write it as concat_1d(x[0], x, x[-1], xp=xp), which is about as good as it gets until there's a function in the standard that does this.
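For illustration, one way such a helper might be filled in is sketched below. The body is an assumption, not the helper SciPy actually uses: it relies only on `asarray`, `reshape` and `concat`, and falls back to `concatenate` because NumPy releases before 2.0 do not provide `concat`.

```python
import numpy as np


def concat_1d(*arrays, xp):
    """Like `concat`, but 1-D only; also accepts scalars and 0-D arrays.

    Hypothetical sketch of the helper proposed above.
    """
    parts = []
    for a in arrays:
        a = xp.asarray(a)              # Python scalars become 0-D arrays
        if a.ndim == 0:
            a = xp.reshape(a, (1,))    # promote 0-D to a length-1 vector
        elif a.ndim != 1:
            raise ValueError("concat_1d only accepts scalars, 0-D and 1-D inputs")
        parts.append(a)
    # NumPy < 2.0 only has `concatenate`; the standard name is `concat`.
    concat = getattr(xp, "concat", None) or xp.concatenate
    return concat(parts)


x = np.arange(5)
print(concat_1d(x[0], x, x[-1], xp=np))   # [0 0 1 2 3 4 4]
print(concat_1d(0, x, xp=np))             # [0 0 1 2 3 4]
```

Note that `xp.asarray(0)` produces an integer array, so prepending a Python scalar to a floating-point array relies on mixed-kind type promotion, which NumPy performs but the standard leaves unspecified. A stricter variant could pass the dtype of the array inputs to `asarray`.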

rgommers avatar May 19 '25 12:05 rgommers

Exactly. I've a scipy helper, and this issue is to gauge interest/possibility to make it work with xp.concat (or xp.stack) and drop the helper.

ev-br avatar May 19 '25 14:05 ev-br

FWIW, I have no strong opinion either way. Although, if you allow 0-D, the question is why not allow any broadcasting (except along the concatenated dimension), and it may be nice to have a NumPy PR to see what others think. I could also see allowing optional broadcasting.

What has come up in NumPy before (I think Matt Haberland, for example, would have liked to have it) is a broadcast_arrays(*arrs, omit_axis=...). It doesn't quite make this particular use case nice, but it has some overlap (and makes writing the helper clean for N-D inputs).
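To make the idea concrete, here is a rough sketch of what broadcasting with an omitted axis could look like. The function name `broadcast_except_axis` and its behavior are assumptions for illustration only; NumPy has no `omit_axis` parameter on `broadcast_arrays`. The sketch is written against NumPy directly and requires all inputs to have the same number of dimensions.

```python
import numpy as np


def broadcast_except_axis(*arrays, axis):
    """Broadcast inputs against each other along every axis except `axis`.

    Hypothetical helper, not an existing NumPy function.
    """
    arrays = [np.asarray(a) for a in arrays]
    ndim = arrays[0].ndim
    if ndim == 0 or any(a.ndim != ndim for a in arrays):
        raise ValueError("inputs must share the same number of dimensions (>= 1)")
    axis = axis % ndim  # normalize a negative axis
    # Broadcast the shapes with the omitted axis removed.
    reduced = [a.shape[:axis] + a.shape[axis + 1:] for a in arrays]
    common = np.broadcast_shapes(*reduced)
    out = []
    for a in arrays:
        # Re-insert each array's own length along the omitted axis.
        target = common[:axis] + (a.shape[axis],) + common[axis:]
        out.append(np.broadcast_to(a, target))
    return out


a = np.zeros((1, 2))
b = np.ones((3, 4))
a2, b2 = broadcast_except_axis(a, b, axis=1)
print(a2.shape, b2.shape)                       # (3, 2) (3, 4)
print(np.concatenate([a2, b2], axis=1).shape)   # (3, 6)
```

The results are read-only broadcast views, which is enough for passing them straight to a concatenation along the omitted axis.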

seberg avatar May 29 '25 16:05 seberg

Are there any upstream numpy issues about adding 0-D or broadcasting support to np.concatenate?

asmeurer avatar May 29 '25 17:05 asmeurer

https://github.com/numpy/numpy/issues/28549 but I am not aware of old discussions (which doesn't mean they don't exist).

seberg avatar May 29 '25 17:05 seberg

The stackoverflow references on that issue certainly seem to imply that full broadcasting would be useful, not just support for 0-D concatenation.

asmeurer avatar May 29 '25 17:05 asmeurer