scipy icon indicating copy to clipboard operation
scipy copied to clipboard

Feature-Request: axis-wise interpolation of 2D arrays

Open lzkelley opened this issue 9 years ago • 9 comments

I'm often in a situation when I have a large set of 1D arrays which I'd like to interpolate (see for example this post). The best (only?) solution I know of is to loop over each array and interpolate each independently. Would it be feasible to extend a simple interpolation method (e.g. scipy.interp --> numpy.interp) to work of 2D arrays accepting an axis argument?

lzkelley avatar May 24 '16 13:05 lzkelley

Will interp1d do what you want, or are your arrays not all sampled the same way (as in the cited example)?

larsoner avatar May 24 '16 14:05 larsoner

Unfortunately no, they're sampled completely differently. (That's my post on stackexchange; should have said that).

lzkelley avatar May 24 '16 14:05 lzkelley

I'm not sure if making a scipy function would save much overhead. It seems like whatever was done under the hood, even if it's in Cython, will have to dig into Python objects (or a 1D numpy array of Python objects), which is where your overhead already comes in.

Can you tell where the bottleneck is?

larsoner avatar May 24 '16 14:05 larsoner

I'm not sure where it is (or how to check?). If it's just a linear interpolation, however, isn't there a linear algebra approach that might generalize?

lzkelley avatar May 24 '16 17:05 lzkelley

I believe that a linear interpolation for multidimensional x (i.e different x series for different y series) is not difficult to implement on scipy's side.

There is a method interp1d._call_linear, some chances that it will even work for multidimensional x without any modification.

nmayorov avatar May 25 '16 09:05 nmayorov

Interesting! Thanks @nmayorov, I'll try playing around with that a bit

lzkelley avatar May 25 '16 13:05 lzkelley

@lzkelley Have you had any luck with this? I have a 3-d array (time x latitude x longitude , 355x195x192) that I am currently looping through all latitude/longitude indices and computing interpolations along the time axis. Unfortunately this process is taking about 2 hours since my data is pretty large. Im looking for a way to avoid looping through this all. NOTE: Each lat/lon point is sampled differently. You can view my scenario more in depth here: http://stackoverflow.com/questions/43183718/interpolating-a-3d-array-in-python-expanded

ajoros avatar Apr 05 '17 00:04 ajoros

I think I just ran into this same problem and posted a question on stack overflow before finding this issue.

It boils down to doing something like this as fast as possible:

np.array([np.interp(x, X[i], Y[i]) for i in range(len(X))])

I agree that this would be a very useful enhancement to scipy!

ahwillia avatar Apr 08 '18 19:04 ahwillia

As Nikolay said above and linked SO answers demonstrate, this is doable in specific cases with not too much difficulty using numpy broadcasting. What are the memory/speed tradeoffs needs to be measured I guess. In principle, we may be open to considering a specialized implementation of interp1d._call_linear with x.ndim > 1 to do an equivalent of https://github.com/scipy/scipy/issues/6187#issuecomment-379577212 --- if someone is willing to champion it. I rather strongly suspect it's easier to do on a user side, but can be convinced otherwise (by a PR adding this to interp1d and showing a decent speed-up).

Backwards compat will need to be carefully preserved, and experience shows that touching anything in interp1d by anything shorter than a barge pole does break something somewhere for somebody.

ev-br avatar Aug 09 '22 22:08 ev-br