Feature-Request: axis-wise interpolation of 2D arrays
I'm often in a situation where I have a large set of 1D arrays which I'd like to interpolate (see for example this post). The best (only?) solution I know of is to loop over each array and interpolate each independently. Would it be feasible to extend a simple interpolation method (e.g. scipy.interp --> numpy.interp) to work on 2D arrays, accepting an axis argument?
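For concreteness, the workaround currently looks something like the sketch below (array names are illustrative only):

```python
import numpy as np

# Each row of ys is sampled on its own grid xs[i], so every row currently has
# to be interpolated independently in a Python-level loop.
xs = np.sort(np.random.rand(1000, 50), axis=1)   # 1000 different sample grids
ys = np.cos(xs)                                  # values on those grids
x_new = np.linspace(0.1, 0.9, 25)                # common target grid

y_new = np.array([np.interp(x_new, x, y) for x, y in zip(xs, ys)])  # (1000, 25)
```

The hoped-for extension would replace that loop with a single call taking an axis argument.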
Will interp1d do what you want, or are your arrays not all sampled the same way (as in the cited example)?
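(For context: interp1d already accepts an axis argument for multi-dimensional y when every series shares the same x grid, roughly like the illustrative snippet below.)

```python
import numpy as np
from scipy.interpolate import interp1d

x = np.linspace(0, 10, 50)                         # one grid shared by all series
y = np.sin(x)[None, :] * np.arange(1, 4)[:, None]  # three series, shape (3, 50)

f = interp1d(x, y, axis=1)                         # interpolate along the sample axis
y_new = f(np.linspace(0.5, 9.5, 20))               # shape (3, 20)
```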
Unfortunately no, they're sampled completely differently. (That's my post on Stack Exchange; I should have said so.)
I'm not sure that making a scipy function would save much overhead. It seems like whatever is done under the hood, even if it's in Cython, will have to dig into Python objects (or a 1D numpy array of Python objects), which is where the overhead comes from in the first place.
Can you tell where the bottleneck is?
I'm not sure where it is (or how to check?). If it's just a linear interpolation, however, isn't there a linear algebra approach that might generalize?
I believe that linear interpolation for multidimensional x (i.e. a different x series for each y series) would not be difficult to implement on scipy's side.
There is a method interp1d._call_linear; there is some chance it will even work for multidimensional x without any modification.
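To sketch what such an implementation might look like (an illustration only, not scipy's actual internals; the function and variable names are made up), the bracketing search can be done per row with np.searchsorted while the arithmetic stays fully vectorized:

```python
import numpy as np

def interp_rows(x_new, xs, ys):
    """Linear interpolation of each row of ys (sampled at xs) onto x_new.

    xs, ys : (n, m) arrays, each row of xs sorted ascending
    x_new  : (k,) query grid shared by all rows
    returns: (n, k) array, matching a row-by-row np.interp loop
    """
    n, m = xs.shape
    # Clamp queries to each row's range so out-of-range points take the
    # endpoint values, as np.interp does.
    xq = np.clip(x_new, xs[:, :1], xs[:, -1:])                     # (n, k)
    # np.searchsorted only accepts 1-D sorted arrays, so locate the bracketing
    # interval row by row; only this index search is a Python loop.
    idx = np.stack([np.searchsorted(xs[i], xq[i]) for i in range(n)])
    idx = np.clip(idx, 1, m - 1)
    rows = np.arange(n)[:, None]
    x0, x1 = xs[rows, idx - 1], xs[rows, idx]
    y0, y1 = ys[rows, idx - 1], ys[rows, idx]
    return y0 + (y1 - y0) * (xq - x0) / (x1 - x0)
```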
Interesting! Thanks @nmayorov, I'll try playing around with that a bit
@lzkelley Have you had any luck with this? I have a 3-D array (time x latitude x longitude, 355x195x192), and I am currently looping through all latitude/longitude indices and computing interpolations along the time axis. Unfortunately this process takes about 2 hours since my data is pretty large. I'm looking for a way to avoid looping through all of this. NOTE: each lat/lon point is sampled differently. You can view my scenario in more depth here: http://stackoverflow.com/questions/43183718/interpolating-a-3d-array-in-python-expanded
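In case it helps anyone with the same layout: flattening the spatial axes turns this into the same row-wise problem as in the original post (it doesn't remove the loop, only the index bookkeeping). The shapes below mirror the ones quoted above; the variable names are made up:

```python
import numpy as np

# Hypothetical data with the shapes above: a separate time sampling per cell.
values = np.random.rand(355, 195, 192)                  # (time, lat, lon)
times = np.sort(np.random.rand(355, 195, 192), axis=0)  # per-cell time axes
t_new = np.linspace(0.1, 0.9, 100)                      # common target times

v2d = values.reshape(355, -1).T      # (195*192, 355): one row per grid cell
t2d = times.reshape(355, -1).T
out = np.array([np.interp(t_new, t2d[i], v2d[i]) for i in range(len(t2d))])
out = out.T.reshape(-1, 195, 192)    # back to (time_new, lat, lon)
```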
I think I just ran into this same problem and posted a question on stack overflow before finding this issue.
It boils down to doing something like this as fast as possible:
np.array([np.interp(x, X[i], Y[i]) for i in range(len(X))])
I agree that this would be a very useful enhancement to scipy!
As Nikolay said above and the linked SO answers demonstrate, this is doable in specific cases without too much difficulty using numpy broadcasting. What the memory/speed tradeoffs are would need to be measured, I guess. In principle, we may be open to considering a specialized implementation of interp1d._call_linear with x.ndim > 1 to do an equivalent of https://github.com/scipy/scipy/issues/6187#issuecomment-379577212 --- if someone is willing to champion it. I rather strongly suspect it's easier to do on the user side, but I can be convinced otherwise (by a PR adding this to interp1d and showing a decent speed-up).
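To make the broadcasting variant concrete, here is a minimal sketch assuming each row of x is sorted and all rows share one query grid (names are illustrative, not proposed API). Relative to a per-row searchsorted loop, the only change is how the bracketing indices are found; the (n, k, m) comparison array it builds is exactly the memory cost that would need measuring:

```python
import numpy as np

def interp_broadcast(x_new, xs, ys):
    """Broadcasted equivalent of looping np.interp over the rows of xs/ys."""
    # Clamp queries to each row's range so outside points take endpoint values.
    xq = np.clip(x_new[None, :], xs[:, :1], xs[:, -1:])             # (n, k)
    # Count sample points at or below each query to get the right-hand
    # bracketing index; this materializes an (n, k, m) boolean array.
    idx = np.clip((xs[:, None, :] <= xq[:, :, None]).sum(-1), 1, xs.shape[1] - 1)
    rows = np.arange(xs.shape[0])[:, None]
    x0, x1 = xs[rows, idx - 1], xs[rows, idx]
    y0, y1 = ys[rows, idx - 1], ys[rows, idx]
    return y0 + (y1 - y0) * (xq - x0) / (x1 - x0)
```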
Backwards compatibility will need to be carefully preserved, and experience shows that touching anything in interp1d with anything shorter than a barge pole breaks something somewhere for somebody.