Implement `interp`
Feature functionality: https://numpy.org/doc/stable/reference/generated/numpy.interp.html
Additional context: Climate use case, @kleinert-f
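For reference, a minimal example of the target behaviour (taken from the `numpy.interp` docs), including the constant extrapolation outside the sample range:

```python
import numpy as np

# Monotonically increasing sample points and their values.
xp = np.array([1.0, 2.0, 3.0])
fp = np.array([3.0, 2.0, 0.0])

print(np.interp(2.5, xp, fp))              # 1.0
print(np.interp([0.0, 1.5, 4.0], xp, fp))  # [3.  2.5 0. ] (clamped at both ends)
```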
Still open and of particular interest since it is related to a specific use case. A connection to #1183 is also possible.
Reviewed within #1109
Hey @ClaudiaComito and @mrfh92, for this should we use `np.interp` or create a custom implementation of `interp`?
@ClaudiaComito May I work on this issue😋? Just to confirm, is this file where the function needs to be implemented?
@samadpls Sorry for the late answer, I was on vacation.
I think it would be best to add the function to the file `heat/heat/core/arithmetics.py`.
Regarding your first question: the function `ht.interp` should exhibit the same behaviour (and API) as `numpy.interp`. However, since local arrays in Heat are PyTorch tensors and we need to cover the (memory-)distributed case, we cannot make direct use of `numpy.interp` and need to write our own custom implementation; in particular, `numpy.interp` could not be used on GPUs (a highly important case for Heat) at all. To make our implementation as fast as possible, one should use high-level routines from PyTorch wherever possible (at first glance, I guess `torch.lerp` could be interesting), because these implementations are highly optimized in the single-process context. In a very condensed way: we want to "glue together" highly optimized single-process PyTorch routines using mpi4py (wrapped in `heat.communication`) to obtain a (memory-)distributed routine for the overall DNDarray.
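To make the `torch.lerp` idea concrete, here is a minimal single-process sketch (not an official Heat implementation; the helper name `_interp_local` is made up) that combines `torch.searchsorted` and `torch.lerp` to reproduce `numpy.interp`'s behaviour on one local tensor:

```python
import torch

def _interp_local(x, xp, fp):
    """Per-process linear interpolation on local torch tensors.

    Assumes xp is sorted in ascending order and all tensors are floating point.
    """
    # Right-neighbour index for every query point, clamped so that
    # out-of-range queries fall into the first/last interval.
    idx = torch.searchsorted(xp, x).clamp(1, xp.numel() - 1)
    x0, x1 = xp[idx - 1], xp[idx]
    f0, f1 = fp[idx - 1], fp[idx]
    # Clamping x to [xp[0], xp[-1]] reproduces NumPy's edge behaviour
    # (constant fp[0] / fp[-1] outside the sample range).
    w = (x.clamp(xp[0], xp[-1]) - x0) / (x1 - x0)
    # torch.lerp computes f0 + w * (f1 - f0) with an optimized kernel.
    return torch.lerp(f0, f1, w)

xp = torch.tensor([1.0, 2.0, 3.0])
fp = torch.tensor([3.0, 2.0, 0.0])
print(_interp_local(torch.tensor([0.0, 2.5, 4.0]), xp, fp))  # tensor([3., 1., 0.])
```

In the distributed setting this kernel could run independently on each process as long as `xp`/`fp` are replicated everywhere and only `x` is split; otherwise the sample points would first have to be gathered via `heat.communication`.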
Don't hesitate to ask if you need more info or help :)
Branch features/832-Implement_interp created!
Hey @mrfh92, while looking for possible implementations, I came across these implementations in the PyTorch community, which could provide valuable insights into the interpolation process and strategies: PyTorch Interpolation Implementation-1 & PyTorch Interpolation Implementation-2. This approach might allow us to take advantage of PyTorch's optimized operations while addressing the distributed nature of our data; a rough sketch of how it could plug into Heat follows below.
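If that route is taken, the Heat-level wrapper could stay very thin. The following is a rough, untested sketch only; it reuses the hypothetical `_interp_local` kernel from above and assumes `xp`/`fp` are not split, so each rank interpolates its slice of `x` without communication:

```python
import heat as ht

def interp(x, xp, fp):
    # Assumption: xp and fp are replicated (split=None) on every process,
    # so no inter-process communication is needed; each rank only handles
    # its local slice of x.
    local = _interp_local(x.larray, xp.larray, fp.larray)
    # Re-wrap the local result; it inherits x's distribution.
    return ht.array(local, is_split=x.split, device=x.device)
```

The genuinely hard part would be queries that need neighbouring sample points held by other ranks (i.e. a split `xp`); that is where the mpi4py glue @mrfh92 mentioned would come in.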