
Adding `.pipe` to `SpatialData`

Open srivarra opened this issue 1 year ago • 4 comments

Is your feature request related to a problem? Please describe.

It'd be elegant to be able to chain functions on a SpatialData object. Currently, given some functions f, g, and h:

def f(sdata: sd.SpatialData) -> sd.SpatialData: ...
def g(sdata: sd.SpatialData, arg1: Any, arg2: Any) -> sd.SpatialData: ...
def h(arg3: Any, sdata: sd.SpatialData) -> sd.SpatialData: ...

We would have to do the following:

sdata_h = h(arg3=c, sdata=g(f(sdata), arg1=a, arg2=b))

# or

sdata_f = f(sdata)
sdata_g = g(sdata_f, arg1=a, arg2=b)
sdata_h = h(arg3=c, sdata=sdata_g)

Describe the solution you'd like

Pandas and Xarray have pipe methods for DataFrames, DataArrays and Datasets. Following their examples, pipe could be used here like so:

sdata.pipe(f).pipe(g, arg1=a, arg2=b).pipe((h, "sdata"), arg3=c)
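To make the equivalence with the nested call concrete, here is a minimal sketch that uses a stand-in class rather than the real SpatialData. `ToyData`, its `log` field, and the bodies of `f`, `g`, and `h` are hypothetical and exist only for illustration; the `pipe` calling convention, including the `(callable, keyword_name)` tuple form, is the one pandas and Xarray use.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class ToyData:
    """Hypothetical stand-in for SpatialData; it only records which steps ran."""

    log: tuple[str, ...] = ()

    def pipe(
        self,
        func: Callable[..., Any] | tuple[Callable[..., Any], str],
        *args: Any,
        **kwargs: Any,
    ) -> Any:
        # Tuple form: (callable, name of the keyword that should receive self).
        if isinstance(func, tuple):
            func, target = func
            if target in kwargs:
                raise ValueError(f"{target} is both the pipe target and a keyword argument")
            kwargs[target] = self
            return func(*args, **kwargs)
        # Plain callable: self goes in as the first positional argument.
        return func(self, *args, **kwargs)


def f(sdata: ToyData) -> ToyData:
    return ToyData(sdata.log + ("f",))


def g(sdata: ToyData, arg1: Any, arg2: Any) -> ToyData:
    return ToyData(sdata.log + (f"g({arg1}, {arg2})",))


def h(arg3: Any, sdata: ToyData) -> ToyData:
    # The data object is not the first parameter, so pipe needs the tuple form.
    return ToyData(sdata.log + (f"h({arg3})",))


a, b, c = 1, 2, 3
start = ToyData()

nested = h(arg3=c, sdata=g(f(start), arg1=a, arg2=b))
piped = start.pipe(f).pipe(g, arg1=a, arg2=b).pipe((h, "sdata"), arg3=c)
```

Both spellings run the same three steps in the same order; the piped version simply reads left to right.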

Describe alternatives you've considered

If a user has their own custom SpatialData accessors for f, g, and h (where the first argument of h is the SpatialData object, i.e. self, in this case), then pipe should work just the same, but wrapping each accessor call in a lambda function makes it rather wordy.


sdata = (
    sdata.pipe(lambda s: s.my_accessor.f())
    .pipe(lambda s: s.my_accessor.g(arg1=a, arg2=b))
    .pipe(lambda s: s.my_accessor.h(arg3=c))
)

Just chaining the accessor is much easier to read in this instance.

sdata.my_accessor.f().my_accessor.g(arg1=a, arg2=b).my_accessor.h(arg3=c)

For accessors, piping would be more useful where higher-order functions are composed of calls to the accessor's methods:

def f(sdata: sd.SpatialData, arg1, arg2) -> sd.SpatialData:
    intermediate_sdata = sdata.my_accessor.h(arg1).my_accessor.g(arg2)
    something_has_been_done = do_something_else(intermediate_sdata)
    return something_has_been_done

def i(sdata: sd.SpatialData, arg3) -> sd.SpatialData:
    intermediate_sdata = sdata.my_accessor.h(arg3)
    something_has_been_done2 = do_something_else2(intermediate_sdata)
    return something_has_been_done2

modified_sdata = sdata.pipe(f, arg1=a, arg2=b).pipe(i, arg3=c)

Additional context

Implementation: The following has been taken from https://github.com/pydata/xarray/blob/d33e4ad9407591cc7287973b0f8da47cae396004/xarray/core/common.py#L717-L847

from typing import Any, Callable, ParamSpec, TypeVar

P = ParamSpec("P")
T = TypeVar("T")

class SpatialData:
    ...
    def pipe(self, func: Callable[P, T] | tuple[Callable[P, T], str], *args: P.args, **kwargs: P.kwargs) -> Any:
        # Tuple form: (callable, name of the keyword argument that should receive self).
        if isinstance(func, tuple):
            func, target = func
            if target in kwargs:
                raise ValueError(f"{target} is both the pipe target and a keyword argument")
            kwargs[target] = self
            return func(*args, **kwargs)
        # Plain callable: self is passed as the first positional argument.
        return func(self, *args, **kwargs)

These pipes can return anything, so users would have to keep that in mind if they plan on chaining multiple calls to pipe.
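A small sketch of that caveat, under stated assumptions: `ToyData` is a hypothetical stand-in for SpatialData with a simplified `pipe` (plain-callable form only), and `drop_empty` and `count_elements` are invented functions. Once a step returns something that has no `pipe` method, the chain has to end there.

```python
from typing import Any, Callable, Iterable, List, Optional


class ToyData:
    """Hypothetical stand-in for SpatialData holding a list of element names."""

    def __init__(self, elements: Iterable[str]) -> None:
        self.elements: List[str] = list(elements)

    def pipe(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        # Simplified for illustration: no (callable, keyword) tuple form.
        return func(self, *args, **kwargs)


def drop_empty(sdata: ToyData) -> ToyData:
    # Returns a ToyData, so another .pipe call can follow.
    return ToyData(e for e in sdata.elements if e)


def count_elements(sdata: ToyData) -> int:
    # Returns a plain int, so this step must be the last link in a chain.
    return len(sdata.elements)


sdata = ToyData(["image", "", "labels"])

# Fine: the non-ToyData result comes from the final step.
n = sdata.pipe(drop_empty).pipe(count_elements)

# Not fine: int has no .pipe method, so the second call fails.
chain_error: Optional[Exception] = None
try:
    sdata.pipe(count_elements).pipe(drop_empty)
except AttributeError as err:
    chain_error = err
```

Annotating each piped function's return type, as the signatures of f, g, and h do, makes it easier to see where a chain can safely continue.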


srivarra avatar Aug 30 '24 00:08 srivarra

Hi @srivarra, this sounds like a very interesting feature! What could be a use case for this at the moment?

giovp avatar Sep 02 '24 17:09 giovp

@giovp Currently, this would be useful for some of the pipelines I've created with SpatialData objects in my analysis project. Multiple functions are called sequentially on the same object. It's not a super important feature or use case, it's more of a convenience utility.

srivarra avatar Sep 04 '24 17:09 srivarra

Sounds very interesting @srivarra , I personally don't have a lot of capacity for new features atm, but if you feel like submitting a PR, would be very happy to support!

giovp avatar Sep 04 '24 17:09 giovp

@srivarra Would you be willing to help implement this? If so we could schedule a meeting and work on it together if you would like.

melonora avatar Sep 23 '24 16:09 melonora

@melonora Yeah I'd be willing to implement this, was on vacation for a while, but I'll add a PR soonish.

srivarra avatar Sep 30 '24 16:09 srivarra

Thanks @srivarra, the feature looks very interesting! Happy to review the PR 😊

LucaMarconato avatar Sep 30 '24 18:09 LucaMarconato