scikit-image
Pipelines in scikit-image?
Would it be interesting to have a tool for building pipelines in scikit-image that would allow chaining filters? scikit-learn has a pipeline tool for chaining estimators (https://scikit-learn.org/stable/modules/compose.html).
The principal advantage would be simpler code for users, for example when only a few parameters of the underlying functions are exposed and the others are fixed. Another advantage concerns meta-functions such as our apply_parallel: you could call it on the pipelined function instead of on each filter (and hence have lazy evaluation of intermediate filters, I think?). Caching intermediate results with memoization would be possible too, for faster execution when changing parameters.
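To make the idea concrete, here is a minimal sketch of a chained pipeline that memoizes intermediate results, so that changing a late parameter does not recompute earlier steps. Nothing here is scikit-image API: `smooth`, `threshold`, and `run_pipeline` are hypothetical stand-ins working on plain tuples, and the cache key scheme is an assumption chosen for illustration.

```python
def smooth(values, width=3):
    """Hypothetical stand-in for a smoothing filter: centered moving average."""
    half = width // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return tuple(out)


def threshold(values, level=0.5):
    """Hypothetical stand-in for a thresholding step: 1 where value > level."""
    return tuple(1 if v > level else 0 for v in values)


def run_pipeline(data, steps, cache):
    """Apply each (function, kwargs) step in order, reusing cached results.

    The cache key is the input plus every step (name and parameters) applied
    so far, so two runs that share a prefix of steps share those results.
    The input must be hashable, hence the tuples used here.
    """
    key = (data,)
    result = data
    for func, kwargs in steps:
        key += ((func.__name__, tuple(sorted(kwargs.items()))),)
        if key not in cache:
            cache[key] = func(result, **kwargs)
        result = cache[key]
    return result


data = (0.0, 0.2, 0.9, 1.0, 0.1)
cache = {}

# First run computes both steps and stores two cache entries.
first = run_pipeline(data, [(smooth, {"width": 3}), (threshold, {"level": 0.5})], cache)
entries_after_first = len(cache)

# Second run changes only the threshold; the smoothing result is reused,
# so only one new cache entry is added.
second = run_pipeline(data, [(smooth, {"width": 3}), (threshold, {"level": 0.3})], cache)
entries_after_second = len(cache)
```

With real images the arrays are not hashable, so a practical version would need a different key (for example a content hash) or a disk-backed cache; that part is left open here.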
The disadvantage is that we should always be wary of adding new code, and I would like to be sure whether having a pipeline mechanism solves a real problem for users, or not.
Anyway, I don't have clear thoughts about this yet, but I'm opening this issue as a placeholder for discussion.
Do you think something like dask.delayed could fulfill a similar function? That way, a pipeline can be defined as a Python function, which should be a bit more readable. Maybe you then lose some of the features you mentioned, I'm not sure?
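As a rough sketch of that suggestion, the pipeline below is an ordinary Python function whose steps are wrapped with `dask.delayed`, so calling it only builds a task graph and the work happens on `.compute()`. The filters `smooth` and `threshold` and the function `segment` are hypothetical stand-ins operating on plain lists, not scikit-image functions; the example assumes dask is installed.

```python
import dask


@dask.delayed
def smooth(values, width=3):
    """Hypothetical stand-in for a smoothing filter: centered moving average."""
    half = width // 2
    return [
        sum(values[max(0, i - half): i + half + 1])
        / len(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


@dask.delayed
def threshold(values, level=0.5):
    """Hypothetical stand-in for a thresholding step: 1 where value > level."""
    return [1 if v > level else 0 for v in values]


def segment(values, width=3, level=0.5):
    """The pipeline, written as a plain function that chains delayed calls."""
    return threshold(smooth(values, width=width), level=level)


lazy = segment([0.0, 0.2, 0.9, 1.0, 0.1])  # builds the graph, runs nothing yet
result = lazy.compute()                    # executes smooth, then threshold
```

Only the parameters that `segment` exposes are visible to the caller, which covers the "few exposed parameters" use case without a dedicated pipeline class.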
Thank you @stefanv, dask.delayed might be worth exploring here, yes. I'm not sure if memoization would be possible though, I'll think about it.
I think it does have some support for memoization, something like .persist iirc. @jakirkham might be able to chime in?
Is there still interest in this? I am looking for composable solutions for creating ground-truth segmentations in some cellular imaging data, and scikit-image is where I'm starting. Pipelines like those used in sklearn were my first thought, which is how I ended up here.
For now, I think the best way to chain operations is to use Python.
Unless you have very well-defined APIs for the functions in the pipeline (e.g. sklearn classifiers), a pipeline mechanism doesn't save you that much typing. But I'd be happy if someone could point out advantages I missed!
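For comparison, here is a small sketch of both styles in plain Python: a hand-written function, and a generic `compose` helper built from `functools.reduce` and `functools.partial`. The step functions `scale` and `clip` and the helper `compose` are hypothetical names used only for illustration; in practice the steps could be scikit-image filters with their parameters fixed via `partial`.

```python
from functools import partial, reduce


def scale(values, factor):
    """Hypothetical step: multiply every value by a constant."""
    return [v * factor for v in values]


def clip(values, low, high):
    """Hypothetical step: clamp every value into [low, high]."""
    return [min(max(v, low), high) for v in values]


# Style 1: the pipeline is just a function.
def preprocess(values):
    scaled = scale(values, factor=2.0)
    return clip(scaled, low=0.0, high=1.0)


# Style 2: a tiny generic helper that chains single-argument callables.
def compose(*funcs):
    """Return a function applying funcs left to right to its input."""
    return lambda data: reduce(lambda acc, f: f(acc), funcs, data)


preprocess_pipeline = compose(
    partial(scale, factor=2.0),
    partial(clip, low=0.0, high=1.0),
)

data = [-0.5, 0.2, 0.8]
direct = preprocess(data)
chained = preprocess_pipeline(data)
```

Both versions give the same result and are about the same length, which is the point made above: without a uniform step API, the helper mostly rearranges the typing rather than removing it.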