
Performance testing `find_objects`

Open GenevieveBuckley opened this issue 3 years ago • 2 comments

The find_objects functionality is still quite new, and it would be good to get some performance testing done.

Some previous discussion is here: https://github.com/dask/dask-image/pull/240#discussion_r675591009

Second, I think it's better to avoid using the scipy.ndimage.find_objects function directly. If you have an image chunk with just one object with a really high integer label n, the scipy find_objects result will return n - 1 values of None, and then the single meaningful result. That seems bad for parallelized applications, so I think looping through only the unique integer values present in a given image chunk is a better way to go.
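To make the concern concrete, here is a minimal sketch using only NumPy and SciPy. The toy array `chunk` and the helper `find_objects_present` are hypothetical names made up for illustration; they are not part of dask-image, and the loop is only one possible reading of "looping through only the unique integer values present".

```python
import numpy as np
from scipy import ndimage

# A toy chunk holding a single object with a high integer label.
chunk = np.zeros((6, 6), dtype=np.int32)
chunk[2:4, 1:5] = 1000

# scipy returns one entry per integer from 1 up to chunk.max(),
# with None for every label that does not occur in the chunk.
slices = ndimage.find_objects(chunk)
n_none = sum(s is None for s in slices)
print(len(slices), n_none)


def find_objects_present(chunk):
    """Hypothetical helper: visit only the labels that occur in the chunk."""
    result = {}
    for label in np.unique(chunk):
        if label == 0:  # 0 is treated as background
            continue
        # A 0/1 mask has a single label, so scipy returns exactly one entry.
        mask = (chunk == label).astype(np.int32)
        (obj_slices,) = ndimage.find_objects(mask)
        result[int(label)] = obj_slices
    return result


present = find_objects_present(chunk)
```

The loop makes one pass over the chunk per label, so it trades the long list of None placeholders for repeated scans; which cost dominates is exactly what the performance testing would need to show.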

I've seen that scipy's find_objects uses a C implementation for speed and of course it'd be nice to avoid parallel implementations. How about calling the scipy function on a relabelled array to circumvent the problem you mention?

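import numpy as np

# Assumes x is a non-negative integer label array containing at least one nonzero label.
unique_vals = np.unique(x[x > 0])  # labels present in x, background 0 excluded
relabel_ar = np.zeros(unique_vals.max() + 1, dtype=int)  # lookup table: old label -> new label
relabel_dict = dict()  # new label -> old label, for inverting relabelling afterwards
for il, l in enumerate(unique_vals, start=1):  # start at 1 so 0 stays background
    relabel_dict[il] = int(l)
    relabel_ar[l] = il
x_relabelled = relabel_ar[x]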

See also https://scikit-image.org/docs/dev/api/skimage.segmentation.html#relabel-sequential

And the reply:

We could potentially do that, as long as we kept track of the mapping between the old and new label integers.
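As a rough sketch of what keeping track of the mapping could look like, the function below relabels a chunk to consecutive integers, calls scipy once, and then keys the results by the original labels. The name `find_objects_via_relabel` is hypothetical, and the code assumes non-negative integer labels with 0 as background; it is not the dask-image implementation.

```python
import numpy as np
from scipy import ndimage


def find_objects_via_relabel(x):
    """Hypothetical helper: relabel to 1..k, call scipy, map back to old labels."""
    old_labels = np.unique(x)
    old_labels = old_labels[old_labels != 0]  # drop background

    # Position of each pixel's label in the sorted label list, shifted to start at 1.
    compact = np.searchsorted(old_labels, x) + 1
    compact[x == 0] = 0  # restore background

    # One entry per compact label, so no None placeholders for missing labels.
    slices = ndimage.find_objects(compact)

    # Entry i of slices belongs to compact label i + 1, i.e. old_labels[i].
    return {int(old): slices[i] for i, old in enumerate(old_labels)}


x = np.zeros((6, 6), dtype=np.int64)
x[0:2, 0:3] = 7
x[3:6, 2:4] = 1000
result = find_objects_via_relabel(x)
```

Using `np.searchsorted` avoids allocating a lookup table as large as the highest label, at the cost of a sort in `np.unique`.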

Whether it's faster & worth it would depend on results from some performance testing. I'm inclined to get an implementation in, and then tinker with speed improvements (and anyone who'd like to jump in and try stuff is more than welcome!)
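One possible starting point for such a test is to time scipy on two chunks that differ only in the label value, which isolates the overhead of a sparse, high label. This is a sketch rather than a proper benchmark: the chunk shape, label values, and repeat counts are arbitrary assumptions, and `make_chunk` is a hypothetical helper.

```python
import timeit

import numpy as np
from scipy import ndimage


def make_chunk(label, shape=(512, 512)):
    """Hypothetical helper: one rectangular object carrying the given label."""
    chunk = np.zeros(shape, dtype=np.int64)
    chunk[100:200, 150:300] = label
    return chunk


sparse = make_chunk(1_000_000)  # single object, very high label
compact = make_chunk(1)         # same object, label 1

# Best of three repeats, three calls each, to reduce timing noise.
t_sparse = min(timeit.repeat(lambda: ndimage.find_objects(sparse), number=3, repeat=3))
t_compact = min(timeit.repeat(lambda: ndimage.find_objects(compact), number=3, repeat=3))

print(f"label 1_000_000: {t_sparse:.4f} s, label 1: {t_compact:.4f} s")
```

A fuller comparison would also time the relabelling step itself, since that cost has to be paid before the cheaper scipy call.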

GenevieveBuckley, Dec 17 '21 08:12

I've also said it would be good to trial this on a dataset of a big-ish size. Most development was done with small, toy datasets.

https://github.com/dask/dask-image/pull/240#issuecomment-899160727

I'm hoping https://github.com/dask/dask/issues/7851 isn't going to be a problem here (might not be, but it's a good idea to try this on something of a decent size)

GenevieveBuckley, Dec 17 '21 08:12