dask-image

examples?

Open ebo opened this issue 6 years ago • 11 comments

I'm trying to work through the functionality. Do you have any examples available outside of the test harness?

ebo avatar Jan 10 '19 19:01 ebo

Thanks @ebo. Sorry for being somewhat slow here.

We make use of these in our work, but I suspect that probably won't be helpful as an example. What sort of examples are you looking for?

I should add that I'm meeting up with a colleague to do some work on dask-image. One of the things we had discussed was improving the docs, so this would be very much in scope for that work. If there are specific things you would like to see, please let us know. ;)

cc @jni

jakirkham avatar Jan 20 '19 00:01 jakirkham

Yep! Excited to work on this. We start on the 31st so get your ideas in before then! =)

jni avatar Jan 20 '19 01:01 jni

You start what on the 31st? Think nothing of the delay.

I've been looking at image labeling/segmentation, and the cluster measurements. Having some simple examples which show how this is done would save some time. I have started from the unit tests.

As for docs and examples, I will think about this, but I think some basic examples showing the implemented functionality would be nice.

Anyway, I am working on porting some image segmentation algorithms. I started by porting several of the algorithms to numba, to wrap my head around some of the previously implemented ones. The next step is to port them to dask.

ebo avatar Jan 20 '19 01:01 ebo

We are doing a sprint on dask-image starting on the 31st.

Cool, would be curious to hear your thoughts on labeling. There has been some discussion in https://github.com/dask/dask-image/issues/29.

Phrased a different way, are there certain kinds of problems that you were looking to solve when you looked at dask-image (e.g. how do I perform Gaussian smoothing over a large array)?

jakirkham avatar Jan 20 '19 04:01 jakirkham

The big stuff is all shuttered with the government shutdown and the changes in the contracts, so I have no idea what I can participate in between now and the end of the sprint. Also, the work I was trying to do 6 months ago is almost entirely unfunded now. I do have a very small amount of work (1/10th FTE) carrying me through the end of 2020, but that is not enough to let me work on this as part of my job.
I will only be able to volunteer a little outside of work, due to a LOT of restrictions. That said, I would like to chat with you off-line to see if I can participate in the sprint.

There are something like 200+ segmentation algorithms in the published literature. If there were some way to abstract the operations these algorithms share, such as Gaussian smoothing, labeling, and classification, so that they run over 2.5-gigapixel dask arrays on VMs as small as 3 GB, I could compare dask against hand-written code. It would help to see how to apply arbitrary transforms such as Gaussian smoothing, Haar, Canny, and others. The biggest problem I have had with dask is that it works WONDERFULLY for everything already implemented, but there are not enough docs to figure out how to implement something new that I need.
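One common way to apply a transform that is not already implemented is `dask.array.map_overlap`, which hands each chunk plus a halo of neighboring pixels to an ordinary NumPy function. The sketch below uses a Sobel gradient magnitude from `scipy.ndimage` purely as a stand-in for "some neighborhood transform"; the function name `sobel_magnitude` and the array sizes are illustrative assumptions.

```python
import numpy as np
import dask.array as da
from scipy import ndimage as ndi

def sobel_magnitude(block):
    """Gradient magnitude of one in-memory NumPy block."""
    gy = ndi.sobel(block, axis=0, mode="nearest")
    gx = ndi.sobel(block, axis=1, mode="nearest")
    return np.hypot(gx, gy)

# Seeded random data standing in for a large image (illustrative).
rng = np.random.default_rng(0)
data = rng.random((512, 512))
image = da.from_array(data, chunks=(128, 128))

# depth=1 because the 3 x 3 Sobel kernel needs one pixel from each neighbor;
# boundary="nearest" pads the outer edge of the array the same way the filter does.
edges = image.map_overlap(
    sobel_magnitude, depth=1, boundary="nearest", dtype=data.dtype
)
result = edges.compute()
```

The `depth` has to cover the footprint of the transform. Operations with a larger or unbounded neighborhood (watershed, connected components) do not fit this pattern directly and need extra work to merge results across chunks.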

ebo avatar Jan 20 '19 22:01 ebo

I just made a fork of dask-image and will add some stuff for review.

As a note, I have about 4 different efforts going, and I am not sure what is working and what is not (sorry, this is a relatively low priority at the moment). That said, I have a couple of things that may be of interest:

  1. I have ported watershed_ift from scipy.ndimage.measurement into dask-image.measurement. I am not sure that this was the best code base to port the watershed algorithm from, but as I recall I got it basically working, though not fully tested.

  2. I have a version of scikit-image.morphology.watershed.pyx that I ported from Cython to numba. This appears to be a better starting point, and with the numba work it should run a lot faster, but it has not been daskified.

I will add both of these to new branches on my fork so that we can share and play with them, but my work with numba raises the question of whether we should also look at numbifying the basic algorithms to get the speed way up. Suggestions? Will people also be looking at numba integration?

Hope this helps...

EBo --

ebo avatar Jan 30 '19 20:01 ebo

I have not seen anything on the sprint. I guess I was not plugged in appropriately. Any news on what's happening?

How long will it be continuing, and is there a list of things to work on?

I posted a hack on the watershed algorithm and another with numba. Is any of that of interest?

As a note, I am spinning up on a new project and have limited time at the moment. That said, I will try to spend a day or two helping as I can.

ebo avatar Feb 06 '19 06:02 ebo

@ebo I am interested in a parallel watershed. Do you have an (even partially written) dask-based solution?

I currently use skimage.segmentation.watershed with map_blocks, just to get a baseline. I shed small objects, but they are dense enough to hit the chunk edges so I'd like to think about better solutions.
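A rough sketch of that kind of baseline is below. It is not the actual code in use: the marker selection (thresholding the per-block distance transform), the helper name `watershed_block`, and the synthetic mask are all assumptions made for illustration.

```python
import numpy as np
import dask.array as da
from scipy import ndimage as ndi
from skimage.segmentation import watershed

def watershed_block(mask):
    """Watershed one in-memory block of a binary mask."""
    mask = mask.astype(bool)
    if not mask.any():
        return np.zeros(mask.shape, dtype=np.int32)
    distance = ndi.distance_transform_edt(mask)
    # Crude marker choice (an assumption): pixels deeper than half the block maximum.
    markers, _ = ndi.label(distance > 0.5 * distance.max())
    return watershed(-distance, markers, mask=mask).astype(np.int32)

mask = np.zeros((64, 128), dtype=bool)
mask[10:30, 10:30] = True   # object fully inside the first chunk
mask[20:50, 54:74] = True   # object straddling the chunk boundary at column 64

dmask = da.from_array(mask, chunks=(64, 64))

# Each chunk is segmented independently, with no knowledge of its neighbors.
labels = dmask.map_blocks(watershed_block, dtype=np.int32).compute()
```

Because every block is labeled on its own, label values restart in each chunk and an object that crosses a chunk edge is cut in two, which is exactly the chunk-edge problem described above.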

chrisroat avatar Jun 04 '20 07:06 chrisroat

Hi @chrisroat, @ebo has some watershed-related work here that you might like to look at. As you can tell from the discussion, there's still a lot of unsolved stuff to work through, but perhaps you'll still find it interesting to read.

GenevieveBuckley avatar Jun 04 '20 08:06 GenevieveBuckley

The last time I worked on it, the new code was not ready yet. The very old code that was working was written for a project where I would have to go through an extensive review to get permission to release it.
Even though I had planned to pick it up during the shelter-in-place, there are at least 5 remaining projects that have to get done first (but most of them are short)... What is the timeline on your project?

ebo avatar Jun 05 '20 05:06 ebo

Our project is ongoing, and we add improvements as we get them. No rush at all. I'll check out that other discussion -- there might be something there that works well enough for our data.

chrisroat avatar Jun 07 '20 23:06 chrisroat