dask-image icon indicating copy to clipboard operation
dask-image copied to clipboard

Make rotate available

Open martinschorb opened this issue 3 years ago • 5 comments

This adds the rotate function to ndimage.

I am struggling with the tests. https://github.com/dask/dask-image/issues/210

martinschorb avatar May 06 '21 10:05 martinschorb

cc @m-albert (in case this is of interest 🙂)

jakirkham avatar May 06 '21 16:05 jakirkham

@martinschorb Thanks a lot for sharing your dask implementation of rotate and this PR!

It's great that you added tests. I'm copying some of your comments from https://github.com/dask/dask-image/issues/210 here to address them in this thread.

I have some weird behaviour that I don't fully understand (this is my first time using pytest). when I do not pass output_chunks on to affine_transform the test run fine, except for the timeouts. I always get Unknown pytest.mark.timeout - is this a typo?. Something seems to be missing in my pytest configuration because the timeout tests run fine with your affine_transform.

I had a look at the code and think that the problem regarding the timeouts is the following: rotate in it's scipy implementation, as opposed to affine_transform, doesn't allow setting an output_shape. Therefore, in the test test_rotate_large_input_small_output_cpu it takes very long to create a large output array that is of the same size as the input array but with chunksize 1. In the analogous affine_transform test, the size of the output array is set to [1,1,1] and therefore there's no time spent on creating a large output array.

These two ways of going ahead would come to my mind:

  1. Stick to the scipy API and don't support differently shaped output arrays (and exclude the above test) or
  2. Include an output_shape argument to dask_image.ndinterp.rotate as this could be a useful functionality.

I'd tend to the second option, as anyways the dask_image API doesn't exactly mirror the scipy.ndimage API here. On the other hand, somebody requiring this functionality could simply use affine_transform with equivalent parameters.

when I pass output_chunks without any modifications, many tests fail: The reason for this: the images from ndimage and dask-image simply are not the same. There is a very small number of voxels that significantly differ. Why is that the case, when the only thing that differs is output_chunks being present?

I looked into the tests and found that the problem comes from affine_transform. Namely, the choice of transformation parameters in your test seem to require a slightly larger slice of the input array for a given output chunk. Despite testing using different matrices, this hadn't been caught by existing tests.

I fixed this behaviour here https://github.com/dask/dask-image/pull/216 and improved the affine_transform tests. If you include the small change the PR introduces, these tests should pass for you, too.

m-albert avatar May 07 '21 10:05 m-albert

Hi Marvin,

  • Stick to the scipy API and don't support differently shaped output arrays (and exclude the above test) or

  • Include an output_shape argument to dask_image.ndinterp.rotate as this could be a useful functionality.

The reason why there is no way of explicitly setting output_shape is that for rotate there are two main use cases. These are defined by the reshape parameter and it will set the output size to what is required to keep all the transformed image in the FOV.

Probably we could deal with both options and also enable output_shape (that would include checking the offset calculation). Then I would think of a good way of also testing for the reshape parameter.

What is the proper way of doing this? I change it in my repo and create a new PR?

Cheers

martinschorb avatar May 07 '21 10:05 martinschorb

Hi Martin,

The reason why there is no way of explicitly setting output_shape is that for rotate there are two main use cases. These are defined by the reshape parameter and it will set the output size to what is required to keep all the transformed image in the FOV.

True that reshape affects the output shape.

Probably we could deal with both options and also enable output_shape (that would include checking the offset calculation). Then I would think of a good way of also testing for the reshape parameter.

I agree that the best would probably be to have both arguments there and functional. Of course, conflicting parameters such as reshape=True together with a given output_shape would need to lead to an exception.

What is the proper way of doing this? I change it in my repo and create a new PR?

Any changes and commits to your branch martinschorb:main are automatically included here in the PR. Is this what you meant?

One thing I'm wondering about is whether the testing strategy is the most convenient like this. Given that a large part of the tests actually tests the functionality of affine_transform, maybe there'd be a way of reducing this overlap. This would come in handy especially as parallel PRs such as https://github.com/dask/dask-image/pull/215 are currently improving affine_transform and also significantly changing its tests. Hmm not sure yet what the best solution here would be.

Cheers!

m-albert avatar May 07 '21 11:05 m-albert

I am still getting slightly wrong results when comparing an image that is transformed with the output_shape set and the respective sub-area of a transformed image without. It looks like some interpolation is screwing up things...

from scipy import  misc
from dask_image.ndinterp import rotate
img = da.from_array(misc.ascent())
img = img[::4,::4]

im2 = rotate(img,angle)

output_shape = (9,9)
cx=img.shape[0]/2
cy=img.shape[1]/2
rx=output_shape[0]/2
ry=output_shape[1]/2

im3 = rotate(img,angle,output_shape = output_shape)
im4 = im2[cx-np.floor(rx):cx+np.ceil(rx),cy-np.floor(ry):cy+np.ceil(ry)]

these have clearly different values and it is not just a translation of pixels...

I'd love to have this checked by a test as well, but first need to understand where this difference comes from.

martinschorb avatar May 10 '21 15:05 martinschorb

Hey @martinschorb, I just found back our conversation here :)

It seems that back then I was a bit confused on what was going on with the output_shape parameter, I'm sorry for that!

m-albert avatar Mar 14 '23 17:03 m-albert

Can one of the admins verify this patch?

Admins can comment ok to test to allow this one PR to run or add to allowlist to allow all future PRs from the same author to run.

GPUtester avatar Mar 14 '23 17:03 GPUtester

these have clearly different values and it is not just a translation of pixels... I'd love to have this checked by a test as well, but first need to understand where this difference comes from.

I just ran your nice code example and realised that in the case of even values for output_shape, im3 and im4 are identical as expected. In the case of odd values, I think the rotated image is simply not directly comparable to im2 sliced by integer values? So I suspect your implementation of rotate actually behaves as expected.

m-albert avatar Mar 14 '23 18:03 m-albert

add to allowlist

GenevieveBuckley avatar Mar 15 '23 23:03 GenevieveBuckley

I've resolved some minor conflicts between this PR branch and the dask-image main branch.

GenevieveBuckley avatar Mar 15 '23 23:03 GenevieveBuckley

Thanks a lot @GenevieveBuckley!

I'll try to have a look at what's up with the failing tests in the next days.

m-albert avatar Mar 21 '23 09:03 m-albert

Hi all, thanks for this, it looks great, and I'm already using it in my workflow :)

I see that not passing the output_chunks parameter yields just one big chunk for the rotate operation even the input array was chunked (so no parallelization).

Any thoughts about adding:

    if output_chunks is None:
        output_chunks = input_arr.chunksize

gcaria avatar Oct 27 '23 10:10 gcaria

@gcaria I think that's a good idea! In the case of dask_image.ndinterp.affine_transform the shapes of the input and output images can strongly vary, and there's no reason to assume that input and output would have the same chunksizes. However, in the case of rotate the input and output stacks have similar sizes and assuming the input and output chunksizes to be the same seems reasonable to me.

@martinschorb How's your availability and willingness to continue working on this PR? I think we're not far from finishing a useful addition to dask-image and it'd be great to continue with it one way or another.

m-albert avatar Oct 27 '23 17:10 m-albert

@gcaria I think that's a good idea! In the case of dask_image.ndinterp.affine_transform the shapes of the input and output images can strongly vary, and there's no reason to assume that input and output would have the same chunksizes. However, in the case of rotate the input and output stacks have similar sizes and assuming the input and output chunksizes to be the same seems reasonable to me.

@martinschorb How's your availability and willingness to continue working on this PR? I think we're not far from finishing a useful addition to dask-image and it'd be great to continue with it one way or another.

Hi @m-albert , will look into it. Let's see how we can get this through...

martinschorb avatar Nov 06 '23 10:11 martinschorb

Now it needs 5 more tests to get everything covered. Very simple cases, working on it ...

martinschorb avatar Nov 06 '23 12:11 martinschorb

Great to see all the progress on this :) Just FYI I'm traveling this week and will be happy to give feedback beginning of next week.

m-albert avatar Nov 07 '23 13:11 m-albert

these details are still not perfect:

  • output parameter: I have implemented the dtype option here only. I don't fully understand how to place the result into an existing dask array. Also, the original implementation only works if the provided output array has the exact shape of the rotation result.
  • order and prefilter parameters are restricted. This is now explained in the docs.

martinschorb avatar Nov 08 '23 08:11 martinschorb

I have no idea what is the problem with the GPU CI...

https://gpuci.gpuopenanalytics.com/job/dask/job/dask-image/job/prb/job/dask-image-prb/169/CUDA_VER=11.5.2,LINUX_VER=ubuntu20.04,PYTHON_VER=3.10/console#L783

tests/test_dask_image/test_ndfilters/test_cupy_ndfilters.py::test_cupy_order[rank_filter-5-3] Fatal Python error: Aborted

This line/file was never touched in any of the commits.

martinschorb avatar Nov 08 '23 12:11 martinschorb

I wonder if that CI failure is related to the change in versions here: https://github.com/dask/dask-image/pull/345

In the commits there I don't see the gpuCI build check...

martinschorb avatar Nov 09 '23 08:11 martinschorb

Thanks for all your work here @martinschorb.

I have no idea what is the problem with the GPU CI... This line/file was never touched in any of the commits.

I think the GPU CI problems are unrelated to this PR and might relate to the intermittent failures observed by @GenevieveBuckley here. I tested this PR branch with cupy installed and all tests are passing locally for me. I think we can ignore this GPU CI failure in the context of this PR.

Regarding the code:

output parameter: I have implemented the dtype option here only. I don't fully understand how to place the result into an existing dask array. Also, the original implementation only works if the provided output array has the exact shape of the rotation result.

Supporting an output array into which the result is directly written is probably of limited use here and I'm not sure if it's even feasible. I think you're right in that determining an output dtype is related to this. However, since 1) conversion to a different dtype can be easily achieved by casting the output array and 2) to be consistent with dask_image.ndinterp.affine_transform I suggest to leave out this parameter completely.

Another parameter we can remove is output_shape. This way we're consistent with scipy.ndimage.rotate (and don't interfere with the reshape argument).

m-albert avatar Nov 13 '23 17:11 m-albert

@martinschorb, if you'd like I could work on my comments above. Would you agree to collaborate on this PR? If you agree I'd push related changes here directly and combine them with your work. We're almost there to get this through :)

m-albert avatar Feb 20 '24 15:02 m-albert

Yes, let's get this done.

martinschorb avatar Feb 21 '24 09:02 martinschorb

Alright, I took out the parameters discussed above and outsourced others to affine_transform to simplify testing and propagation of potential future changes.

All tests are passing (the failing doc test is unrelated to this PR) and in my view we're ready to go.

m-albert avatar Feb 21 '24 13:02 m-albert

Merging now 🎉

Thank you for your wonderful contribution to dask_image @martinschorb.

Also thank you to @gcaria for your feedback and a great suggestion regarding output chunks.

m-albert avatar Feb 21 '24 13:02 m-albert