MONAI
MONAI copied to clipboard
Consolidation of spatial transform parameters
Introduction
Before we hit 1.0, it may be worth looking across the set of transforms that we have and making parameters consistent where we can do so. Below is a list of transforms (columns) and their construction parameters.
In general:
- Any parameters that we add can be done so without issue
- Any renamed parameters should be kept for 1.0 and the old ones deprecated for future removal
What is still missing
This table considers the __init__
methods of the various spatial transforms. There are also some inconsistencies across the __call__
methods, in terms of what can be modified on a per call basis, which can also potentially be added.
See also
#5010 as adding / consolidating parameters is closely tied to the idea of unifying the back-end resample operation across spatial transforms
Table details
The table is split up into transform specific parameters and general parameters that apply to many or all spatial transforms.
Entries are as follows:
-
i
: parameter is on__init__
only -
c
: parameter is on__call__
only -
x
: parameter is on both__init__
and__call__
-
?
: parameter is not present but perhaps should be
S O F R R Z R R R R R R R A R A R R R R G R
p r l e o o o a a a a a e f a f a a a a r a
a i i s t o t n n n n n s f n f n n n n i n
c e p i a m a d d d d d a i d i d d d d d d
i n z t t R R F A Z m n A n A D 2 3 D G
n t e e e o o l x o p e f e f e D D i r
g a 9 t t i i o l f G f f E E s i
t 0 a a p s m e i r i o l l t d
i t t F n i n r a a o D
o e e l e d e m s s r i
n 9 i G G t t t s
0 p r r i i i t
i i c c o o
d d n r
t
i
o
n
don't transform images x x x
------------------------------------------------------------------------
pixdim i
output_spatial_shape c
diagonal i
axcodes i
as_closest_canonical i
labels i
spatial_axis i i
size_mode i
angle i
zoom i
k i
spatial_axes i i
max_k i
range_x i
range_y i
range_z i
min_zoom i
max_zoom i
rotate_params i i
shear_params i i
translate_params i i
scale_params i i
rotate_range i i i i
shear_range i i i i
translate_range i i i i
scale_range i i i i
spacing i i ?
magnitude_range i i i
sigma_range ? i
num_cells i i
distort_steps x
distort_limit i
mode x x x x ? ? x x x x x . . . x x x x
padding_mode x x x ? ? x x x x x . . . x x x x
align_corners x x x x x x ? ? ? ? ? ? ?
keep_size i i ? ? i x ? ? ? ? ? ? ?
spatial_size i c c c c c x x
anti_aliasing x ? ? ? ? ? ? ?
anti_aliasing_sigma x ? ? ? ? ? ? ?
dtype x x x x i i ? ?
affine i i
as_tensor_output i i i i i i i i
normalized i ?
norm_coords x i ?
device x i i i i i i i i
cache_grid i
grid c c c c
randomize c c c c c c c c c
prob i i i i i ? i i i
General parameters
keep_size:
keep_size
should be added to any transform that has the potential to alter the image
size:
- Rotate90, RandRotate90, Resample, Affine, RandAffine, Rand2DElastic, Rand3DElastic, RandDistortion, RandGridDistortion
align_corners:
align_corners
should be added to any transform that has the potential to alter the image
size:
- Resample, Affine, RandAffine, Rand2DElastic, Rand3DElastic, RandDistortion, RandGridDistortion
mode:
mode
should be added to any transform with the potential to have to resample off pixel / voxel
centers. Rotate90
and RandRotate90
both have this potential when dealing with anisotropic
pixels / voxels
- Rotate90, RandRotate90
-
Note: Making Rotate90 and RandRotate90 perform resampling changes the current behavior
It could be argued that, although these methods generate grids rather than preprocessing images,
mode
could also be defined for them as it might then be subsequently passed to lazy resampling once that feature is implemented: - AffineGrid, RandAffineGrid, RandDeformGrid
padding_mode:
padding_mode
should be added to any transform that has the potential to sample outside of the
existing volume
- Rotate90, RandRotate90
- Note: Making Rotate90 and RandRotate90 perform resampling changes the current behavior
anti_aliasing / anti_aliasing_sigma:
anti_aliasing
/ anti_aliasing_sigma
should be applicable to transforms that support mode
,
particularly if The resampling backend method is consolidated across most spatial transforms.
- Spacing, Zoom, RandZoom, Resample, Affine, RandAffine, GridDistortion, RandGridDistortion
- See also Rand2DElastic, Rand3DElastic, the former of which appears to be missing
sigma_range
norm_coords:
- deprecated on Affine but not Resample
Specific parameters
There are a number of transforms with slight naming inconsistencies in their transform-specific parameter names:
- Flip and RandFlip use
spatial_axis
, whereas Rotate90 and RandRotate90 usespatial_axes
- Flip and RandFlip could move to
spatial_axes
, as multiple axes can be specified
- Flip and RandFlip could move to
- Rotate uses
angle
whereas RandRotate usesrange_x
,range_y
andrange_z
- Although RandRotate does need to take ranges instead of scalars, they could still be
handled in the way
angle
is handled, depending on the dimensionality of the data vs the number of ranges specified
- Although RandRotate does need to take ranges instead of scalars, they could still be
handled in the way
-
anti_aliasing_sigma
vssigma_range
: Resize hasanti_aliasing_sigma
(and a few other transforms should probably have it if we enable this kind of resampling in general), but Rand3DElastic hassigma_range
(and Rand2DElastic should probably have it). This should probably becomeanti_aliasing_sigma_range
Omitted:
- SpatialResample
- ResampleToMatch
- GridSplit
- GridPatch
- RandGridPatch
thanks, @atbenmurray this looks nice, I can help fix some of them in the next couple of weeks. just to note that - anti-aliasing prefiltering is only needed when downsampling, and rotate90/randrotate90 may not require interpolation...
thanks, @atbenmurray this looks nice, I can help fix some of them in the next couple of weeks. just to note that - anti-aliasing prefiltering is only needed when downsampling, and rotate90/randrotate90 may not require interpolation.
For some of this, I'm thinking in terms of how we prepare for lazy resampling and whether we can sort out the APIs now so that they need minimal further changes. I'll finish populating the initial feature request today.
EDIT: still to tidy this up.
@ericspod @Nic-Ma @rijobro @wyli What do you think? It would be nice (and achievable) to get parameters consolidated before 1.0, but having gone through this exercise, I wonder how much value there really is:
- additions can be done whenever without issue
- parameter name replacements are a relatively small in number; doing them for 1.0 at least sets the api and allows us to warn 1.0 users that the deprecations and pending removals are already in place What do you think?
I think any effort we can do to harmonise our transforms would be beneficial. It's trivial refactoring, but I imagine it would take a while to do them all. If we were to do it, it would ideally be in 1.0 so that users get used to these parameters, but might be cutting it too fine to the deadline?
Really useful table by the way!
thanks, @atbenmurray this looks nice, I can help fix some of them in the next couple of weeks. just to note that - anti-aliasing prefiltering is only needed when downsampling, and rotate90/randrotate90 may not require interpolation...
Sure, but I think that if you can parameterise a transform such that the operation can sometimes benefit from those parameters, they should be a part of the parameter list for the transform. We can always document under what circumstances using the parameter will or will not have an effect
I think this is overall good but first reducing the number of transforms that we have by condensing similar ones would help. How we do lazy resampling later will have to take into account the sort of operations that we expect transforms to do, for example Rotate90
I wouldn't have a mode
parameter for because I would explicitly expect the transform to be a matrix operation. I don't want this transform to resample data but just reshape/flip/transpose/etc. data around, this would be critical to certain sorts that resampling would impact. Our lazy evaluation mode is going to have to operate in stages where it composes contiguous sequences of affine, integer affine (eg. flip, transpose, rotate90), and non-affine transforms together but applies them in separate stages.
I think this is overall good but first reducing the number of transforms that we have by condensing similar ones would help. How we do lazy resampling later will have to take into account the sort of operations that we expect transforms to do, for example
Rotate90
I wouldn't have amode
parameter for because I would explicitly expect the transform to be a matrix operation. I don't want this transform to resample data but just reshape/flip/transpose/etc. data around, this would be critical to certain sorts that resampling would impact. Our lazy evaluation mode is going to have to operate in stages where it composes contiguous sequences of affine, integer affine (eg. flip, transpose, rotate90), and non-affine transforms together but applies them in separate stages.
Certainly. All is up for discussion. I was talking with @rijobro on Tuesday about this and he pointed out that Rotate90
doesn't take pixdim
into account, which means that volumes with anisotropic volumes can be deformed by Rotate90
. I did the experiment and affine
is changed by the rotation, but pixdim
is not, meaning that anisotropic data doesn't need a resample but at the cost of something quite unintuitive.
Perhaps the solution here is to add a flag to Rotate90
that means it takes pixdim
into account and does the more expensive operation.
Of course, I'm seeing all of this from a lazy resampling perspective, but for me Rotate90
is unexpected behavior in its current form, albeit driven by a desire for performance.
On Rotate90 I think we were talking at slight cross purposes, @atbenmurray. Rotate90 doesn't resample, but it does update the metadata (see here). Using Rotate90 doesn't deform images in real world space, but since networks work in image space, you're right that there would be an apparent deformation in this space. I don't think this is a bug though, I think it is to be expected. If a user wants their network to be ambivalent to 90 degree rotations, then the user should probably resample to isotropic spacing.
- [x] review backends update docstrings
- [ ] review arg names
spatial_axis
,spatial_axes
- [ ] keep size=True Rotate90 RandRotate90
- [x]
self.__call__
additional args from__init__
method - [x] Move internal transforms constructed during
__call__
to__init__
- [ ] review arg names: angle vs range_x, range_y, range_z
@ericspod @mingxin-zheng @Nic-Ma @rijobro @wyli I'm going to pick up the following:
-
self.__call__
additional args from init method - Move internal transforms constructed during
__call__
to__init__
Unless anyone objects or is already working on it
I'm working on:
- review backends update docstrings (https://github.com/Project-MONAI/MONAI/pull/5126)
@wyli @ericspod @rijobro It looks like the majority of the proposed changes for adding in call time parameters and keep_size have too much impact on the code base in terms of potential breakage (particularly around inverses). I've had a chat with @ericspod and @rijobro and we think it is too close to release to be able to do properly. @wyli do you object or is that ok?
sure, thanks for letting me know, perhaps we move this ticket to a discussion now? and create separate ones to track the remaining tasks in the backlog.