Coordinate transformations refactoring
The new coordinate transformations system is explained in this talk https://youtu.be/7HzMn_ooJmk?t=1391 from minute 23:11 to 31:45.
Pragmatically, it requires the following:
Data model
- [ ]
elements.attrs['coordinate_system'](or maybe with the shorter name'cs'), will be added; this will be a single 'string' - [ ]
elements.attrs['transform'] will be removed; the information on coordinate transformation will be moved to theSpatialData` level; transformations will be defined between coordinate systems, not between elements and coordinate systems, coordinate systems and elements, nor elements and elements - [ ] the above changes require raster types to use
xarraycoordinates - [ ] there will be a helper function to convert back and forth between
BaseTransformation(and so alsoNGFFBaseTransformation) andxarraycoordinates. How coordinates are converted needs to be agreed on (discussion here https://github.com/scverse/spatialdata/issues/165#issuecomment-1939260813). - [ ] on disk,
xarraycoordinates will not be saved (this is not supported by NGFF); instead the above conversion will be used
APIs
- [ ] there will be a helper function to transfer the coordinate transformations metadata easily between
SpatialDataobjects. In fact, elements will not contain coordinate transformations anymore. - [ ]
SpatialData.transform_element_to_coordinate_system()will be incorporated intoSpatialData.transform_to_coordinate_system(), by the use of extra arguments. Also it will become a wrapper aroundtransform(). - [ ]
transform(data, transformation, to_coordinate_system)will have the following behavior- [ ] if
transformationis passed,to_coordinate_systemmust beNone. This will modify the data in-place, and not involve/modify any coordinate system transformation. It's ok to call this function if there is only one coordinate system, but if multiple coordinate systems are being used, this usage will probably be undesirable, so we should raise a warning. - [ ] if
transformationisNone,to_coordinate_systemmust be provided. This will transform the data to the desired coordinate system (=change the xarray coordinates, e.g. translations/scale; and also change the data when required, e.g. rotation); also the coordinate system metadata for the element will be updated.
- [ ] if
- [ ] The parameter
maintain_positioningmust be removed, because now we can't modify one element and compensate the transformations departing from it with the inverse transformations. In fact, now the transformations will involve the whole coordinate system. We can consider adding an helper function that mimics the current behavior ofmaintain_positioning
File format
- [ ] The NGFF transformation specification is not merged yet; check if we have to update the io functions to the new draft specs (or if it is merged to main by that time, implement the final version of the specs).
Release
- [ ] We need to add deprecation notices prior to release; this refactoring will introduce some breaking changes.
CC @melonora
- [ ]
get_centroids()(https://github.com/scverse/spatialdata/pull/465) needs to be adjusted when the xarray coordinates are not starting from 0.5 (or 0, after https://github.com/scverse/spatialdata/issues/165 is solved), but are implicitly taking into account for the translation.
I am not sure if deepcopy() for SpatialImages will preserve the xarray coordinates, my guess it's that it will reset the xyz coordiantes (the c are passed explicitly); multiscale spatial image currently doesn't call the parser so it should be fine.
- [ ] we should check this after this is implemented.
The NGFF transformations specification is going to be merged within a few months! While reviewing the respective RFC, I will list in this comment all the points that needs to be addressed from the spatialdata side to ensure full compliance. These points could be addressed as part of a PR for this issue, or a follow-up/pre PR.
coordinate systems
- [ ] add validation code to ensure that there are no duplicate coordinate systems; two coordinate systems are unique if they have the same name and same axes. Two coordinate systems with the same name and different axes should not exist and should trigger an exception/warning.
axes
- [ ] The unit is not saved.
- [ ] The unit should be validated, only the names approved in the specification should be allowed ('micrometer', 'millimiter', ...)
- [ ] The axis type should be parsed and be read-write; we won't use it in the code
intrinsic coordinate system
- [ ] the name of the coordinate system is going to be the element path; still not clear from the specification if it is going to be a relative or absolute path. Probably relative. This could create some problems in the implementation if we move transformations around, we need to check this.
- [ ] the axis names for the intrinsic coordinate systems should be
dim_i, as inxarray. - [ ] the
arrayCoordinateSystemfield is not implemented for array types is not implemented.
pixels
- [ ] the pixel center should be
(0, 0), and the pixel rectangle should be the half-open interval[-0.5, 0.5) x [-0.5, 0.5).
coordinate transformations: missing implementations
Implementations are required only to work with Identity, Translation and Scale, so we do not need to implement the following. Still, listing all the points here for completeness.
- [ ] inverseOf is not implemented
- [ ] rotation not implemented
- [ ] displacements not implemented
- [ ] coordinates not implemented
- [ ] bijection not implemented
- [ ] byDimension implemented in the
NgffTransformationsclasses, but not in thespatialdata.transformationsclasses
coordinate transformations: main major points
The points in this paragraph require some deep changes that maybe it's worth to make directly in a package upstream rather than in the spatialdata.
- [ ] in the specification, the input and output coordinate systems of identity, translation, scale, rotation do not need to be identical. In spatialdata they must.
- [ ] in spatialdata we restrict to axes that are named after
{'c', 'x', 'y', 'z'}. Also we treat the'c'differently from'x','y','z'. For instance we can't define a translation involving'c'. - [ ] the NGFF validation of
mapAxistransformations is a bit more strict, I don't remember exactly the difference (please look at the code)
coordinate transformations: other major points
- [ ] ransformations between different images MUST be stored in the attributes of a parent zarr group.
- [ ] Luca: this means transformations between two extrinsic coordinate systems, but I think if it is allowed to have transformations between two intrinsic coordinate systems as well (i.e. between coordinate systems named after NGFF paths). This has an implication: if the array is replaced, the transformation may become inconsistent, the user should be careful.
coordinate transformations: other minor details
- [ ] coordinate transformations may have the field "name", we currently do not parse it.
- [ ] "name" is unique across transformations (within the same object I think)
- [ ] translation, scale, affine, rotation do not support the storage of the vector on disk; currently we expect the data to be always available in the json
- [ ] For transformations that store data or parameters in a zarr array those zarr arrays SHOULD be stored in a zarr group
"coordinateTransformations". - [ ] This is relevant for coordinates and displacements: if an operation is requested that requires the inverse of a transformation that can not be inverted in closed-form, implementations MAY estimate an inverse, or MAY output a warning that the requested operation is unsupported.
- [ ] Values in the
scalearray SHOULD be non-zero. We could raise a warning when we detect a zero (it would be a not very useful transformation, and not invertible).
coordinate transformation, verify that we are compliant
- [ ] sequence transformations do not require input and output, verify that we can handle this case
- [ ] inverseOf transformations do not require input and output, verify that we can handle this case
- [ ]
Coordinate transformations from array to physical coordinates MUST be stored in multiscales - [ ]
and MUST be duplicated in the attributes of the zarr array - [ ] Transformations in the
transformationslist of abyDimensionstransformation MUST provideinputandoutputas arrays of strings corresponding to axis names of the parent transformation's input and output coordinate systems (see below for details). - [ ] in NGFF, the affine matrices are nested but without the last row of
[0, 0, ..., 0, 1]. In memory we store also that row. We should:- [ ] verify that we write the correct matrix on disk
- [ ] enrich the error that the user gets if the last row is omitted, so that it's clear that they need to add it.
- [ ] I made some comments on the requirements of the byDimension constrains for the axes. It's better to double check if something changes in the specs. Currently we are compliant.
- [x] Implementations MAY choose to communicate if and when an image can be displayed in multiple coordinate systems. Users might choose between different options, or software could choose a default (e.g. the first listed coordinate system).
things that need to be clarified
- [ ] it's not entirely clear to me how multiscale transformations will be encoded, see discussion here: https://github.com/ome/ngff/pull/255#issuecomment-2329497720.
- [x] can an
Identityhave input and output axes that are different (assuming they have the same dim)? Answer: yes.