Transformation inconsistency between on-disk and in-memory usage

Open quentinblampey opened this issue 1 year ago • 1 comment

Hi @LucaMarconato,

I found an edge case that can lead to errors when inverting transformations. It only happens when the SpatialData object is written to disk.

TL;DR: when a 2D transformation is saved to disk and the element has a "z" column, it is read back as a 3D transformation, even though z was never used. If this transformation (which is supposed to be 2D) is then assigned to 2D shapes and the data is written again, the stored transformation ends up with a 2D input but a 3D output. This doesn't happen in memory, because the transformation stays 2D.

Reproducing the issue:

import spatialdata
from spatialdata.transformations import Affine, set_transformation, get_transformation

sdata = spatialdata.datasets.blobs()

# I'm adding a dummy z-column
sdata["blobs_points"]["z"] = 1

# creating a dummy transformation (note that I'm not using z!)
affine = Affine([[0.2, 0, 100], [0, 0.2, 600], [0, 0, 1]], ["x", "y"], ["x", "y"])
set_transformation(sdata["blobs_points"], affine, "micron")

# now, I'm saving my sdata on disk for later use
sdata.write("test.zarr", overwrite=True)
sdata = spatialdata.read_zarr("test.zarr") # later on, I'm loading it back

# I'm loading back my sdata object, and I want to assign the same transformation to my circles
# Note that, since I saved sdata on disk, the affine transformation now contains the z-axis...
set_transformation(sdata["blobs_circles"], get_transformation(sdata["blobs_points"], get_all=True), set_all=True)

# saving the object again, for later use (overwrite=True so the script can be re-run)
sdata.write("test2.zarr", overwrite=True)
sdata = spatialdata.read_zarr("test2.zarr") # later on, I'm loading it back

Now, the transformation of the circles looks wrong:

>>> sdata["blobs_circles"].attrs

{'transform': {'global': Identity ,
  'micron': Affine (x, y -> z, x, y)
      [0. 0. 0.]
      [  0.2   0.  100. ]
      [0.e+00 2.e-01 6.e+02]
      [0. 0. 1.]}}

For instance, it can't be inverted, so this raises an error:

sdata["blobs_circles"].attrs["transform"]["micron"].inverse()
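To make the problem concrete, here is a minimal sketch in plain Python (no spatialdata calls) that reuses the matrix from the printout above. The `apply` helper is hypothetical and only for illustration. It shows that the matrix is 4x3 in homogeneous coordinates, so it is not square, which is most likely why the inversion fails, and that every 2D point is sent to z = 0:

```python
# Matrix copied from the printout above, in homogeneous coordinates.
# Columns are the inputs (x, y, 1); rows are the outputs (z, x, y, 1).
matrix = [
    [0.0, 0.0, 0.0],
    [0.2, 0.0, 100.0],
    [0.0, 0.2, 600.0],
    [0.0, 0.0, 1.0],
]


def apply(matrix, point):
    """Apply a homogeneous affine matrix to a point (hypothetical helper)."""
    homogeneous = list(point) + [1.0]
    result = [sum(m * p for m, p in zip(row, homogeneous)) for row in matrix]
    return result[:-1]  # drop the trailing homogeneous coordinate


n_rows, n_cols = len(matrix), len(matrix[0])
is_square = n_rows == n_cols

# A 2D point (x, y) goes in, a 3D point (z, x, y) comes out.
mapped = apply(matrix, [10.0, 20.0])

print(f"shape: {n_rows}x{n_cols}, square: {is_square}")
print("mapped point (z, x, y):", mapped)
```

Since the input has two axes and the output has three, there is no square matrix to invert unless the unused z output is dropped first.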

Oct 29 '24 10:10 quentinblampey

Thanks for reporting. I plan to refactor the coordinate transformations soon, so I will take this into account. The extra dimension could be discarded automatically and safely, because its row in the matrix contains only zeros. For the moment, I would call to_affine(input_axes=('x', 'y'), output_axes=('x', 'y')) manually to remove the ambiguity.
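As a rough illustration of why discarding the extra dimension is enough here, the sketch below works on the raw matrix from the issue in plain Python. It does not use the spatialdata API, and both `drop_zero_output_axes` and `invert_affine_2d` are hypothetical helpers written for this example. It assumes that an all-zero output row means the axis is unused, which holds for this matrix but would need more care in general:

```python
# Matrix from the issue: inputs (x, y, 1) as columns, outputs (z, x, y, 1) as rows.
matrix = [
    [0.0, 0.0, 0.0],
    [0.2, 0.0, 100.0],
    [0.0, 0.2, 600.0],
    [0.0, 0.0, 1.0],
]
output_axes = ["z", "x", "y"]


def drop_zero_output_axes(matrix, output_axes):
    """Remove output rows that contain only zeros (hypothetical helper)."""
    kept_rows, kept_axes = [], []
    for row, axis in zip(matrix[:-1], output_axes):
        if any(value != 0 for value in row):
            kept_rows.append(row)
            kept_axes.append(axis)
    kept_rows.append(matrix[-1])  # keep the homogeneous row
    return kept_rows, kept_axes


def invert_affine_2d(m):
    """Invert a 3x3 homogeneous 2D affine matrix (hypothetical helper)."""
    (a, b, tx), (c, d, ty), _ = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("the linear part is singular")
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    return [
        [ia, ib, -(ia * tx + ib * ty)],
        [ic, id_, -(ic * tx + id_ * ty)],
        [0.0, 0.0, 1.0],
    ]


reduced, reduced_axes = drop_zero_output_axes(matrix, output_axes)
inverse = invert_affine_2d(reduced)

# Map a transformed point back to its original (x, y) coordinates.
x_out, y_out = 102.0, 604.0
x_in = inverse[0][0] * x_out + inverse[0][1] * y_out + inverse[0][2]
y_in = inverse[1][0] * x_out + inverse[1][1] * y_out + inverse[1][2]
print(reduced_axes, (x_in, y_in))
```

Once the z row is gone, the remaining 3x3 matrix is the original 2D affine and can be inverted as usual.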

Jan 05 '25 22:01 LucaMarconato