Spacingd transform possibly resampling incorrectly
Describe the bug
I want to resample 3D microscopy data to be able to pass (almost) isotropic data through a 3D U-Net. I get data resampled along the wrong axes (to my understanding) and by the wrong factors.
To Reproduce
This highly simplified example shows the (to me) unexpected behavior:
# Imports
import torch
from monai.data import MetaTensor
from monai.transforms import Spacing, Spacingd
# Parameters
in_tensor_size = (20, 100, 100)  # inferred from the shapes printed below
in_pixdim = [1.0, 0.25, 0.25]
out_pixdim = [0.25, 0.25, 0.25]
# Create data
image = torch.zeros(in_tensor_size, dtype=torch.float32)
image = MetaTensor(image, meta={'pixdim': in_pixdim})
label = torch.zeros(in_tensor_size, dtype=torch.int32)
label = MetaTensor(label, meta={'pixdim': in_pixdim})
data = {"image": image, "label": label}
# Check shapes and metadata (is space = 'RAS' correct for microscopy?)
image.shape, image.meta, label.shape, label.meta
(torch.Size([20, 100, 100]),
{'pixdim': [1.0, 0.25, 0.25],
affine: tensor([[1., 0., 0., 0.],
[0., 1., 0., 0.],
[0., 0., 1., 0.],
[0., 0., 0., 1.]], dtype=torch.float64),
space: RAS},
torch.Size([20, 100, 100]),
{'pixdim': [1.0, 0.25, 0.25],
affine: tensor([[1., 0., 0., 0.],
[0., 1., 0., 0.],
[0., 0., 1., 0.],
[0., 0., 0., 1.]], dtype=torch.float64),
space: RAS})
# Set up transforms
sp_d = Spacingd(keys=["image", "label"], pixdim=out_pixdim, mode=("bilinear", "nearest"),)
sp = Spacing(pixdim=out_pixdim, mode="bilinear",)
# Apply transforms
sp_data = sp_d(data)
sp_image = sp(image)
# Check outputs
sp_data["image"].shape, sp_data["image"].meta, sp_data["label"].shape, sp_data["label"].meta
(torch.Size([20, **397, 397**]),
{'pixdim': **[1.0, 0.25, 0.25]**,
affine: tensor([[0.2500, 0.0000, 0.0000, 0.0000],
[0.0000, 0.2500, 0.0000, 0.0000],
[0.0000, 0.0000, 1.0000, 0.0000],
[0.0000, 0.0000, 0.0000, 1.0000]], dtype=torch.float64),
space: RAS},
torch.Size([20, **397, 397**]),
{'pixdim': **[1.0, 0.25, 0.25]**,
affine: tensor([[0.2500, 0.0000, 0.0000, 0.0000],
[0.0000, 0.2500, 0.0000, 0.0000],
[0.0000, 0.0000, 1.0000, 0.0000],
[0.0000, 0.0000, 0.0000, 1.0000]], dtype=torch.float64),
space: RAS})
sp_image.shape, sp_image.meta
(torch.Size([20, **397, 397**]),
{'pixdim': **[1.0, 0.25, 0.25]**,
affine: tensor([[0.2500, 0.0000, 0.0000, 0.0000],
[0.0000, 0.2500, 0.0000, 0.0000],
[0.0000, 0.0000, 1.0000, 0.0000],
[0.0000, 0.0000, 0.0000, 1.0000]], dtype=torch.float64),
space: RAS})
# Try with another pair of in_pixdim and out_pixdim
in_pixdim = [1.0, 0.25, 0.25]
out_pixdim = [1.0, 0.25, 0.25] # No change
# ... same code to transform the data ...
# Check outputs
sp_data["image"].shape, sp_data["image"].meta, sp_data["label"].shape, sp_data["label"].meta
(torch.Size([20, 100, **397**]),
{'pixdim': **[1.0, 0.25, 0.25]**,
affine: tensor([[1.0000, 0.0000, 0.0000, 0.0000],
[0.0000, 0.2500, 0.0000, 0.0000],
[0.0000, 0.0000, 1.0000, 0.0000],
[0.0000, 0.0000, 0.0000, 1.0000]], dtype=torch.float64),
space: RAS},
torch.Size([20, 100, **397**]),
{'pixdim': **[1.0, 0.25, 0.25]**,
affine: tensor([[1.0000, 0.0000, 0.0000, 0.0000],
[0.0000, 0.2500, 0.0000, 0.0000],
[0.0000, 0.0000, 1.0000, 0.0000],
[0.0000, 0.0000, 0.0000, 1.0000]], dtype=torch.float64),
space: RAS})
sp_image.shape, sp_image.meta
(torch.Size([20, 100, **397**]),
{'pixdim': **[1.0, 0.25, 0.25]**,
affine: tensor([[1.0000, 0.0000, 0.0000, 0.0000],
[0.0000, 0.2500, 0.0000, 0.0000],
[0.0000, 0.0000, 1.0000, 0.0000],
[0.0000, 0.0000, 0.0000, 1.0000]], dtype=torch.float64),
space: RAS})
Expected behavior
In the first case, the 'pixdim' metadata value of the upscaled data should be [0.25, 0.25, 0.25] instead of [1.0, 0.25, 0.25], and the shape should be [80, 100, 100] instead of [20, 397, 397].
In the second case, the 'pixdim' metadata is correct (I guess this never gets changed), but the shape should be [20, 100, 100] (as the input) instead of [20, 100, 397]. Why did the x dimension get upscaled?
Environment
Please see below.
Ensuring you use the relevant python executable, please paste the output of:
python -c "import monai; monai.config.print_debug_info()"
================================ Printing MONAI config...
MONAI version: 1.3.0
Numpy version: 1.26.3
Pytorch version: 2.2.0+cu121
MONAI flags: HAS_EXT = False, USE_COMPILED = False, USE_META_DICT = False
MONAI rev id: 865972f7a791bf7b42efbcd87c8402bd865b329e
MONAI file: /home/
Optional dependencies:
Pytorch Ignite version: NOT INSTALLED or UNKNOWN VERSION.
ITK version: NOT INSTALLED or UNKNOWN VERSION.
Nibabel version: 5.2.0
scikit-image version: 0.22.0
scipy version: 1.12.0
Pillow version: 10.2.0
Tensorboard version: 2.15.1
gdown version: NOT INSTALLED or UNKNOWN VERSION.
TorchVision version: 0.17.0+cu121
tqdm version: 4.66.1
lmdb version: NOT INSTALLED or UNKNOWN VERSION.
psutil version: 5.9.8
pandas version: 2.2.0
einops version: NOT INSTALLED or UNKNOWN VERSION.
transformers version: NOT INSTALLED or UNKNOWN VERSION.
mlflow version: NOT INSTALLED or UNKNOWN VERSION.
pynrrd version: 1.0.0
clearml version: NOT INSTALLED or UNKNOWN VERSION.
For details about installing the optional dependencies, please visit: https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies
================================ Printing system config...
System: Linux
Linux version: Ubuntu 23.10
Platform: Linux-6.5.0-25-generic-x86_64-with-glibc2.38
Processor: x86_64
Machine: x86_64
Python version: 3.11.5
Process name: python
Command: ['python', '-c', 'import monai; monai.config.print_debug_info()']
Open files: []
Num physical CPUs: 4
Num logical CPUs: 8
Num usable CPUs: 8
CPU usage (%): [16.6, 11.4, 24.2, 45.3, 14.4, 14.0, 12.3, 64.2]
CPU freq. (MHz): 3075
Load avg. in last 1, 5, 15 mins (%): [11.7, 7.3, 7.3]
Disk usage (%): 81.8
Avg. sensor temp. (Celsius): UNKNOWN for given OS
Total physical memory (GB): 125.5
Available memory (GB): 105.9
Used memory (GB): 17.2
================================ Printing GPU config...
Num GPUs: 1
Has CUDA: True
CUDA version: 12.1
cuDNN enabled: True
NVIDIA_TF32_OVERRIDE: None
TORCH_ALLOW_TF32_CUBLAS_OVERRIDE: None
cuDNN version: 8902
Current device: 0
Library compiled for CUDA architectures: ['sm_50', 'sm_60', 'sm_70', 'sm_75', 'sm_80', 'sm_86', 'sm_90']
GPU 0 Name: Quadro RTX 5000
GPU 0 Is integrated: False
GPU 0 Is multi GPU board: False
GPU 0 Multi processor count: 48
GPU 0 Total memory (GB): 15.7
GPU 0 CUDA capability (maj.min): 7.5
Hi @aarpon, thanks for posting this. I'll take a quick look and get back to you
Hi @aarpon, I've gone over your example and I think I understand the problem.
There are two fixes that you need to make to the example:
- Add a channel dimension to image and label.
- Assign a homogeneous matrix containing the pixel dimensions to affine, instead of assigning 'pixdim' to the MetaTensor metadata.
Add a channel dimension
MONAI transforms expect spatial data to have a channel dimension followed by the spatial dimensions. Your tensors in this example should have a shape of (1, 20, 100, 100).
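For example, one minimal way to add the leading channel dimension to the tensors from your snippet:
image = image.unsqueeze(0)  # (20, 100, 100) -> (1, 20, 100, 100)
label = label.unsqueeze(0)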
Assign to affine instead of 'pixdim'
The pixdim property of MetaTensor gets its values by picking the appropriate entries out of MetaTensor.affine, rather than from a 'pixdim' field in the metadata. You can generate the affine for a given set of pixel dimensions as follows:
p = (1.0, 0.25, 0.25)
affine = torch.tensor(
    [
        [p[0], 0.0, 0.0, 0.0],
        [0.0, p[1], 0.0, 0.0],
        [0.0, 0.0, p[2], 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]
)
You can then assign it to your MetaTensor as follows:
image = MetaTensor(image, affine=affine)
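Putting both fixes together, here is a rough end-to-end sketch using the sizes from your example (the exact output size along the first spatial axis may come out a voxel or two smaller, depending on how the transform rounds the output size):
import torch
from monai.data import MetaTensor
from monai.transforms import Spacingd

# Pixel dimensions go on the diagonal of the homogeneous affine.
affine = torch.diag(torch.tensor([1.0, 0.25, 0.25, 1.0]))
image = MetaTensor(torch.zeros((1, 20, 100, 100), dtype=torch.float32), affine=affine)
label = MetaTensor(torch.zeros((1, 20, 100, 100), dtype=torch.int32), affine=affine)
sp_d = Spacingd(keys=["image", "label"], pixdim=[0.25, 0.25, 0.25], mode=("bilinear", "nearest"))
out = sp_d({"image": image, "label": label})
out["image"].shape  # approximately (1, 80, 100, 100), with the output affine diagonal [0.25, 0.25, 0.25, 1.0]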
This should produce the expected output. Please let me know if this solves your problem or whether there is still an issue.
Thanks, Ben
Thanks, @atbenmurray! Your suggestion indeed seems to work fine.
Just one curiosity: if I start again with the same tensor with shape (1, 20, 100, 100) and affine.diag() = [1.0, 0.25, 0.25, 1.0], and:
- I resample it with Spacingd(..., pixdim=[0.25, 0.25, 0.25], ...), I obtain a tensor with shape (1, 77, 100, 100) instead of the expected (1, 80, 100, 100);
- I reshape it with Spacingd(..., pixdim=[1.0, 1.0, 1.0], ...), I obtain a tensor with shape (1, 20, 26, 26) instead of the expected (1, 20, 25, 25).
Is this just some rounding error somewhere?
Thanks again!
@aarpon my apologies for the slow response; I didn't have much keyboard time yesterday.
There is indeed an issue with Spacing/Spacingd in this regard. If it is practical for you, I recommend using Resize/Resized instead.
If all of your samples are the same size, you can simply replace Spacing with Resize, passing in the desired size. If each of your samples is a different size, you might have to create a wrapper function/class that creates a Resize object for each call. Please let me know if you need more advice about how to do this.
Hi @atbenmurray,
no worries, thanks for following up. I ultimately ended up writing my own CustomResampler and CustomResamplerd that directly use torch.nn.functional.interpolate, since, in addition, I was struggling with the fact that during training the affine matrix of each of the MetaTensors in a batch would go from shape (4, 4) to shape (batch, 4, 4), and this would cause Spacingd to fail because it expected a 2D affine matrix but was getting a 3D one. I don't know whether this has to do with my DataLoaders using list_data_collate as collate_fn, but I couldn't figure out what was causing it and ended up writing my own transforms. Any idea what the 3D affine matrix could be due to?
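For reference, a hypothetical minimal sketch of such a dictionary-based resampler built directly on torch.nn.functional.interpolate; the class name, constructor arguments, and scale-factor logic here are illustrative, not the actual CustomResamplerd implementation:
import torch
import torch.nn.functional as F

class CustomResamplerd:
    def __init__(self, keys, in_pixdim, out_pixdim, modes=("trilinear", "nearest")):
        self.keys = keys
        self.modes = modes
        # Per-axis zoom factor: ratio of input to output voxel spacing.
        self.factors = [i / o for i, o in zip(in_pixdim, out_pixdim)]

    def __call__(self, data):
        d = dict(data)
        for key, mode in zip(self.keys, self.modes):
            x = d[key]
            # interpolate expects (batch, channel, D, H, W), so add a batch dim
            # and work in float, then restore the original dtype.
            y = F.interpolate(x.unsqueeze(0).float(), scale_factor=self.factors, mode=mode)
            d[key] = y.squeeze(0).to(x.dtype)
        return d
Usage would be along the lines of CustomResamplerd(keys=["image", "label"], in_pixdim=[1.0, 0.25, 0.25], out_pixdim=[0.25, 0.25, 0.25]) applied to the same data dictionary as above.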
@aarpon, sorry for leaving this dangling. Can you give me a minimal code sample that replicates the behaviour?
Sorry for the late reply. I switched to my own resampler a long time ago. In any case, I think the issue was that after Spacingd(), I was using RandCropByPosNegLabeld(), and this (again, not sure) was adding a batch dimension to the affine matrix.