
Define the origin w.r.t. the pixel corner or center

Open constantinpape opened this issue 2 years ago • 10 comments

The image specification (multiscales) must define the origin, either w.r.t. pixel center or corner.

@lassoan summarized this well in https://github.com/ome/ngff/pull/85#issuecomment-1026030204:

In 2D applications using one of the pixel corners as origin is probably more common, but in 3D imaging the voxel coordinate system origin is typically the center. In VTK, ITK, and all applications based on these toolkits the origin is in the pixel center. In all 3D image file formats that I know of (nrrd, nifti, metaio, dicom) the origin is in the pixel center, too. The NGFF standard must specify this (if it hasn't been specified already). If any software uses a different coordinate system convention internally, then it can convert to/from the standard coordinate system when reading/writing images.

So I think we should use the pixel center as origin here as well. Does anyone want to make a small PR to add this, @jbms @d-v-b @lassoan @thewtex? Otherwise I can give this a shot once #85 is merged.

constantinpape avatar Feb 01 '22 10:02 constantinpape

So I think we should use the pixel center as origin here as well.

Yes, agreed.

bogovicj avatar Feb 01 '22 13:02 bogovicj

For reference, here is how ITK defines image geometry (origin, spacing, directions): https://itk.org/ITKSoftwareGuide/html/Book1/ITKSoftwareGuide-Book1ch4.html#x45-540004.1.4

lassoan avatar Feb 01 '22 17:02 lassoan

Thanks for providing the link @lassoan! I have read this now, and in light of yesterday's discussion I think that we should not add it to 0.4, but instead tackle this in 0.5, given that @bogovicj will also make the definition of spaces and data-spaces much more explicit than it currently is. With this the origin definition will make much more sense.

If anyone thinks we need a sentence on it right now, please go ahead and propose changes in a PR; I am not opposed to adding it for v0.4 (or to make a v0.4.1 with just that change), but I can't find a good place to put this in right now without the more explicit space definition.

constantinpape avatar Feb 02 '22 08:02 constantinpape

So I think we should use the pixel center as origin here as well.

Yes, agreed.

Besides, there is only one center, whereas there are 4, 8, or more corners depending on the dimensionality, so one would have to specify which corner.

xulman avatar Feb 02 '22 09:02 xulman

The corner would surely be the start (i.e. top left) along each dimension, so I don't think there is an issue there.

Though I suppose you could in theory allow a choice of center vs "corner" (i.e. boundary) independently for each dimension.

I have a vague intuition that if the data will be displayed using interpolation then center is a more natural choice, while if the data will be displayed without interpolation (i.e. "pixelated") then corner may be a more natural choice.

For example:

Suppose we have a 1-d array of size 10 at a resolution of 4nm.

If we say integer coordinates are at the corner, then our data corresponds to the continuous physical range [0nm, 40nm] assuming we don't apply any translation.

If we instead had an array of size 5 at a resolution of 8nm, then our data would still correspond to the continuous physical range [0nm, 40nm].

If we say integer coordinates are at the center, then our data corresponds to the continuous physical range [-2nm, 38nm] if we don't apply any translation. If we instead had an array of size 5 at a resolution of 8nm, then we would instead get by default a continuous physical range [-4nm, 36nm].

On the other hand, if we are always interpolating, and will exclude the outer half of the begin and end pixels, then our array of 10 4nm pixels would correspond to the physical range [0, 36nm] and our array of 5 8nm pixels would correspond to the physical range [0, 32nm] which is maybe a bit more intuitive.
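The arithmetic in this example can be checked with a small sketch. The function name `physical_extent` and the convention labels are made up for illustration and are not part of NGFF or Neuroglancer; the sketch assumes a 1-d array with no translation applied.

```python
def physical_extent(size, resolution, convention):
    """Return (start, end) of the continuous range covered by a 1-d array.

    Hypothetical helper; `convention` is one of:
      "corner"        integer coordinates lie on pixel boundaries
      "center"        integer coordinates lie on pixel centers
      "center_interp" as "center", but only the span between the first
                      and last sample centers is counted
    """
    if convention == "corner":
        return (0.0, float(size * resolution))
    if convention == "center":
        return (-0.5 * resolution, (size - 0.5) * resolution)
    if convention == "center_interp":
        return (0.0, float((size - 1) * resolution))
    raise ValueError(f"unknown convention: {convention!r}")


# The two arrays from the example: 10 samples at 4nm and 5 samples at 8nm.
for size, resolution in [(10, 4), (5, 8)]:
    for convention in ("corner", "center", "center_interp"):
        print(size, resolution, convention,
              physical_extent(size, resolution, convention))
```

Only the corner convention gives the same extent for both resolutions; the two center-based variants shift or shrink the range by a fraction of the pixel size.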

When I made the choice in Neuroglancer to use corner rather than center it was more a natural result of how the transforms were implemented rather than a conscious decision, and then changing it later was not an option for backwards compatibility reasons, so I'm curious what others think about this.

jbms avatar Feb 02 '22 19:02 jbms

Top left does correspond to how most visualization tools orient their rendering. However, bottom left is how most processing tools orient themselves. By using the center, we avoid these issues.

We want to support processing and rendering with these non-"pixelated" methods, e.g. @tpietzsch demoed a splat-type rendering with BigDataViewer. Pixelated, interpolated-pixelated, and non-pixelated rendering, as well as processing, can all be supported when the transform applies to the pixel center. Across all of these use cases, this removes the complexities and dependencies on the size of a pixel and on which corner is taken to be the start.

thewtex avatar Feb 03 '22 13:02 thewtex

@thewtex I'm a bit confused when you say "bottom left". Certainly it is fairly common for the origin of a coordinate space to correspond to the "bottom left" corner of the entire image/screen, i.e. the x axis goes left to right, and the y axis goes bottom to top. That is how OpenGL window coordinates are defined, for example.

However, I think here we are talking about the origin within an individual pixel/voxel, not of the entire coordinate space.

Let's say we have a zarr array of shape (4, 3). The point labeled O is contained within pixel (0, 0), while the point labeled Z is contained within pixel (3, 2).

A      B     C     D      E
 +-----+-----+-----+-----+
 |     |     |     |     |
 |  O  |  P  |  Q  |  R  |
 |     |     |     |     |
N+-----+-----+-----+-----+F
 |     |     |     |     |
 |  S  |  T  |  U  |  V  |
 |     |     |     |     |
M+-----+-----+-----+-----+G
 |     |     |     |     |
 |  W  |  X  |  Y  |  Z  |
 |     |     |     |     |
 +-----+-----+-----+-----+
L      K     J     I      H

In terms of the continuous coordinate space, I would say reasonable choices of the (0, 0) origin are A and O.

I think the choice of pixel origin is independent of the choice of which screen direction should by default correspond to each dimension of the coordinate space; that could perhaps be indicated by separate metadata. For example, if the diagram were flipped vertically, but we are still assuming that point O is contained in pixel (0, 0) of the zarr array:

L      K     J     I      H
 +-----+-----+-----+-----+
 |     |     |     |     |
 |  W  |  X  |  Y  |  Z  |
 |     |     |     |     |
M+-----+-----+-----+-----+G
 |     |     |     |     |
 |  S  |  T  |  U  |  V  |
 |     |     |     |     |
N+-----+-----+-----+-----+F
 |     |     |     |     |
 |  O  |  P  |  Q  |  R  |
 |     |     |     |     |
 +-----+-----+-----+-----+
A      B     C     D      E

Then I would still say the reasonable choices of the (0, 0) origin in the continuous coordinate space are A and O.

jbms avatar Feb 03 '22 14:02 jbms

In radiology (and in 3D medical imaging software libraries and applications I know), the debates around pixel corner/center took place in the early 2000s and the community standardized on pixel center. The decision was not contested later and in general everyone is happy with it.

An image is a continuous signal that can be reconstructed flawlessly from the discrete samples stored in the voxel values (as long as the Nyquist sampling criterion was respected). Therefore pixelated display is unnecessary, and it can also simply be considered incorrect, because we know that the original signal can be reconstructed using a low-pass filter, yet we construct a signal using a zero-order hold. Some people switch to pixelated display because they want to see the voxel boundaries, but this goal can be achieved much better by overlaying a grid (the grid always shows voxel boundaries clearly, regardless of brightness/contrast settings and intensity difference between neighbor voxels).

then changing it later was not an option for backwards compatibility reasons, so I'm curious what others think about this.

I can tell you about a similar issue that we have in 3D Slicer. At the time when the application was designed, image axis orientation in radiology software was still not standardized. Slicer chose RAS, while over the years the rest of the radiology world ended up using LPS. Slicer kept using RAS for backward compatibility reasons. As time went on, we encountered more and more issues due to this inconsistency, but as more features were added, more data was generated, and the community grew, switching to the standard convention became such a complex and large task that it has remained out of our reach. With several years of careful work we managed to switch to standard LPS in all the files, but internally we still use RAS. This is a source of lots of complications, potential errors, and user confusion. It would have been much better to switch many years ago, when everything was still smaller and simpler.

This is of course just one example, but it illustrates how you can get into a tough spot if your software diverges from common conventions in your field. The change to using pixel center when representing 3D images with a 3D array may look hard and/or unnecessary, but in the long term it will only get harder, and it may turn out that the change is practically unavoidable (because you spend too much time debugging and fixing half-voxel offset errors and explaining to users and developers why they need to add half-voxel offsets here and there when they use or write plugins for your software).
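To make the half-voxel offset concrete, here is a minimal sketch of converting a translation between the two conventions. It assumes an axis-aligned transform of the form `physical = scale * index + translation` with no rotation or shear, and the function names are hypothetical rather than taken from any of the toolkits mentioned in this thread.

```python
def corner_to_center_translation(translation, scale):
    """Translation of the first pixel's center, given that of its corner.

    Assumes per-axis scale and translation only (no rotation or shear).
    """
    return [t + 0.5 * s for t, s in zip(translation, scale)]


def center_to_corner_translation(translation, scale):
    """Inverse of corner_to_center_translation under the same assumptions."""
    return [t - 0.5 * s for t, s in zip(translation, scale)]


# A corner-based tool stores translation (0, 0) for pixels of size 4 x 8;
# a center-based reader should see the first pixel center at (2, 4).
scale = [4.0, 8.0]
center = corner_to_center_translation([0.0, 0.0], scale)
print(center)
print(center_to_corner_translation(center, scale))
```

Software that uses the corner convention internally could apply a conversion like this once when reading or writing, instead of scattering half-voxel corrections through its code.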

lassoan avatar Feb 03 '22 14:02 lassoan

@jbms yes, that's right -- we want to specify O vs A. While we avoid any confusion with the orientation of the coordinate, we also support processing as @lassoan pointed out.

thewtex avatar Feb 04 '22 21:02 thewtex

Here's a blurb I wrote re this definition that I hope will be in the next version:

Coordinate convention

The pixel/voxel center is the origin of the continuous coordinate system.

It is vital to consistently define the relationship between the discrete / array and continuous / interpolated coordinate systems. A pixel / voxel is the continuous region that corresponds to a single sample in the discrete array, i.e., the area corresponding to nearest-neighbor (NN) interpolation of that sample. The center of a 2d pixel corresponding to the origin (0,0) in the discrete array is the origin of the continuous space (0.0, 0.0) (when the transformation is the identity). The continuous rectangle of the pixel is given by the half-open interval [-0.5, 0.5) x [-0.5, 0.5). See chapter 4 and figure 4.1 of the ITK Software Guide [[itk]].
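As a rough illustration of this convention, the sketch below maps a continuous point to the discrete index of the pixel that contains it, and a discrete index back to its continuous bounds. It assumes the identity transformation, and the helper names are invented for this example.

```python
import math


def pixel_containing(point):
    """Discrete index of the pixel whose half-open region contains `point`.

    With pixel centers at integer coordinates, pixel i covers
    [i - 0.5, i + 0.5) along each axis, so the index is floor(c + 0.5).
    """
    return tuple(math.floor(c + 0.5) for c in point)


def pixel_bounds(index):
    """Per-axis (inclusive lower, exclusive upper) bounds of a pixel."""
    return tuple((i - 0.5, i + 0.5) for i in index)


print(pixel_containing((0.0, 0.0)))    # the origin is the center of pixel (0, 0)
print(pixel_containing((-0.5, 0.49)))  # the lower boundary belongs to pixel 0
print(pixel_containing((0.5, 0.0)))    # the upper boundary belongs to pixel 1
print(pixel_bounds((0, 0)))
```

Using `floor(c + 0.5)` rather than Python's built-in `round` is deliberate here: `round` rounds halves to the nearest even integer, which would not match the half-open interval described above.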

bogovicj avatar May 12 '22 21:05 bogovicj