rioxarray icon indicating copy to clipboard operation
rioxarray copied to clipboard

Bug in Geotiff bounds calculation when asymmetric shearing in geotransform

Open lantson opened this issue 8 months ago • 5 comments

Problem description

Hey! I have a Geotiff that I'm reading in using rioxarray.open_rasterio(). When calculating the bounds using .rio.bounds(), I spotted that the returned bounds are not correct for my specific case.

After digging a bit into the source code (rioxarray.py), I saw that in order to calculate the bounds of the Geotiff, the resolution is used and computed using these two functions:

def _affine_has_rotation(affine: Affine) -> bool:
    """
    Determine if the affine has rotation.

    Parameters
    ----------
    affine: :obj:`affine.Affine`
        The affine of the grid.

    Returns
    -------
    bool
    """
    return affine.b == affine.d != 0

def _resolution(affine: Affine) -> Tuple[float, float]:
    """
    Determine if the resolution of the affine.
    If it has rotation, the sign of the resolution is lost.

    Based on: https://github.com/mapbox/rasterio/blob/6185a4e4ad72b5669066d2d5004bf46d94a6d298/rasterio/_base.pyx#L943-L951

    Parameters
    ----------
    affine: :obj:`affine.Affine`
        The affine of the grid.

    Returns
    --------
    x_resolution: float
        The X resolution of the affine.
    y_resolution: float
        The Y resolution of the affine.
    """
    if not _affine_has_rotation(affine):
        return affine.a, affine.e
    return (
        math.sqrt(affine.a**2 + affine.d**2),
        math.sqrt(affine.b**2 + affine.e**2),
    )

Depending on whether the affine matrix (the geotransform) has a rotation or not, the resolution is calculated in a different way. Now, I think that the _affine_has_rotation function is too restrictive because it only returns True when both affine.b and affine.d are non-zero and equal. Currently, it doesn't account for cases where there is shear. In the case of asymmetric shearing in the affine matrix, the function will return False because the affine.b == affine.d != 0 statement will always result in False since the b and d components of the affine matrix are not equal. And I think that this is the reason why the resolution, and therefore the bounds are incorrectly calculated.

I've also compared the way that Rioxarray computes the bounds with the implementation from Rasterio (https://github.com/rasterio/rasterio/blob/6185a4e4ad72b5669066d2d5004bf46d94a6d298/rasterio/_base.pyx#L921-L941) and it seems like the issue only appears in Rioxarray's version of the code:

print(my_image.image.rio.bounds())
# results in: (-106.41784875598337 37.937050158755014 -106.48800896101064 38.011070395055796) # incorrect

with rasterio.open(path_to_my_image) as src:
    print(src.bounds)
# results in: (-106.48800896101064, 37.59948038897249, -105.85343430042428, 38.011070395055796) # correct

I've put a minimal reproducible example below that you can just copy-paste, so it becomes easier to debug. Please let me know if you need more information. Thank you very much for your time and efforts.

Minimal reproducible example

import numpy as np
import rasterio
import rioxarray
from rasterio.transform import Affine

transform = Affine(
    -4.6432961632872404e-05,
    0.0002809429843499661,
    -106.41784875598337,
    -0.00022340818648744067,
    -3.684431871616994e-05,
    38.011070395055796,
    0.0,
    0.0,
    1.0
)

height = 2009
width = 1511

data = np.ones((1, height, width), dtype=np.float32)

profile = {
    'driver': 'GTiff',
    'dtype': rasterio.float32,
    'width': width,
    'height': height,
    'count': 1,
    'crs': 4326,
    'transform': transform
}

with rasterio.open('my_geotif.tif', 'w', **profile) as dst:
    dst.write(data)

incorrect_bounds = rioxarray.open_rasterio('my_geotif.tif').rio.bounds()
print("incorrect_bounds: ", incorrect_bounds)

with rasterio.open('my_geotif.tif', 'r') as src:
    correct_bounds = src.bounds
print("correct bounds:", correct_bounds)

Expected Output

A fix for calculating the bounds in cases where the geotransform has asymmetric shearing.

Environment Information

  • rioxarray version: (0.12.4)
  • rasterio version (1.4.3)
  • GDAL version (3.9.3)
  • Python version (3.10.1)
  • Operation System Information (Linux-6.8.0-48-generic-x86_64-with-glibc2.39)

Installation method

  • pypi

lantson avatar Mar 04 '25 11:03 lantson

@lantson @snowman2

I have noticed, and raised before (somewhere I think) that computing resolution for rotated geoboxes is not done correctly in rasterio and rioxarray.

It works correctly in odc.geo:

import odc.geo.xr

print(rioxarray.open_rasterio('my_geotif.tif').odc.geobox.resolution)
print(rioxarray.open_rasterio('my_geotif.tif').odc.geobox.boundingbox.bbox)
# >>> Resolution(x=0.000228182, y=0.000282562)
# >>> (-106.41784875598337, 37.59948038897249, -105.92359450545156, 38.011070395055796)

but requires a more complex approach of first separating Affine transform into separate Rotation, Shear and Scale affine matrices.

https://github.com/opendatacube/odc-geo/blob/cd54417860c992eff34d3c4ab37c3fdc13caf26f/odc/geo/math.py#L458-L479

def decompose_rws(A: AffineX) -> Tuple[AffineX, AffineX, AffineX]:
    """
     Compute decomposition Affine matrix sans translation into Rotation, Shear and Scale.

     Find matrices ``R,W,S`` such that ``A = R W S`` and

     .. code-block:

        R [ca -sa]  W [1, w]  S [sx,  0]
          [sa  ca]    [0, 1]    [ 0, sy]

    .. note:

       There are ambiguities for negative scales.

       * ``R(90)*S(1,1) == R(-90)*S(-1,-1)``

       * ``(R*(-I))*((-I)*S) == R*S``


     :return: Rotation, Shear, Scale ``2x2`` matrices
    """

Kirill888 avatar Mar 04 '25 21:03 Kirill888

Hey @Kirill888,

Thank you for your response.

It seems that each library produces different results for the bounds. I've plotted the bounding boxes in QGIS alongside the GeoTIFF of interest. It appears that both rioxarray and ODC-Geo produce incorrect results, and that Rasterio is the only one able to correctly determine the bounds. (See image).

So the implementation in ODC-Geo might be broken too?

Image

lantson avatar Mar 05 '25 08:03 lantson

heh, correct, see issue above

this works though:

print(rioxarray.open_rasterio("my_geotif.tif").odc.geobox.extent.boundingbox.bbox)

Kirill888 avatar Mar 05 '25 09:03 Kirill888

I have noticed, and raised before (somewhere I think) that computing resolution for rotated geoboxes is not done correctly in rasterio and rioxarray.

There are indeed a fair number of open issues in rasterio on proper handling of rotated rasters. This one is a somewhat recent summary https://github.com/rasterio/rasterio/issues/3179

Currently, it doesn't account for cases where there is shear. In the case of asymmetric shearing in the affine matrix, the function will return False because the affine.b == affine.d != 0 statement will always result in False since the b and d components of the affine matrix are not equal. And I think that this is the reason why the resolution, and therefore the bounds are incorrectly calculated.

Rasterio is the only one able to correctly determine the bounds.

That said if rasterio is giving the correct approach for this case (and now odc-geo as well), it could be worth modifying these functions in rioxarray to match and seeing that all the tests pass!

scottyhq avatar Apr 09 '25 09:04 scottyhq

It could be worth modifying these functions in rioxarray to match and seeing that all the tests pass!

Contributions welcome!

snowman2 avatar Apr 09 '25 11:04 snowman2