rioxarray
Unexpected behaviour when modifying coords with `assign_coords`
Code Sample, a copy-pastable example if possible
import numpy as np
import rioxarray as rxr
da = rxr.open_rasterio("https://huggingface.co/datasets/Zeel/tmp/resolve/main/1524-1184.tif")
# save locally
da.rio.to_raster("tmp.tif")
# Load
da = rxr.open_rasterio("tmp.tif")
print("Original values", da.x.values[:5])
da = da.assign_coords(x = np.round(da.x, 2))
print("Modified values before saving", da.x.values[:5])
# Save
da.rio.to_raster("tmp2.tif")
# Reload
da = rxr.open_rasterio("tmp2.tif")
print("Modified values after saving and reloading", da.x.values[:5])
Output
Original values [9783942.00780707 9783946.78512134 9783951.56243561 9783956.33974987
9783961.11706414]
Modified values before saving [9783942.01 9783946.79 9783951.56 9783956.34 9783961.12]
Modified values after saving and reloading [9783942.01 9783946.7873138 9783951.5646276 9783956.34194139
9783961.11925519]
Expected Output
Original values [9783942.00780707 9783946.78512134 9783951.56243561 9783956.33974987
9783961.11706414]
Modified values before saving [9783942.01 9783946.79 9783951.56 9783956.34 9783961.12]
Modified values after saving and reloading [9783942.01 9783946.79 9783951.56 9783956.34 9783961.12]
Environment Information
Installed fresh in Google Colab with `pip install rioxarray`.
Question
If this is not the recommended way to modify the coordinates, what is the recommended way?
After the coordinates are rounded, the spacing between them (dx) is no longer even:
import numpy
import rioxarray
da = rioxarray.open_rasterio("tmp.tif")
print("Original values", da.x.values[:5])
print("DX", da.x.values[:5]-da.x.values[1:6])
da = da.assign_coords(x = numpy.round(da.x, 2))
print("Modified values before saving", da.x.values[:5])
print("DX", da.x.values[:5]-da.x.values[1:6])
Original values [9783942.00780707 9783946.78512134 9783951.56243561 9783956.33974987
9783961.11706414]
DX [-4.77731427 -4.77731427 -4.77731427 -4.77731427 -4.77731427]
Modified values before saving [9783942.01 9783946.79 9783951.56 9783956.34 9783961.12]
DX [-4.78 -4.77 -4.78 -4.78 -4.77]
After saving the raster, the new coords are again evenly spaced:
# Save
da.rio.to_raster("tmp2.tif")
# Reload
da = rioxarray.open_rasterio("tmp2.tif")
print("Modified values after saving and reloading", da.x.values[:5])
print("DX", da.x.values[:5]-da.x.values[1:6])
Modified values after saving and reloading [9783942.01 9783946.7873138 9783951.5646276 9783956.34194139
9783961.11925519]
DX [-4.7773138 -4.7773138 -4.7773138 -4.7773138 -4.7773138]
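The behaviour above can be imitated without the raster file. The following is a minimal sketch, not rioxarray's actual code: it assumes the writer keeps only the first coordinate and a single step derived from the coordinate extent, and that the reader rebuilds every coordinate from those two numbers. The helper names `regrid_evenly` and `spread`, and the pixel count of 256, are hypothetical; the starting value and step are taken from the output above.

```python
def regrid_evenly(coords):
    """Rebuild coordinates from the first value and one constant step.

    Assumption: this mimics a raster format that stores an origin and a
    pixel size rather than one value per pixel.
    """
    n = len(coords)
    step = (coords[-1] - coords[0]) / (n - 1)
    return [coords[0] + i * step for i in range(n)]


def spread(values):
    """Difference between the largest and smallest neighbouring step."""
    steps = [b - a for a, b in zip(values, values[1:])]
    return max(steps) - min(steps)


# First coordinate and step from the thread; 256 pixels is an assumption.
x0, dx, n = 9783942.00780707, 4.77731427, 256
original = [x0 + i * dx for i in range(n)]
rounded = [round(v, 2) for v in original]   # what assign_coords stored
reloaded = regrid_evenly(rounded)           # what comes back after a save

print("step spread after rounding: ", spread(rounded))
print("step spread after reloading:", spread(reloaded))
print("first five reloaded values: ", reloaded[:5])
```

Under these assumptions the first value survives the round trip, while every later value is recomputed from the constant step, so the rounded values cannot be preserved in the file.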
Thank you for the response, @snowman2! Now, I understand what's going on. Actually, my use case is like the following:
- I am trying to merge multiple `tif` files with `xr.open_mfdataset`. Their coordinates are similar, but floating-point precision results in a non-monotonic final index. So, I wanted to round the coordinates to make sure all similar coordinates become exactly the same. Is there a better way to achieve the same instead of what I have done above?
> Is there a better way to achieve the same instead of what I have done above?
I recommend referring to https://github.com/corteva/rioxarray/blob/fa35e916e41d785b0a57e0d5dce6189660b4ae3d/rioxarray/_io.py#L848-L891.
In that code, it only adds coordinates for one of the data arrays and then the other data arrays in the dataset inherit the coordinates.
Thank you for the reference, @snowman2, but I couldn't fully understand what you are trying to convey. I have multiple clusters of files where, within each cluster, the coordinates are very similar and differ only by floating-point noise. I'd appreciate it a lot if you could provide code or pseudo-code showing how to achieve this.
This sounds related to https://discourse.pangeo.io/t/example-which-highlights-the-limitations-of-netcdf-style-coordinates-for-large-geospatial-rasters/4140
There's a draft PR in xarray with suggestions on how rioxarray could be changed so that it does not materialize coordinates, which is what introduces the floating-point imprecision: https://github.com/pydata/xarray/pull/9543
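For the request above for code or pseudo-code, here is a rough sketch of the "one array provides the coordinates and the others inherit them" idea, applied to clusters of files. It is not taken from rioxarray: the function name `unify_coords` and the tolerance `abs_tol=1e-6` are hypothetical, and plain Python lists stand in for the `x` (or `y`) values of each file.

```python
import math


def unify_coords(coord_sets, abs_tol=1e-6):
    """Give near-identical coordinate lists one shared, bit-identical copy.

    The first list seen in each cluster becomes that cluster's reference;
    later lists that match it within abs_tol (a hypothetical tolerance)
    are replaced by the reference instead of being rounded.
    """
    references = []  # one representative coordinate list per cluster
    unified = []
    for coords in coord_sets:
        for ref in references:
            if len(ref) == len(coords) and all(
                math.isclose(a, b, rel_tol=0.0, abs_tol=abs_tol)
                for a, b in zip(coords, ref)
            ):
                unified.append(ref)  # inherit the reference coordinates
                break
        else:
            ref = list(coords)  # no match: this list starts a new cluster
            references.append(ref)
            unified.append(ref)
    return unified


file_a = [9783942.00780707, 9783946.78512134, 9783951.56243561]
file_b = [v + 3e-9 for v in file_a]      # same grid, floating-point noise
file_c = [v + 1222.99 for v in file_a]   # a different cluster

x_a, x_b, x_c = unify_coords([file_a, file_b, file_c])
print("a and b now identical:", x_a == x_b)
print("c kept separate:      ", x_a != x_c)
```

The unified values could then be written back with `assign_coords` on each dataset before combining, for example from a function passed as `preprocess` to `xr.open_mfdataset`. Because the reference coordinates are copied rather than rounded, their spacing stays as even as it was in the reference file. When all files share indexes of the same size, xarray's `join="override"` option may achieve a similar effect without custom code.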