rioxarray
Writing a large tiff without specifying BIGTIFF="YES" silently fails writing some blocks
Code Sample, a copy-pastable example if possible
A "Minimal, Complete and Verifiable Example" will make it much easier for maintainers to help you: http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports
import numpy as np
import xarray as xr
import dask.array as da
import rioxarray as rio
size = (30_000, 60_000)
data = xr.DataArray(
    data=da.random.random(size),
    coords={'y': np.linspace(0, size[0]*10, size[0]), 'x': np.linspace(0, size[1]*10, size[1])},
    dims=('y', 'x'),
)
data = data.rio.set_crs(3857)
data[::100, ::100].plot()
# you should get something like the image in Expected Output
data.rio.to_raster('test.tif', COMPRESS="DEFLATE")
rio.open_rasterio('test.tif', chunks='auto', parallel=True, lock=False).isel(band=0)[::100, ::100].plot()
# you should get something partial, like the image in Problem Description
Problem description
I came across this issue recently, and it seems to be linked to using COMPRESS="DEFLATE".
When running the code above, saving the image succeeds and no error or warning is raised.
However, upon opening the image, only part of it has been written.
Performing the exact same operation with rasterio instead raises this error:
https://gis.stackexchange.com/questions/368251/error-occurred-while-writing-dirty-block-from-gdalrasterbandirasterio
As the post explains, this is caused by not specifying BIGTIFF="YES".
Expected Output
Either a correctly saved image, or the error being raised
Environment Information
python -c "import rioxarray; rioxarray.show_versions()"
Python version : 3.10.12
Platform : Linux
xarray : 2023.10.1
pandas : 2.1.1
dask : 2023.10.0
numpy : 1.23.4
rasterio : 1.3.9
rioxarray : 0.15.0
geopandas : 0.14.0
shapely : 2.0.2
zarr : 2.16.1
matplotlib : 3.8.0
cartopy : 0.22.0
nbic_utils : 2.0.0
xrutils : 2.0.0
Installation method
pypi
This is likely due to using a dask array when writing as it uses a different writing mechanism. Do you run into this issue with a numpy array?
I have also had this issue. The silent failure seems to be related to using dask.
I think I have seen this with rasterio, too: it will just write 4GB worth of data and the rest is empty.
I am guessing this is related: https://github.com/corteva/rioxarray/issues/220 See: https://corteva.github.io/rioxarray/latest/examples/dask_read_write.html
From: https://gdal.org/drivers/raster/gtiff.html
Default: BIGTIFF=IF_NEEDED
Description: "will only create a BigTIFF if it is clearly needed (in the uncompressed case, and image larger than 4GB. So no effect when using a compression)."
In your example, COMPRESS="DEFLATE". So, you need to set BIGTIFF=YES for it to work successfully. To get a more explicit error message, the change would likely need to happen in GDAL.
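For anyone hitting this, here is a minimal sketch of one way to choose the option up front. The helper name creation_options and the constant CLASSIC_TIFF_LIMIT are hypothetical and not part of rioxarray or GDAL. The sketch assumes the 4GB limit comes from the 32-bit offsets of classic TIFF, and it conservatively forces BIGTIFF="YES" whenever the uncompressed size exceeds that limit, even if compression might bring the file under it.

```python
import math

# Classic (non-Big) TIFF stores 32-bit offsets, so a file cannot exceed 4 GiB.
CLASSIC_TIFF_LIMIT = 4 * 1024**3


def creation_options(shape, itemsize, compress=None):
    """Hypothetical helper: build GTiff creation options for an array.

    shape    -- array shape, e.g. (30_000, 60_000)
    itemsize -- bytes per element, e.g. 8 for float64
    compress -- optional value for the COMPRESS creation option
    """
    uncompressed_bytes = math.prod(shape) * itemsize
    options = {}
    if compress is not None:
        options["COMPRESS"] = compress
    # With compression, BIGTIFF=IF_NEEDED never switches to BigTIFF,
    # so request it explicitly when the raw data would not fit.
    if uncompressed_bytes > CLASSIC_TIFF_LIMIT:
        options["BIGTIFF"] = "YES"
    return options


# The array from the report: 30_000 * 60_000 float64 values = 14.4e9 bytes.
opts = creation_options((30_000, 60_000), itemsize=8, compress="DEFLATE")
print(opts)

# Assumed usage with the DataArray from the report (not executed here):
# data.rio.to_raster("test.tif", **opts)
```

The options are passed to to_raster as keyword arguments, the same way COMPRESS="DEFLATE" is passed in the original example. The GDAL GTiff driver also documents BIGTIFF=IF_SAFER, a heuristic that creates a BigTIFF when the result might exceed 4GB; it may be a reasonable alternative to always forcing YES.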