odc-geo
Spurious NotGeoreferencedWarning during reproject
There is a warning coming out of rasterio when performing a MEM -> MEM reproject; there seem to be no visible issues in the output. It could be an issue inside rasterio, or it could be due to the inputs odc-geo provides to rasterio. It seems to be more common when using Dask, so it could be due to chunking.
Reported in https://github.com/opendatacube/odc-stac/issues/145
Yep, this happens with datacube.load too. It's a really annoying warning, and it seems to be quite random.
I'm getting this as well. Here's an MRE
import numpy as np
import odc.stac
import pystac_client

api_url = "https://earth-search.aws.element84.com/v1"
collection_id = "sentinel-2-c1-l2a"
bbox = np.array([27.68375, 35.875969, 28.247358, 36.458195])

# search the STAC API for matching Sentinel-2 items
client = pystac_client.Client.open(api_url)
search = client.search(
    collections=collection_id,
    datetime="2023-07-01/2023-08-31",
    bbox=bbox,
)
item_collection = search.item_collection()

# lazily load the items as a Dask-backed dataset
ds = odc.stac.load(
    item_collection,
    groupby='solar_day',
    chunks={'x': 2048, 'y': 2048},
    use_overviews=True,
    resolution=20,
    bbox=bbox,
)
ds

red = ds['red']
nir = ds['nir']
scl = ds['scl']

# generate mask ("True" for pixel being cloud or water)
mask = scl.isin([
    3,   # CLOUD_SHADOWS
    6,   # WATER
    8,   # CLOUD_MEDIUM_PROBABILITY
    9,   # CLOUD_HIGH_PROBABILITY
    10,  # THIN_CIRRUS
])
red_masked = red.where(~mask)
nir_masked = nir.where(~mask)

ndvi = (nir_masked - red_masked) / (nir_masked + red_masked)
ndvi_before = ndvi.sel(time="2023-07-13")
ndvi_before.plot()
/Users/ryanavery/test-dask-on-ray/.venv/lib/python3.11/site-packages/rasterio/warp.py:387: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned.
dest = _reproject(
Looking into this with a debugger, I see that eventually, in dask's local.py, only one of the tasks triggers the warning, and it seems to happen when NIR is processed after reading. So the MRE above can be simplified to:
import numpy as np
import odc.stac
import pystac_client

api_url = "https://earth-search.aws.element84.com/v1"
collection_id = "sentinel-2-c1-l2a"
bbox = np.array([27.68375, 35.875969, 28.247358, 36.458195])

client = pystac_client.Client.open(api_url)
search = client.search(
    collections=collection_id,
    datetime="2023-07-01/2023-08-31",
    bbox=bbox,
)
item_collection = search.item_collection()

ds = odc.stac.load(
    item_collection,
    groupby='solar_day',
    chunks={'x': 2048, 'y': 2048},
    use_overviews=True,
    resolution=20,
    bbox=bbox,
)
ds

red = ds['red']
nir = ds['nir']
scl = ds['scl']

# computing the NIR band alone is enough to trigger the warning
nir.compute()
The warning does not occur if I compute the scl time series.
What's really strange is that if I run
scl.compute()
nir.compute()
without restarting the kernel, I don't get the warning. The warning only occurs when running the example with nir.compute() or red.compute() end to end in a fresh kernel. Those two lines need to be run in separate cells for neither to show the warning, probably because of some async behavior.
https://github.com/rasterio/rasterio/issues/2497
Probably related: I think it comes down to rasterio not hiding some warnings in certain circumstances.
I think it's occurring in the Dask COG writer when it opens some MEM datasets. I've been having fun exploring to find it.
Also rasterio only warns once per session if you open a bare new dataset, which might explain the transience
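The once-per-session behaviour would be consistent with Python's default warning filter, which shows a given warning only the first time it is raised from a particular source line. Here is a minimal sketch of that mechanism, assuming (not verified against rasterio's source) that this is what is at play here. `emit` and `count_shown` are hypothetical stand-ins and do not call rasterio at all:

```python
import warnings


def emit():
    # Hypothetical stand-in for the rasterio call that warns;
    # every call warns from the same source line.
    warnings.warn("Dataset has no geotransform, gcps, or rpcs.", UserWarning)


def count_shown(n_calls):
    """Call emit() n_calls times and return how many warnings were shown."""
    with warnings.catch_warnings(record=True) as caught:
        # "default" shows only the first occurrence per (message, category, line)
        warnings.simplefilter("default")
        for _ in range(n_calls):
            emit()
    return len(caught)


shown = count_shown(3)
```

With the "default" action only the first of the three calls is reported, which would explain why a second compute in the same kernel stays quiet while a fresh kernel warns again.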
Probably when creating the output dataset somewhere inside rasterio.
Is there any way we can catch/suppress this inside odc-* and datacube? Although it seems to have no impact, the spam it produces does hurt the user experience, particularly for new/beginner users who freak out when they see any kind of warning...
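Until something like that exists in the libraries, a user-side workaround is to filter the warning category around the call that triggers it. This is a minimal sketch, not what odc-geo or datacube do internally: `quiet_call` is a hypothetical helper, and the fallback class is only there so the snippet runs without rasterio installed. The filter is process-local, so it is unlikely to reach workers of a distributed Dask cluster, and `catch_warnings` is not thread-safe.

```python
import warnings

try:
    from rasterio.errors import NotGeoreferencedWarning
except ImportError:
    # Stand-in so the sketch runs without rasterio (assumption for illustration only)
    class NotGeoreferencedWarning(UserWarning):
        pass


def quiet_call(func, *args, **kwargs):
    """Run func with NotGeoreferencedWarning ignored; other warnings still show."""
    with warnings.catch_warnings():
        warnings.filterwarnings("ignore", category=NotGeoreferencedWarning)
        return func(*args, **kwargs)
```

With the MRE above this would be used as `quiet_call(nir.compute)`. Because the original filters are restored on exit, any later unwrapped call can still show the warning.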
I've been slowly pivoting in trying to find where it might happen, this is the latest clue I had fwiw:
https://gist.github.com/mdsumner/55bb0708c3eeaeeb20a290c96fcc4ce6?permalink_comment_id=5219757#gistcomment-5219757
(It's not a reprex, but any input tif should do.) This has been a good focus for me to explore what's going on in odc 🙏, but ultimately my guess is that rasterio should allow a "warning suppression" option.
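For anyone else hunting for the origin, one option is to escalate the warning to an exception so that Python prints the full call stack instead of the single `warp.py` line. A minimal sketch, where `find_warning_origin` is a hypothetical helper and not part of rasterio, odc or dask:

```python
import traceback
import warnings


def find_warning_origin(func, category=Warning):
    """Run func with `category` escalated to an error.

    Returns the formatted traceback of the first such warning,
    or None if func finished without raising one.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("error", category)
        try:
            func()
        except category:
            return traceback.format_exc()
    return None
```

With the MRE above, something like `find_warning_origin(lambda: nir.compute(scheduler="synchronous"), NotGeoreferencedWarning)` (with `NotGeoreferencedWarning` imported from `rasterio.errors`) should keep everything in one thread, so the traceback shows the whole path from the Dask task into rasterio. This assumes the warning is still raised under the synchronous scheduler, which I have not checked.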