Spurious NotGeoreferencedWarning during reproject

Open Kirill888 opened this issue 1 year ago • 1 comments

rasterio emits a warning when performing a MEM -> MEM reproject, although there are no visible issues in the output. It could be an issue inside rasterio, or it could be caused by the inputs odc-geo provides to rasterio. It seems to be more common when using Dask, so it could be related to chunking.

Reported in https://github.com/opendatacube/odc-stac/issues/145

Kirill888 avatar Feb 23 '24 01:02 Kirill888

Yep, this happens with datacube.load too. It's a really annoying warning, and seems to be quite random.

robbibt avatar Feb 28 '24 23:02 robbibt

I'm getting this as well. Here's an MRE:

import pystac_client
import numpy as np
api_url = "https://earth-search.aws.element84.com/v1"
collection_id = "sentinel-2-c1-l2a"
bbox = np.array([27.68375 , 35.875969, 28.247358, 36.458195])
client = pystac_client.Client.open(api_url)
search = client.search(
    collections=collection_id,
    datetime="2023-07-01/2023-08-31",
    bbox=bbox
)

item_collection = search.item_collection()

import odc.stac
ds = odc.stac.load(
    item_collection,
    groupby='solar_day',
    chunks={'x': 2048, 'y': 2048},
    use_overviews=True,
    resolution=20,
    bbox=bbox,
)

ds

red = ds['red']
nir = ds['nir']
scl = ds['scl']

# generate mask ("True" for pixel being cloud or water)
mask = scl.isin([
    3,  # CLOUD_SHADOWS
    6,  # WATER
    8,  # CLOUD_MEDIUM_PROBABILITY
    9,  # CLOUD_HIGH_PROBABILITY
    10  # THIN_CIRRUS
])
red_masked = red.where(~mask)
nir_masked = nir.where(~mask)

ndvi = (nir_masked - red_masked) / (nir_masked + red_masked)

ndvi_before = ndvi.sel(time="2023-07-13")
ndvi_before.plot()

/Users/ryanavery/test-dask-on-ray/.venv/lib/python3.11/site-packages/rasterio/warp.py:387: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned.
  dest = _reproject(

rbavery avatar Oct 08 '24 19:10 rbavery

Looking into this with a debugger, I see that eventually, in dask locals.py, only one of the tasks triggers the warning, and it seems to happen when NIR is processed after reading. So the MRE above can be simplified to:

import pystac_client
import numpy as np
api_url = "https://earth-search.aws.element84.com/v1"
collection_id = "sentinel-2-c1-l2a"
bbox = np.array([27.68375 , 35.875969, 28.247358, 36.458195])
client = pystac_client.Client.open(api_url)
search = client.search(
    collections=collection_id,
    datetime="2023-07-01/2023-08-31",
    bbox=bbox
)

item_collection = search.item_collection()

import odc.stac
ds = odc.stac.load(
    item_collection,
    groupby='solar_day',
    chunks={'x': 2048, 'y': 2048},
    use_overviews=True,
    resolution=20,
    bbox=bbox,
)

ds

red = ds['red']
nir = ds['nir']
scl = ds['scl']

nir.compute()

The warning does not occur if I compute the scl time series.

What's really strange is that if I run

scl.compute()
nir.compute()

without restarting the kernel, I don't get the warning. The warning only occurs when running the example with nir.compute() or red.compute() end to end in a fresh kernel. Those two lines need to be run in separate cells for neither to show the warning, I think because of some async behavior.

rbavery avatar Oct 08 '24 21:10 rbavery

https://github.com/rasterio/rasterio/issues/2497

Probably related. I think it comes down to Rasterio not hiding some warnings in some specific circumstances.

Kirill888 avatar Oct 09 '24 03:10 Kirill888

I think it's occurring in the dask cog writer when it opens some mem datasets; I've been having fun exploring to find it.

Also, rasterio only warns once per session if you open a bare new dataset, which might explain the transience.
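
One plausible mechanism for that once-per-session behaviour is Python's default warning filter, which shows a given warning only the first time it is raised from a particular code location. That is an assumption here; the suppression could equally come from GDAL or from rasterio itself. A minimal stdlib-only sketch, where emit_not_georeferenced is a hypothetical stand-in for the rasterio call site:

```python
import warnings


def emit_not_georeferenced():
    # Hypothetical stand-in for the rasterio call site; the message text
    # is copied from the warning quoted earlier in this thread.
    warnings.warn("Dataset has no geotransform, gcps, or rpcs.", UserWarning)


with warnings.catch_warnings(record=True) as shown:
    # "default" is Python's standard action: show the first occurrence
    # of a warning for each (message, category, line) location.
    warnings.simplefilter("default")
    emit_not_georeferenced()  # first call from this location: recorded
    emit_not_georeferenced()  # same location again: silently dropped

shown_count = len(shown)
print(shown_count)
```

If this is what happens, computing scl first would use up the single display of the warning, so a later nir.compute() in the same kernel would stay quiet.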

mdsumner avatar Oct 09 '24 03:10 mdsumner

Probably when creating output dataset somewhere inside rasterio

Kirill888 avatar Oct 09 '24 07:10 Kirill888

Is there any way we can catch/suppress this inside odc-* and datacube? Although it seems to have no impact, the spam it produces does impact user experience - particularly for new/beginner users who freak out when they see any kind of warning...

robbibt avatar Oct 09 '24 07:10 robbibt

I've been slowly pivoting in trying to find where it might happen, this is the latest clue I had fwiw:

https://gist.github.com/mdsumner/55bb0708c3eeaeeb20a290c96fcc4ce6?permalink_comment_id=5219757#gistcomment-5219757

(It's not a reprex, but any input tif should do.) This has been a good focus for me to explore what's going on in odc 🙏 - but ultimately my guess is that rasterio should allow a "warning suppression" option.
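
Until such an option exists, a caller-side filter is one possible workaround. The sketch below uses only the standard warnings module and matches on the message text quoted earlier in this thread; quiet_call and NOT_GEOREF_MSG are hypothetical names, not part of rasterio, odc-geo or datacube. It has not been verified against every Dask scheduler: the filter applies to the current process, so workers running in other processes would need their own filter.

```python
import warnings

# Prefix of the warning text quoted earlier in this thread. The message
# argument of filterwarnings is a regex matched at the start of the text.
NOT_GEOREF_MSG = "Dataset has no geotransform"


def quiet_call(func, *args, **kwargs):
    """Hypothetical helper: run func with the not-georeferenced warning ignored."""
    with warnings.catch_warnings():
        # Ignore only warnings whose text starts with the known message;
        # every other warning is still reported as usual.
        warnings.filterwarnings("ignore", message=NOT_GEOREF_MSG)
        return func(*args, **kwargs)
```

With the MRE above this could be used as quiet_call(nir.compute). Filtering on the message rather than a blanket "ignore" keeps unrelated warnings visible.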

mdsumner avatar Oct 09 '24 07:10 mdsumner