stackstac icon indicating copy to clipboard operation
stackstac copied to clipboard

Enable opening datasets with gcps even if no valid crs is present

Open weiji14 opened this issue 2 years ago • 6 comments

Some datasets like Sentinel-1 GRD files don't have a coordinate reference system (crs), but they may have ground control points (gcps) and can still be opened by rasterio.

Test this branch using pip install git+https://github.com/weiji14/stackstac.git@open_vrt_with_gcps

Fixes #181.

weiji14 avatar Sep 23 '22 18:09 weiji14

Probably should write a regression test. @scottyhq, any good sample Sentinel-1 GRD files with a stable link that you know of?

weiji14 avatar Sep 23 '22 18:09 weiji14

Probably should write a regression test. @scottyhq, any good sample Sentinel-1 GRD files with a stable link that you know of?

GRDs are really big. xarray-sentinel has a minimized one here for testing https://github.com/bopen/xarray-sentinel/tree/main/tests/data/S1B_IW_GRDH_1SDV_20210401T052623_20210401T052648_026269_032297_ECC8.SAFE.

Or rioxarray just constructs a synthetic dataset with GCPs for a test: https://github.com/corteva/rioxarray/blob/bfd616594bb82886133b5f4b9356ed16d39014fc/test/integration/test_integration_rioxarray.py#L1053

scottyhq avatar Sep 23 '22 19:09 scottyhq

@weiji14 did you get any further with this? Is this actually working as is now?

gjoseph92 avatar Oct 24 '22 17:10 gjoseph92

@weiji14 did you get any further with this? Is this actually working as is now?

Technically, I think the implementation is sound. But the unit test could be much improved. Ok if you want to merge this first and improve the unit test at a later date.

weiji14 avatar Oct 24 '22 17:10 weiji14

@weiji14 I pushed a change here to avoid mutating spec.vrt_params, since that object could possibly be shared with other dask tasks.

I'm not seeing any new commits in this PR. Could you try again?

To be clear, in your example in #181, you're still getting the purple/green/yellow output? That makes me think this might not be working correctly yet?

If I recall correctly, there's actually 2 things needed to fix the purple/green/yellow output.

  1. Getting the image plotted at (roughly) the correct location, and this requires reading the coordinate reference system (CRS) and ground control points (GCP) which this Pull Request has fixed (at least I think it's implemented correctly).
  2. Resampling/Interpolation - this is something this PR won't be handling, but which the user would still need to set to get a proper Sentinel-1 GRD image plotted (that is not purple/green/yellow).

I'm having trouble with the second part (resampling), which would require someone a bit smarter on the SAR data structure to get right (e.g. @scottyhq). I wish I had a proper end-to-end validated example where someone has taken a Sentinel-1 GRD image (from Microsoft Planetary Computer), read it properly with rasterio (with some set of parameters) and plotted it, but I'm missing that picture right now.

weiji14 avatar Oct 24 '22 20:10 weiji14

Oops maybe I forgot to push. I see them now in https://github.com/gjoseph92/stackstac/pull/182/files/1b3c47c6bd0d3ec9dd17b25dc8cb1c8496a131a7..d7b08203ab754295bf5644fa0a661df9fa94e8b9.

I'd be a little surprised to find that it's just a resampling issue. As long as the data types are right and we're not truncating floats to ints or something, I don't quite see why a different resampling method would change meaningful data into something completely meaningless-looking.

Furthermore, if you look at the example scene from https://github.com/gjoseph92/stackstac/issues/181 in PC explorer, you can see that what we're seeing from stackstac.show also has a different spatial extent, as well as looking meaningless. I suppose that could be a STAC metadata issue like https://github.com/gjoseph92/stackstac/issues/152 though (I haven't looked at the metadata at all).

I guess where I'm leaning right now is that if we can't get this to seem to work properly and return meaningful data, I'd rather raise a NotImplementedError if ds.crs is None than try to use GCPs but possibly return bad values.

gjoseph92 avatar Oct 24 '22 21:10 gjoseph92