stackstac
stackstac copied to clipboard
Enable opening datasets with gcps even if no valid crs is present
Some datasets like Sentinel-1 GRD files don't have a coordinate reference system (crs), but they may have ground control points (gcps) and can still be opened by rasterio.
Test this branch using pip install git+https://github.com/weiji14/stackstac.git@open_vrt_with_gcps
Fixes #181.
Probably should write a regression test. @scottyhq, any good sample Sentinel-1 GRD files with a stable link that you know of?
Probably should write a regression test. @scottyhq, any good sample Sentinel-1 GRD files with a stable link that you know of?
GRDs are really big. xarray-sentinel
has a minimized one here for testing https://github.com/bopen/xarray-sentinel/tree/main/tests/data/S1B_IW_GRDH_1SDV_20210401T052623_20210401T052648_026269_032297_ECC8.SAFE.
Or rioxarray just constructs a synthetic dataset with GCPs for a test: https://github.com/corteva/rioxarray/blob/bfd616594bb82886133b5f4b9356ed16d39014fc/test/integration/test_integration_rioxarray.py#L1053
@weiji14 did you get any further with this? Is this actually working as is now?
@weiji14 did you get any further with this? Is this actually working as is now?
Technically, I think the implementation is sound. But the unit test could be much improved. Ok if you want to merge this first and improve the unit test at a later date.
@weiji14 I pushed a change here to avoid mutating
spec.vrt_params
, since that object could possibly be shared with other dask tasks.
I'm not seeing any new commits in this PR. Could you try again?
To be clear, in your example in #181, you're still getting the purple/green/yellow output? That makes me think this might not be working correctly yet?
If I recall correctly, there's actually 2 things needed to fix the purple/green/yellow output.
- Getting the image plotted at (roughly) the correct location, and this requires reading the coordinate reference system (CRS) and ground control points (GCP) which this Pull Request has fixed (at least I think it's implemented correctly).
- Resampling/Interpolation - this is something this PR won't be handling, but which the user would still need to set to get a proper Sentinel-1 GRD image plotted (that is not purple/green/yellow).
I'm having trouble with the second part (resampling), which would require someone a bit smarter on the SAR data structure to get right (e.g. @scottyhq). I wish I had a proper end-to-end validated example where someone has taken a Sentinel-1 GRD image (from Microsoft Planetary Computer), read it properly with rasterio
(with some set of parameters) and plotted it, but I'm missing that picture right now.
Oops maybe I forgot to push. I see them now in https://github.com/gjoseph92/stackstac/pull/182/files/1b3c47c6bd0d3ec9dd17b25dc8cb1c8496a131a7..d7b08203ab754295bf5644fa0a661df9fa94e8b9.
I'd be a little surprised to find that it's just a resampling issue. As long as the data types are right and we're not truncating floats to ints or something, I don't quite see why a different resampling method would change meaningful data into something completely meaningless-looking.
Furthermore, if you look at the example scene from https://github.com/gjoseph92/stackstac/issues/181 in PC explorer, you can see that what we're seeing from stackstac.show
also has a different spatial extent, as well as looking meaningless. I suppose that could be a STAC metadata issue like https://github.com/gjoseph92/stackstac/issues/152 though (I haven't looked at the metadata at all).
I guess where I'm leaning right now is that if we can't get this to seem to work properly and return meaningful data, I'd rather raise a NotImplementedError
if ds.crs is None
than try to use GCPs but possibly return bad values.