Support opening datasets from STAC items

Open TomAugspurger opened this issue 2 years ago • 2 comments

Currently, users of xarray-sentinel pass the path / URL to a manifest.safe file, which is used to discover everything necessary to build the xarray Dataset.

I'm curious whether xarray-sentinel could work from a STAC item that has all the relevant information and skip reading the manifest.safe file. The STAC items generated by https://github.com/stactools-packages/sentinel5p have almost all the information that's returned by https://github.com/bopen/xarray-sentinel/blob/afd6fa247ff75a851fc0a7d4e18acddbcd4af625/xarray_sentinel/esa_safe.py#L107

{
  "type": "Feature",
  "stac_version": "1.0.0",
  "id": "S1A_IW_GRDH_1SDV_20220110T050922_20220110T050947_041394_04EBF7",
  "properties": {
    "sar:frequency_band": "C",
    "sar:center_frequency": 5.405,
    "sar:observation_direction": "right",
    "sar:instrument_mode": "IW",
    "sar:polarizations": [
      "VV",
      "VH"
    ],
    "sar:product_type": "GRD",
    "sar:resolution_range": 20,
    "sar:resolution_azimuth": 22,
    "sar:pixel_spacing_range": 10,
    "sar:pixel_spacing_azimuth": 10,
    "sar:looks_range": 5,
    "sar:looks_azimuth": 1,
    "sar:looks_equivalent_number": 4.4,
    "sat:platform_international_designator": "2014-016A",
    "sat:orbit_state": "descending",
    "sat:absolute_orbit": 41394,
    "sat:relative_orbit": 22,
    "providers": [
      {
        "name": "ESA",
        "roles": [
          "producer",
          "processor",
          "licensor"
        ],
        "url": "https://earth.esa.int/web/guest/home"
      },
      {
        "name": "Microsoft",
        "roles": [
          "host",
          "processor"
        ],
        "url": "https://planetarycomputer.microsoft.com"
      }
    ],
    "platform": "SENTINEL-1A",
    "constellation": "Sentinel-1",
    "start_datetime": "2022-01-10 05:09:22.400645+00:00",
    "end_datetime": "2022-01-10 05:09:47.399217+00:00",
    "s1:instrument_configuration_ID": "7",
    "s1:datatake_id": "322551",
    "s1:product_timeliness": "NRT-3h",
    "s1:processing_level": "1",
    "s1:resolution": "high",
    "s1:orbit_source": "PREORB",
    "s1:slice_number": "16",
    "s1:total_slices": "30",
    "s1:shape": [
      26568,
      16670
    ],
    "datetime": "2022-01-10T05:09:34.899931Z"
  },
  "geometry": {
    "type": "MultiPolygon",
    "coordinates": [
      [
        [
          [
            18.079359729987647,
            52.05903698503639
          ],
          [
            17.59921344332713,
            52.11788627594409
          ],
          [
            17.21819162612652,
            52.16305711390597
          ],
          [
            16.836227427059505,
            52.20701666249193
          ],
          [
            16.644774503911133,
            52.22855383776378
          ],
          [
            16.26118854691523,
            52.27071035665724
          ],
          [
            15.878162494063371,
            52.31148740960032
          ],
          [
            15.685719218551272,
            52.33147845100228
          ],
          [
            15.299480101836377,
            52.37060255563736
          ],
          [
            14.91542044114062,
            52.408192821065235
          ],
          [
            14.463451466835929,
            52.450717160968054
          ],
          [
            14.372309549546433,
            52.09196118819571
          ],
          [
            14.192949197915077,
            51.37404859428371
          ],
          [
            14.100639411215216,
            51.01536740309548
          ],
          [
            14.087055295808,
            50.9560622660824
          ],
          [
            14.520915321006537,
            50.91384578735798
          ],
          [
            14.897047013730848,
            50.87585740692798
          ],
          [
            15.444371500878706,
            50.818284399642415
          ],
          [
            15.82131687021256,
            50.777071435952195
          ],
          [
            16.00860279082098,
            50.75611375092879
          ],
          [
            16.37858130437305,
            50.71376912044361
          ],
          [
            16.753025078610975,
            50.66964403535887
          ],
          [
            17.122949485399943,
            50.62478467617457
          ],
          [
            17.59017818138203,
            50.566284399727365
          ],
          [
            17.78275560129392,
            51.16327037094983
          ],
          [
            17.841239264737652,
            51.34251038359834
          ],
          [
            17.959758632357495,
            51.70085025688251
          ],
          [
            18.079359729987647,
            52.05903698503639
          ]
        ]
      ]
    ]
  },
  "links": [
    {
      "rel": "license",
      "href": "https://sentinel.esa.int/documents/247904/690755/Sentinel_Data_Legal_Notice"
    },
    {
      "rel": "self",
      "href": "https://sentinel1euwest.blob.core.windows.net/s1-grd-stac/GRD/2022/1/10/IW/DV/S1A_IW_GRDH_1SDV_20220110T050922_20220110T050947_041394_04EBF7.json?sv=2020-10-02&st=2022-02-16T15%3A21%3A45Z&se=2022-02-17T15%3A21%3A45Z&sr=b&sp=r&sig=vzb4ww%2BriTrz7VRAE9U%2Fv3BjqzRjucoce2Kcrjhw9X0%3D",
      "type": "application/json"
    }
  ],
  "assets": {
    "safe-manifest": {
      "href": "https://sentinel1euwest.blob.core.windows.net/s1-grd/GRD/2022/1/10/IW/DV/S1A_IW_GRDH_1SDV_20220110T050922_20220110T050947_041394_04EBF7_A360/manifest.safe",
      "type": "application/xml",
      "title": "Manifest File",
      "description": "General product metadata in XML format. Contains a high-level textual description of the product and references to all of product's components, the product metadata, including the product identification and the resource references, and references to the physical location of each component file contained in the product.",
      "roles": [
        "metadata"
      ]
    },
    "schema-product-vh": {
      "href": "https://sentinel1euwest.blob.core.windows.net/s1-grd/GRD/2022/1/10/IW/DV/S1A_IW_GRDH_1SDV_20220110T050922_20220110T050947_041394_04EBF7_A360/annotation/rfi/rfi-iw-vh.xml",
      "type": "application/xml",
      "title": "Product Schema",
      "description": "Describes the main characteristics corresponding to the band: state of the platform during acquisition, image properties, Doppler information, geographic location, etc.",
      "roles": [
        "metadata"
      ]
    },
    "schema-product-vv": {
      "href": "https://sentinel1euwest.blob.core.windows.net/s1-grd/GRD/2022/1/10/IW/DV/S1A_IW_GRDH_1SDV_20220110T050922_20220110T050947_041394_04EBF7_A360/annotation/rfi/rfi-iw-vv.xml",
      "type": "application/xml",
      "title": "Product Schema",
      "description": "Describes the main characteristics corresponding to the band: state of the platform during acquisition, image properties, Doppler information, geographic location, etc.",
      "roles": [
        "metadata"
      ]
    },
    "schema-calibration-vh": {
      "href": "https://sentinel1euwest.blob.core.windows.net/s1-grd/GRD/2022/1/10/IW/DV/S1A_IW_GRDH_1SDV_20220110T050922_20220110T050947_041394_04EBF7_A360/annotation/calibration/calibration-iw-vh.xml",
      "type": "application/xml",
      "title": "Calibration Schema",
      "description": "Calibration metadata including calibration information and the beta nought, sigma nought, gamma and digital number look-up tables that can be used for absolute product calibration.",
      "roles": [
        "metadata"
      ]
    },
    "schema-calibration-vv": {
      "href": "https://sentinel1euwest.blob.core.windows.net/s1-grd/GRD/2022/1/10/IW/DV/S1A_IW_GRDH_1SDV_20220110T050922_20220110T050947_041394_04EBF7_A360/annotation/calibration/calibration-iw-vv.xml",
      "type": "application/xml",
      "title": "Calibration Schema",
      "description": "Calibration metadata including calibration information and the beta nought, sigma nought, gamma and digital number look-up tables that can be used for absolute product calibration.",
      "roles": [
        "metadata"
      ]
    },
    "schema-noise-vh": {
      "href": "https://sentinel1euwest.blob.core.windows.net/s1-grd/GRD/2022/1/10/IW/DV/S1A_IW_GRDH_1SDV_20220110T050922_20220110T050947_041394_04EBF7_A360/annotation/calibration/noise-iw-vh.xml",
      "type": "application/xml",
      "title": "Noise Schema",
      "description": "Estimated thermal noise look-up tables",
      "roles": [
        "metadata"
      ]
    },
    "schema-noise-vv": {
      "href": "https://sentinel1euwest.blob.core.windows.net/s1-grd/GRD/2022/1/10/IW/DV/S1A_IW_GRDH_1SDV_20220110T050922_20220110T050947_041394_04EBF7_A360/annotation/calibration/noise-iw-vv.xml",
      "type": "application/xml",
      "title": "Noise Schema",
      "description": "Estimated thermal noise look-up tables",
      "roles": [
        "metadata"
      ]
    },
    "thumbnail": {
      "href": "https://sentinel1euwest.blob.core.windows.net/s1-grd/GRD/2022/1/10/IW/DV/S1A_IW_GRDH_1SDV_20220110T050922_20220110T050947_041394_04EBF7_A360/preview/quick-look.png",
      "type": "image/png",
      "title": "Preview Image",
      "description": "An averaged, decimated preview image in PNG format. Single polarisation products are represented with a grey scale image. Dual polarisation products are represented by a single composite colour image in RGB with the red channel (R) representing the  co-polarisation VV or HH), the green channel (G) represents the cross-polarisation (VH or HV) and the blue channel (B) represents the ratio of the cross an co-polarisations.",
      "roles": [
        "thumbnail"
      ]
    },
    "vh": {
      "href": "https://sentinel1euwest.blob.core.windows.net/s1-grd/GRD/2022/1/10/IW/DV/S1A_IW_GRDH_1SDV_20220110T050922_20220110T050947_041394_04EBF7_A360/measurement/iw-vh.tiff",
      "type": "image/tiff; application=geotiff; profile=cloud-optimized",
      "title": "VH",
      "description": "Actual SAR data that have been processed into an image",
      "eo:bands": [
        {
          "name": "VH",
          "description": "VH band: vertical transmit and horizontal receive"
        }
      ],
      "roles": [
        "data"
      ]
    },
    "vv": {
      "href": "https://sentinel1euwest.blob.core.windows.net/s1-grd/GRD/2022/1/10/IW/DV/S1A_IW_GRDH_1SDV_20220110T050922_20220110T050947_041394_04EBF7_A360/measurement/iw-vv.tiff",
      "type": "image/tiff; application=geotiff; profile=cloud-optimized",
      "title": "VV",
      "description": "Actual SAR data that have been processed into an image",
      "eo:bands": [
        {
          "name": "VV",
          "description": "VV band: vertical transmit and vertical receive"
        }
      ],
      "roles": [
        "data"
      ]
    }
  },
  "bbox": [
    14.087055295808,
    50.566284399727365,
    18.079359729987647,
    52.450717160968054
  ],
  "stac_extensions": [
    "https://stac-extensions.github.io/sar/v1.0.0/schema.json",
    "https://stac-extensions.github.io/sat/v1.0.0/schema.json",
    "https://stac-extensions.github.io/eo/v1.0.0/schema.json"
  ]
}

Of the attributes, it's only missing:

  • sat:anx_datetime
  • xs:instrument_mode_swaths

For the files, I need to check exactly what is expected. It seems like https://github.com/bopen/xarray-sentinel/blob/afd6fa247ff75a851fc0a7d4e18acddbcd4af625/xarray_sentinel/esa_safe.py#L158-L160 is parsing the swath, polarization, and start. Most likely that information will come from the assets, but I need to confirm that.
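As a rough illustration of getting that information from the assets, here is a minimal sketch that reads the swath and polarization out of a measurement href from the item above. The helper name `parse_measurement_href` is hypothetical (it is not part of xarray-sentinel), and it assumes the `<swath>-<polarization>.tiff` file naming seen in this item. The start time does not appear in these file names, so it would have to come from elsewhere, for example `start_datetime`.

```python
import posixpath
from urllib.parse import urlparse


def parse_measurement_href(href):
    """Return (swath, polarization) from an href like '.../measurement/iw-vh.tiff'.

    Hypothetical helper: assumes the '<swath>-<polarization>.tiff' naming
    used by the data assets in the STAC item above.
    """
    # urlparse drops any query string (e.g. a SAS token) from the path.
    path = urlparse(href).path
    stem, _ext = posixpath.splitext(posixpath.basename(path))
    swath, polarization = stem.split("-")
    return swath.upper(), polarization.upper()


href = (
    "https://sentinel1euwest.blob.core.windows.net/s1-grd/GRD/2022/1/10/IW/DV/"
    "S1A_IW_GRDH_1SDV_20220110T050922_20220110T050947_041394_04EBF7_A360/"
    "measurement/iw-vh.tiff"
)
print(parse_measurement_href(href))
```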

So I have two questions:

  1. Would xarray-sentinel be interested in supporting reading from STAC items? I could see something like xr.open_dataset(stac_item, engine="sentinel-1") and inferring based on the type, or a separate engine like xr.open_dataset(stac_item, engine="sentinel-1-stac").
  2. If we match the output of https://github.com/bopen/xarray-sentinel/blob/afd6fa247ff75a851fc0a7d4e18acddbcd4af625/xarray_sentinel/esa_safe.py#L107 from a STAC item, would the rest of xarray-sentinel likely work? Or might there be other places relying on the manifest.safe file?
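To make the "inferring based on the type" option in question 1 concrete, here is a minimal sketch of how an input could be classified before deciding how to read it. The function `classify_input` is hypothetical and not part of xarray-sentinel; it assumes a STAC item arrives either as a plain dict or as an object with a `to_dict()` method (as `pystac.Item` has), and that anything path-like points at a manifest.safe file.

```python
import os


def classify_input(obj):
    """Return 'stac-item' or 'safe-manifest' for a user-supplied input.

    Hypothetical helper, not part of xarray-sentinel.
    """
    # Objects such as pystac.Item expose to_dict(); plain dicts are used as-is.
    if hasattr(obj, "to_dict"):
        obj = obj.to_dict()
    if isinstance(obj, dict):
        if obj.get("type") == "Feature" and "assets" in obj:
            return "stac-item"
        raise ValueError("dict input does not look like a STAC item")
    if isinstance(obj, (str, os.PathLike)):
        return "safe-manifest"
    raise TypeError(f"unsupported input type: {type(obj).__name__}")


item = {"type": "Feature", "stac_version": "1.0.0", "assets": {}}
print(classify_input(item))
print(classify_input("path/to/manifest.safe"))  # placeholder path
```

A separate engine name would avoid this kind of inference entirely, at the cost of a second entry point for users to learn.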

TomAugspurger, Feb 16 '22 15:02

@TomAugspurger the essential pieces of information that xarray-sentinel reads from the manifest.safe file are the swath names, the polarization identifiers, the names of the annotation and measurement files, and a way to associate them with swath and polarization. Anything else is nice-to-have metadata. The only exception is the ascendingNodeTime, but it is duplicated in the annotation XML, so that is not a problem either.
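A minimal sketch of how the file-to-polarization association could be rebuilt from STAC assets, assuming the asset keys follow the pattern in the item above (`schema-<kind>-<pol>` for metadata, the bare polarization for data). The helper `group_by_polarization` is hypothetical, and the hrefs are shortened to the part after the product directory for readability.

```python
from collections import defaultdict

# Asset keys modeled on the STAC item above; hrefs shortened for readability.
assets = {
    "safe-manifest": {"href": "manifest.safe"},
    "schema-calibration-vh": {"href": "annotation/calibration/calibration-iw-vh.xml"},
    "schema-calibration-vv": {"href": "annotation/calibration/calibration-iw-vv.xml"},
    "schema-noise-vh": {"href": "annotation/calibration/noise-iw-vh.xml"},
    "schema-noise-vv": {"href": "annotation/calibration/noise-iw-vv.xml"},
    "thumbnail": {"href": "preview/quick-look.png"},
    "vh": {"href": "measurement/iw-vh.tiff"},
    "vv": {"href": "measurement/iw-vv.tiff"},
}


def group_by_polarization(assets):
    """Map polarization -> {kind: href}.

    Hypothetical helper: assumes asset keys end in the polarization and that
    data assets are keyed by the bare polarization.
    """
    groups = defaultdict(dict)
    for key, asset in assets.items():
        prefix, _, pol = key.rpartition("-")
        if pol.lower() not in {"hh", "hv", "vh", "vv"}:
            continue  # skips 'safe-manifest' and 'thumbnail'
        # 'schema-calibration' -> 'calibration'; an empty prefix means data.
        kind = prefix.replace("schema-", "", 1) or "measurement"
        groups[pol.upper()][kind] = asset["href"]
    return dict(groups)


for pol, files in group_by_polarization(assets).items():
    print(pol, files)
```

For this GRD item the swath is not part of the asset keys; it would presumably come from `sar:instrument_mode` or from the file names.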

As a matter of fact, I plan to address #88 before the next release, where I intend to move all attributes from the Dataset to the DataArray and most of the STAC-like attributes to a dedicated group or to an auxiliary variable. This should clear most of the top-level dependencies on the manifest.safe file and make the transition to a STAC item much easier.

To answer your questions:

  1. if the STAC item representation is unique enough, this looks like a very nice enhancement, but I don't see it as a priority at the moment
  2. yes it will (also the definition of the files is arbitrary, so we can make it cleaner if needed)

alexamici, Feb 20 '22 18:02

Thanks for those details. I'll plan to explore this some time in the next few weeks.

TomAugspurger, Feb 20 '22 20:02