
t.stac.import: Add STAC API import functionality

Open cwhite911 opened this issue 1 year ago • 11 comments

I'm implementing STAC API import functionality into GRASS using pystac-client.

Currently, I'm modeling the module off of the STAC API request parameters:

  • collections – List of one or more Collection IDs
  • ids – List of one or more Item ids to filter on.
  • limit – A recommendation to the service as to the number of items to return per page of results. Defaults to 100.
  • max_items – The maximum number of items to return from the search, even if there are more matching results.
  • bbox – A list, tuple, or iterator representing a bounding box of 2D or 3D coordinates. Results will be filtered to only those intersecting the bounding box.
  • intersects – A string or dictionary representing a GeoJSON geometry, or an object that implements a __geo_interface__ property, as supported by several libraries including Shapely, ArcPy, PySAL, and geojson. Results are filtered to only those intersecting the geometry.
  • datetime – Either a single datetime or datetime range used to filter results.
  • query – List or JSON of query parameters as per the STAC API query extension
  • filter – JSON of query parameters as per the STAC API filter extension
  • filter_lang – Language variant used in the filter body. If filter is a dictionary or not provided, defaults to ‘cql2-json’. If filter is a string, defaults to ‘cql2-text’.

Detailed docs at https://pystac-client.readthedocs.io
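As a rough sketch of how these parameters might be gathered before a search, the helper below keeps only the options that were actually set and checks the two spatial filters against each other. The name `build_search_kwargs`, the example URL, and the sample values are hypothetical; the keyword names are the ones listed above, and the result is meant to be passed to the `search` method of a pystac-client `Client`.

```python
from typing import Optional, Sequence


def build_search_kwargs(
    collections: Optional[Sequence[str]] = None,
    ids: Optional[Sequence[str]] = None,
    bbox: Optional[Sequence[float]] = None,
    intersects: Optional[dict] = None,
    datetime: Optional[str] = None,
    limit: int = 100,
    max_items: Optional[int] = None,
) -> dict:
    """Collect STAC search parameters, dropping the ones left unset."""
    if bbox is not None and intersects is not None:
        # The STAC API allows only one spatial filter per request.
        raise ValueError("bbox and intersects are mutually exclusive")
    if bbox is not None and len(bbox) not in (4, 6):
        raise ValueError("bbox needs 4 (2D) or 6 (3D) coordinates")

    candidates = {
        "collections": list(collections) if collections is not None else None,
        "ids": list(ids) if ids is not None else None,
        "bbox": list(bbox) if bbox is not None else None,
        "intersects": intersects,
        "datetime": datetime,
        "limit": limit,
        "max_items": max_items,
    }
    return {key: value for key, value in candidates.items() if value is not None}


# Hypothetical usage with pystac-client (needs network access):
#   from pystac_client import Client
#   client = Client.open("https://example.com/stac")
#   items = client.search(**build_search_kwargs(bbox=[-79, 35, -78, 36])).items()
```

Dropping unset options leaves the service defaults in charge of anything the user did not specify.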

I'm thinking we should use the current computational region as the default bbox to filter the request and provide an option to define a raster or vector to use as the intersects parameter.
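A minimal sketch of the default-bbox idea, assuming the region bounds are already available in WGS84 longitude/latitude as a mapping with `n`, `s`, `e`, and `w` keys. The helper name `region_to_bbox` and the sample values are hypothetical; a region in a projected coordinate system would need to be reprojected before this step.

```python
def region_to_bbox(region: dict) -> list:
    """Return [west, south, east, north] from a region-like mapping.

    Values may be numbers or numeric strings and are assumed to be
    longitude/latitude in degrees.
    """
    west, south = float(region["w"]), float(region["s"])
    east, north = float(region["e"]), float(region["n"])

    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes out of range; reproject the region first")
    if not (-90 <= south <= north <= 90):
        raise ValueError("latitudes out of range or south is above north")

    # STAC bounding boxes are ordered west, south, east, north.
    return [west, south, east, north]


bbox = region_to_bbox({"n": "36.59", "s": "33.83", "e": "-75.38", "w": "-84.33"})
print(bbox)
```

The range checks are only a cheap guard against passing projected coordinates by mistake.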

For outputs, we have a few options.

  1. Import all items individually.
  2. Import a single patched raster.
  3. Import as a STRDS.

Another option: instead of having a single module r.in.stac, we could split it into multiple modules and include specific STAC extension parameters in each:

  • r.in.stac
  • i.in.stac
  • v.in.stac

Thoughts...

cwhite911 avatar Sep 30 '22 18:09 cwhite911

When importing parts of a space-time collection, the corresponding output would be a GRASS space-time dataset, created with a t.* module, e.g. t.in.stac.

metzm avatar Oct 04 '22 07:10 metzm

When importing parts of a space-time collection, the corresponding output would be a GRASS space-time dataset, created with a t.* module, e.g. t.in.stac.

What about simplifying all into:

  • t.rast.in.stac or t.in.stac.rast
  • t.vect.in.stac or t.in.stac.vect

or a toolset t.in.stac with raster and vector options as submodules?

STAC stands for spatio-temporal after all, and that's the beauty of it, no? In any case, big 👍 for this!!

veroandreo avatar Oct 04 '22 11:10 veroandreo

When importing parts of a space-time collection, the corresponding output would be a GRASS space-time dataset, created with a t.* module, e.g. t.in.stac.

What about simplifying all into:

  • t.rast.in.stac or t.in.stac.rast
  • t.vect.in.stac or t.in.stac.vect

or a toolset t.in.stac with raster and vector options as submodules?

STAC stands for spatio-temporal after all, and that's the beauty of it, no? In any case, big 👍 for this!!

Agreed. As for module naming, there is t.rast.import and t.vect.import in core, so it might be more consistent with existing approaches to name modules t.rast.import.stac and t.vect.import.stac (though three dots are not standard either)...

P.S.: Sorry @veroandreo for accidentally editing your comment...

ninsbl avatar Oct 05 '22 08:10 ninsbl

Thanks @veroandreo and @ninsbl for the feedback. We could also call the module t.stac.import with a raster or vector option that uses the sub-modules t.in.stac.rast or t.in.stac.vect in the background.

Another thing to consider is t.out.stac. I've started a proposal for a GRASS STAC extension using the nc_spm_08 sample dataset. I would like to be able to describe a grassdata directory as a STAC Catalog containing location STAC Collections, each of which contains mapset STAC Collections that have assets. This would allow sharing GRASS datasets as explorable metadata that can be viewed in any STAC viewer or OpenPlains.

  • GRASS STAC Extension: https://github.com/tomorrownow/grass-stac-extension

Here are some initial tests, but I'm still working on the first spec and want to get wider community feedback once the initial use cases are fleshed out.

grassdata: STAC Catalog

{
    "type": "Catalog",
    "id": "grassdata",
    "stac_version": "1.0.0",
    "description": "GRASS GIS STAC data catalog used by OpenPlains.",
    "links": [
      {
        "rel": "root",
        "href": "https://example.com/grass_catalog/catalog.json",
        "type": "application/json"
      },
      {
        "rel": "child",
        "href": "https://example.com/grass_catalog/nc_spm_08/collection.json",
        "type": "application/json",
        "title": "nc_spm_08"
      },
      {
        "rel": "self",
        "href": "https://example.com/grass_catalog/catalog.json",
        "type": "application/json"
      }
    ],
    "stac_extensions": []
}

nc_spm_08 location: STAC Collection

{
    "type": "Collection",
    "id": "nc_spm_08",
    "stac_version": "1.0.0",
    "description": "GRASS GIS Sample Datasets",
    "links": [
      {
        "rel": "root",
        "href": "https://example.com/grass_catalog/catalog.json",
        "type": "application/json"
      },
      {
        "rel": "child",
        "href": "https://example.com/grass_catalog/nc_spm_08/PERMANENT/collection.json",
        "type": "application/json",
        "title": "PERMANENT"
      },
      {
        "rel": "parent",
        "href": "https://example.com/grass_catalog/catalog.json",
        "type": "application/json"
      }
    ],
    "stac_extensions": [
      "https://stac-extensions.github.io/projection/v1.0.0/schema.json",
      "https://stac-extensions.github.io/scientific/v1.0.0/schema.json"
    ],
    "grass:type": "location",
    "proj:epsg": 3358,
    "sci:citation": "GRASS Development Team, 2022. Geographic Resources Analysis Support System (GRASS) Software, Version 8.0. Open Source Geospatial Foundation. https://grass.osgeo.org",
    "title": "nc_spm_08",
    "extent": {
      "bbox": [
        [
          -84.33,
          33.83,
          -75.38,
          36.59
        ]
      ]
    },
    "license": "GNU General Public License (GPL)",
    "keywords": [
      "GRASS GIS",
      "Location"
    ]
}

PERMANENT mapset: STAC Collection

{
    "type": "Collection",
    "id": "PERMANENT",
    "stac_version": "1.0.0",
    "description": "default mapset",
    "links": [
      {
        "rel": "root",
        "href": "https://example.com/grass_catalog/catalog.json",
        "type": "application/json"
      },
      {
        "rel": "item",
        "href": "https://example.com/grass_catalog/nc_spm_08/PERMANENT/elevation/elevation.json",
        "type": "application/json"
      },
      {
        "rel": "parent",
        "href": "https://example.com/grass_catalog/nc_spm_08/collection.json",
        "type": "application/json",
        "title": "nc_spm_08"
      }
    ],
    "stac_extensions": [
      "https://stac-extensions.github.io/projection/v1.0.0/schema.json",
      "https://stac-extensions.github.io/scientific/v1.0.0/schema.json"
    ],
    "grass:type": "mapset",
    "proj:epsg": 3358,
    "sci:citation": "GRASS Development Team, 2022. Geographic Resources Analysis Support System (GRASS) Software, Version 8.0. Open Source Geospatial Foundation. https://grass.osgeo.org",
    "title": "PERMANENT",
    "extent": {
      "bbox": [
        [
          -84.33,
          33.83,
          -75.38,
          36.59
        ]
      ]
    },
    "license": "GNU General Public License (GPL)",
    "keywords": [
      "GRASS GIS",
      "mapset",
      "PERMANENT"
    ]
}

Raster data: STAC Item

{
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "elevation",
    "properties": {
      "title": "\"South-West Wake county: Elevation NED 10m\"",
      "description": "\"generated by r.proj\"",
      "proj:epsg": 3358,
      "grass:datatype": "FCELL",
      "grass:comments": "\"r.proj input=\"ned03arcsec\" location=\"northcarolina_latlong\" mapset=\"\\helena\" output=\"elev_ned10m\" method=\"cubic\" resolution=10\"",
      "grass:creator": "\"helena\"",
      "grass:ewres": "10",
      "grass:nsres": "10",
      "grass:cols": "1500",
      "grass:location": "nc_spm_08",
      "grass:mapset": "PERMANENT",
      "grass:map": "elevation",
      "grass:maptype": "raster",
      "grass:min": "55.57879",
      "grass:max": "156.3299",
      "grass:ncats": "255",
      "grass:semantic_label": "\"none\"",
      "grass:source1": "\"\"",
      "grass:source2": "\"\"",
      "datetime": "2006-07-11T01:09:51Z"
    },
    "geometry": {
      "type": "Polygon",
      "coordinates": [
        [
          [
            645000.0,
            215000.0
          ],
          [
            645000.0,
            228500.0
          ],
          [
            630000.0,
            228500.0
          ],
          [
            630000.0,
            215000.0
          ],
          [
            645000.0,
            215000.0
          ]
        ]
      ]
    },
    "links": [
      {
        "rel": "root",
        "href": "https://example.com/grass_catalog/catalog.json",
        "type": "application/json"
      },
      {
        "rel": "collection",
        "href": "https://example.com/grass_catalog/nc_spm_08/PERMANENT/collection.json",
        "type": "application/json",
        "title": "PERMANENT"
      },
      {
        "rel": "parent",
        "href": "https://example.com/grass_catalog/nc_spm_08/PERMANENT/collection.json",
        "type": "application/json",
        "title": "PERMANENT"
      }
    ],
    "assets": {
      "raster": {
        "href": "/api/v3/locations/nc_spm_08/mapsets/PERMANENT/raster_layers/elevation",
        "type": "image/tiff; application=geotiff; profile=cloud-optimized",
        "title": "\"South-West Wake county: Elevation NED 10m\"",
        "roles": [
          "data"
        ]
      },
      "thumbnail": {
        "href": "/api/v3/locations/nc_spm_08/mapsets/PERMANENT/raster_layers/elevationrender",
        "type": "image/png",
        "title": "\"South-West Wake county: Elevation NED 10m\" Thumbnail",
        "roles": [
          "thumbnail"
        ]
      }
    },
    "bbox": [
        630000,
        215000,
        645000,
        228500
    ],
    "stac_extensions": [],
    "collection": "PERMANENT"
}
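For illustration only, the top level of the hierarchy shown above (a grassdata catalog linking to one collection per location) could be generated along these lines. This is a sketch, not the extension's implementation: the helper names `make_link` and `build_catalog` are hypothetical, and the base URL is the placeholder used in the examples.

```python
import json


def make_link(rel: str, href: str, title: str = "") -> dict:
    """Build one STAC link object; the title is included only when given."""
    link = {"rel": rel, "href": href, "type": "application/json"}
    if title:
        link["title"] = title
    return link


def build_catalog(base_url: str, locations: list, description: str) -> dict:
    """Build a grassdata STAC Catalog with one child link per location."""
    root_href = f"{base_url}/catalog.json"
    links = [make_link("root", root_href)]
    for location in locations:
        # Each location is published as a Collection next to the catalog.
        links.append(
            make_link("child", f"{base_url}/{location}/collection.json", location)
        )
    links.append(make_link("self", root_href))
    return {
        "type": "Catalog",
        "id": "grassdata",
        "stac_version": "1.0.0",
        "description": description,
        "links": links,
        "stac_extensions": [],
    }


catalog = build_catalog(
    "https://example.com/grass_catalog",
    ["nc_spm_08"],
    "GRASS GIS STAC data catalog used by OpenPlains.",
)
print(json.dumps(catalog, indent=2))
```

The same link helper could be reused for the location and mapset collections, swapping in "parent" and "item" relations.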

cwhite911 avatar Nov 12 '22 01:11 cwhite911

@cwhite911 do you have any updates on this?

This module could be really useful

lucadelu avatar Apr 27 '23 12:04 lucadelu

@cwhite911 any plans to continue with this soon(ish)? Now that ESA will change data delivery, having access via STAC would be really relevant for many

veroandreo avatar Jun 05 '23 10:06 veroandreo

Please preserve your shell and Git history. I want to see what happened to change the instructions accordingly.

wenzeslaus avatar Sep 19 '23 01:09 wenzeslaus

Really good to see this moving forward, especially with: https://dataspace.copernicus.eu/ providing Sentinel data in STAC format...

ninsbl avatar Sep 19 '23 13:09 ninsbl

This PR needs a big rebase...

neteler avatar Sep 19 '23 15:09 neteler

This PR needs a big rebase...

Fixed the issue

cwhite911 avatar Sep 19 '23 17:09 cwhite911

Now, I finally understood (thanks @cwhite911!) how this thing with many commits happens. You have an outdated branch. You decide to update it to the base branch (here grass8) by rebase. You do that. Then you push. You get a message that the remote branch and local branch have diverged and that the push is not possible. The suggestion is to update your local branch. You decide to update the local branch from the remote one by rebase. And that's where the mess happens. All the perfectly fine commits from the grass8 branch which are now on your local branch get removed and then re-applied on top of the latest commit on the remote branch. Then you happily push and then see all these extra commits duplicating changes already on the base branch.

The right operation after rebasing the local branch to the base branch is to force push. You changed all the commits on the local branch and that's what you want on the remote one too. Force push is the right operation here because you want to replace remote branch with what you have locally.

Git does not know that's what you are doing. It sees different commit hashes and it gives you advice which would preserve all these commits.
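The sequence described above can be sketched as a small script. The remote names `upstream` and `origin`, the feature branch name, and the function name are assumptions for illustration; `grass8` is the base branch named above. The sketch uses `--force-with-lease`, a more cautious variant of a plain force push that refuses to overwrite remote commits you have not fetched.

```python
import subprocess


def rebase_and_force_push(
    feature_branch: str,
    base_remote: str = "upstream",
    base_branch: str = "grass8",
    push_remote: str = "origin",
    dry_run: bool = True,
) -> list:
    """Return (and optionally run) the commands for rebase plus force push."""
    commands = [
        ["git", "fetch", base_remote],
        ["git", "checkout", feature_branch],
        # Rewrites the local commits on top of the updated base branch.
        ["git", "rebase", f"{base_remote}/{base_branch}"],
        # Replace the remote branch instead of rebasing or pulling again.
        ["git", "push", "--force-with-lease", push_remote, feature_branch],
    ]
    if not dry_run:
        for command in commands:
            subprocess.run(command, check=True)
    return commands


for command in rebase_and_force_push("my-feature"):
    print(" ".join(command))
```

With `dry_run=True` (the default here) nothing is executed, so the commands can be reviewed first.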

It is worth noting that, unlike merge, rebase changes the commit hashes. So, even the same change after a rebase, has a different hash, so it looks like a different commit to Git.
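A simplified sketch of why this happens, assuming a repository that uses SHA-1 object names: the commit hash is computed over the commit object's content, and that content includes the parent hash. Moving a commit onto a new parent therefore gives it a new hash even if nothing else changes. The tree and parent hashes, author line, and message below are made-up placeholders.

```python
import hashlib


def commit_hash(tree: str, parent: str, author: str, message: str) -> str:
    """Hash a minimal Git commit object (one parent, author = committer)."""
    body = (
        f"tree {tree}\n"
        f"parent {parent}\n"
        f"author {author}\n"
        f"committer {author}\n"
        f"\n{message}\n"
    ).encode()
    # Git prefixes the content with the object type and its size in bytes.
    header = f"commit {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


author = "Jane Doe <jane@example.com> 1695146400 +0000"
tree = "a" * 40

before = commit_hash(tree, "1" * 40, author, "Add STAC import")
after = commit_hash(tree, "2" * 40, author, "Add STAC import")

# Same tree, author, and message; only the parent differs.
print(before != after)
```

In a real rebase the tree and committer timestamp usually change as well, which changes the hash for the same reason.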

In light of Git giving the "wrong advice" in the rebase workflow, it might be a good idea to use merge in all contributor workflows. I mean to have it in the instructions. There is no reason not to use rebase in general. One issue with this is terminology: merge can be misleading as it is also used in "merge a PR".

Sorry for the off-topic post here, I'll follow up on this in a new issue or PR.

wenzeslaus avatar Sep 19 '23 18:09 wenzeslaus

@cwhite911 you know you can go to the "files changed" tab in the PR and add multiple changes in a batch for a single commit. (If it was suggestions in the review)

echoix avatar Jul 16 '24 16:07 echoix

@cwhite911 you know you can go to the "files changed" tab in the PR and add multiple changes in a batch for a single commit. (If it was suggestions in the review)

I didn't notice the batch command for accepting multiple suggestions, but I will use it next time so I don't overload the CI.

cwhite911 avatar Jul 16 '24 18:07 cwhite911

@cwhite911 you know you can go to the "files changed" tab in the PR and add multiple changes in a batch for a single commit. (If it was suggestions in the review)

I didn't notice the batch command for accepting multiple suggestions, but I will use it next time so I don't overload the CI.

CI gets cancelled anyways, but it's maybe the emails of subscribed people, or just scrolling through commits

echoix avatar Jul 16 '24 19:07 echoix