stac-spec

Mosaic / composite item type

Open mojodna opened this issue 5 years ago • 10 comments

Grouped images should be representable within a STAC Catalog. These may range from multiple parts of a DigitalGlobe strip (multiple assets, but logically and effectively treated as a single asset alongside additional (metadata, thumbnail, etc.) assets) to a curated collection of items that an entity wishes to share.

Properties of an element within such a collection of assets:

  • layer index
  • range of valid resolutions (i.e. zooms) for an individual asset
  • asset validity footprint (potentially represented using quad keys corresponding to Web Mercator tiles at a resolution greater than that of the image)
  • resampling method to use when necessary
  • whether to stretch values to shift into a visible range
  • min/max values for each band when stretching
  • custom NODATA value
  • band selections

In the case of components of a DG strip, this would be included in the list of assets (ideally with some indication that it should be used in preference to individual components, perhaps using q values: "Quality factors allow the user or user agent to indicate the relative degree of preference for that media-range, using the qvalue scale from 0 to 1 (section 3.9). The default value is q=1.")

A curated collection of items potentially equates to a STAC Item in terms of usage, so this is something to reconcile.

Many of these things provide hints for display purposes and aren't necessarily descriptors for the item itself, so that also merits consideration.

The combination of resolution ranges and quad keys allows a tiler to know when a given asset should be included (and in what order) in a composite image.
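A minimal sketch of how a tiler might use resolution ranges and quadkeys to pick and order assets for one tile. The `MosaicAsset` class and its field names (`layer_index`, `min_zoom`, `max_zoom`, `quadkeys`) are hypothetical and not part of any STAC specification; the sketch also assumes that a lower layer index is drawn first.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class MosaicAsset:
    href: str
    layer_index: int           # hypothetical: drawing order, lower is drawn first
    min_zoom: int              # lowest zoom at which the asset is valid
    max_zoom: int              # highest zoom at which the asset is valid
    quadkeys: frozenset[str]   # validity footprint as Web Mercator quadkeys


def covers(asset: MosaicAsset, tile_quadkey: str) -> bool:
    """True if a footprint quadkey contains the tile or lies inside it."""
    return any(
        tile_quadkey.startswith(qk) or qk.startswith(tile_quadkey)
        for qk in asset.quadkeys
    )


def assets_for_tile(assets: list[MosaicAsset], tile_quadkey: str) -> list[MosaicAsset]:
    """Select assets valid at the tile's zoom and footprint, in drawing order."""
    zoom = len(tile_quadkey)  # a quadkey has one digit per zoom level
    hits = [
        a for a in assets
        if a.min_zoom <= zoom <= a.max_zoom and covers(a, tile_quadkey)
    ]
    return sorted(hits, key=lambda a: a.layer_index)


assets = [
    MosaicAsset("a.tif", 1, 0, 12, frozenset({"0231"})),
    MosaicAsset("b.tif", 0, 10, 14, frozenset({"02310"})),
    MosaicAsset("c.tif", 2, 0, 8, frozenset({"0231"})),
]
selected = assets_for_tile(assets, "0231001230")  # a zoom-10 tile
print([a.href for a in selected])
```

Here `c.tif` is skipped because zoom 10 is outside its valid range, and the remaining assets are ordered by layer index.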

We envision this as a small-ish JSON file that is HTTP-accessible and can be used in place of a COG URL (or POSTed) with something like tiles.rdnt.io.

/cc @sharkinsspatial

mojodna avatar Aug 14 '18 05:08 mojodna

@mojodna Most of these fields look to me like they could apply to any EO data - for example NODATA and gain/offset values. This is actually a problem right now when dealing with Landsat data, since each band has different gains/offsets to get to TOA' (note: not true TOA, but TOA without sun angle correction), and that info is only available through the MTL metadata file, not in the STAC item.

matthewhanson avatar Aug 14 '18 11:08 matthewhanson

Yes, none of these are mosaic-specific. We will add them to the raster extension(s) of the Dataset spec.

@mojodna : What's a sample use case for quality factors?

@matthewhanson : For reference, in EE almost all properties from the MTL file are stored in each Landsat asset's metadata, and thus TOA can be computed on the fly. In Collection 1 average sun angle can be taken into account: https://landsat.usgs.gov/using-usgs-landsat-8-product

However, it's not clear if storing them in the STAC catalog is necessary if STAC is intended just for listing/retrieving/visualizing assets and not for providing input for computations.

simonff avatar Aug 20 '18 04:08 simonff

@simonff The problem we have run into is that, in the case of Landsat, you need to apply the gains and offsets in order to visualize it. It is easier if those gains and offsets are in the STAC record rather than in the data file, because otherwise you need to read the metadata from the header of the files, which is more overhead (we are reading just windowed pieces of the files remotely from S3, so the overhead of reading additional metadata is not small).
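As a rough sketch of what this could look like, the snippet below applies a per-band gain and offset taken from an item's properties to raw digital numbers. The property name `hypothetical:bands` and the values are illustrative assumptions, not fields defined by STAC or values read from a real scene.

```python
def to_toa_prime(dn_values, gain, offset, nodata=0):
    """Scale raw digital numbers to TOA' reflectance; NODATA pixels become None."""
    return [None if dn == nodata else gain * dn + offset for dn in dn_values]


# Hypothetical item fragment: per-band gain/offset stored alongside the item,
# so no extra read of the file header or MTL file is needed.
item = {
    "properties": {
        "hypothetical:bands": {
            "B4": {"gain": 2.0e-5, "offset": -0.1},  # illustrative values
        }
    }
}

band = item["properties"]["hypothetical:bands"]["B4"]
window = [0, 10000, 20000]  # digital numbers from a windowed read
print(to_toa_prime(window, band["gain"], band["offset"]))
```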

In the case of sun angle for Landsat there are two problems. 1 - Handling Landsat as a special case means writing processing code just for Landsat, whereas if the data were already in TOA reflectance (or surface reflectance) you could use the same processing code as for Sentinel and other sensors. We're currently working with USGS and pushing for them to distribute it as such, because right now many people are using Landsat data incorrectly by not correcting it. 2 - While you can use the average sun angle (i.e. the scene center angle), it is not ideal when you visualize two adjacent rows in the same path: you will see an artifact at the scene border. The sun angle really should be calculated per pixel and applied as an array.
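A small sketch of the per-pixel correction, assuming the commonly used form in which TOA' is divided by the sine of the local sun elevation angle. The function name and the nested-list layout are assumptions for illustration; a real implementation would more likely operate on arrays.

```python
import math


def correct_sun_angle(toa_prime, sun_elevation_deg):
    """Divide each TOA' pixel by sin(sun elevation) for that same pixel.

    Both arguments are 2D lists (rows of pixels) with identical shapes.
    """
    corrected = []
    for refl_row, elev_row in zip(toa_prime, sun_elevation_deg):
        corrected.append(
            [r / math.sin(math.radians(e)) for r, e in zip(refl_row, elev_row)]
        )
    return corrected


toa_prime = [[0.1, 0.1]]
per_pixel_elevation = [[30.0, 90.0]]  # degrees, one value per pixel
print(correct_sun_angle(toa_prime, per_pixel_elevation))
```

Using a single scene-center angle would amount to passing the same elevation for every pixel, which is where the visible step between adjacent scenes comes from.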

matthewhanson avatar Aug 20 '18 08:08 matthewhanson

@simonff Also, while some of these fields do apply to the dataset as a whole, some of them (such as gain/offset) would be per Item as they can change across scenes.

matthewhanson avatar Aug 20 '18 08:08 matthewhanson

👋 @mojodna @matthewhanson I'd love to see this moving.

About the proposed properties, IMO (and for my use cases) the most important is to have the zoom range and the quadkey coverage for each item.

A few comments about the proposed items:

  • layer index

Not sure what it means

  • range of valid resolutions (i.e. zooms) for an individual asset

👍 (or resolution + number of overviews, if present)

  • asset validity footprint (potentially represented using quad keys corresponding to Web Mercator tiles at a resolution greater than that of the image)

👍 Quadkey is perfect. Having the full list of quadkeys might be a bit expensive (in processing/storage/response), so maybe just the list of quadkeys at the lowest resolution.

  • resampling method to use when necessary

😐 I see this as optional and implementation-specific IMO.

  • whether to stretch values to shift into a visible range

😐 I see this as optional and implementation-specific IMO.

  • min/max values for each band when stretching

😐 I see this as optional and implementation-specific IMO.

  • custom NODATA value

😐 I see this as optional and implementation-specific IMO.

  • band selections

👍

With our recent work on COG mosaics https://medium.com/devseed/cog-talk-part-2-mosaics-bbbf474e66df we use intermediate quadkey index files to link a tile request to a COG, so having quadkey info directly in the STAC metadata would make those easier to create.
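A minimal sketch of such a quadkey index, not the actual format used in the linked work. It assumes each item lists footprint quadkeys at or above a fixed index zoom; `build_index` and `cogs_for_tile` are hypothetical helper names. The tile-to-quadkey conversion follows the standard Web Mercator quadkey scheme.

```python
from collections import defaultdict


def tile_to_quadkey(x, y, z):
    """Convert Web Mercator tile coordinates to a quadkey string."""
    digits = []
    for i in range(z, 0, -1):
        mask = 1 << (i - 1)
        digit = 0
        if x & mask:
            digit += 1
        if y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)


def build_index(footprints, index_zoom):
    """Map each quadkey at index_zoom to the set of COG hrefs covering it.

    footprints: dict of href -> iterable of quadkeys at zoom >= index_zoom.
    """
    index = defaultdict(set)
    for href, quadkeys in footprints.items():
        for qk in quadkeys:
            index[qk[:index_zoom]].add(href)  # collapse to the coarser index zoom
    return index


def cogs_for_tile(index, x, y, z, index_zoom):
    """Return the COGs to read for a tile request."""
    qk = tile_to_quadkey(x, y, z)
    if len(qk) >= index_zoom:
        return sorted(index.get(qk[:index_zoom], ()))
    # Tile is coarser than the index: gather every index cell inside it.
    return sorted({h for cell, hrefs in index.items() if cell.startswith(qk) for h in hrefs})


footprints = {"a.tif": ["2130", "2131"], "b.tif": ["2132"], "c.tif": ["0001"]}
index = build_index(footprints, index_zoom=3)
print(cogs_for_tile(index, 3, 5, 3, index_zoom=3))
```

Storing only the quadkeys at the index zoom keeps the list short, at the cost of occasionally opening a COG that does not actually intersect the requested tile.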

vincentsarago avatar Jun 05 '19 20:06 vincentsarago

Any more interest in something like this?

I'm interested in an extension that provides for a collection of GeoTIFF files that make up a mosaic dataset. Looking at the existing data model, I think the existing STAC Collection almost provides this. I would be interested in the following additional fields:

  • Thumbnail asset - providing thumbnail link for the whole mosaic. Maybe this can be extended?
  • CRS for the mosaic. Maybe adding to properties once this proj extension (#485) is accepted
  • Metadata asset for the mosaic dataset - e.g ISO XML link
  • Geometry for the footprint of the mosaic dataset

Collections already support the following fields I need:

  • description
  • keywords
  • extent
  • providers
  • licence (including link)
  • links (provide references to all the Geotiff tile items)

palmerj avatar Dec 16 '19 05:12 palmerj

I would also be interested in having a geometry for the footprint of the mosaic dataset, but understand that might not be possible with Collections not being GeoJSON features.

palmerj avatar Dec 18 '19 06:12 palmerj

Also just came across this catalog layout best practice:

Items should be stored in subdirectories of their parent catalog. This means that each item and its assets are contained in a unique subdirectory

I think that for mosaics it is best that tile items are not stored in subdirectories. I understand this practice for very large catalogues of datasets that contain one TIFF file per band.

palmerj avatar Dec 19 '19 06:12 palmerj

Anyone interested in this?

palmerj avatar Jan 10 '20 00:01 palmerj

@palmerj From previous work on other extensions, it's often a good idea to just start a draft and put it up as PR. Afterwards, we can get people to review it, asking explicitly for help from domain experts and asking for help on Twitter. Asking here may not get you enough attention.

m-mohr avatar Jan 10 '20 10:01 m-mohr

There's an extension now: https://github.com/stac-extensions/composite Please continue the discussion there.

m-mohr avatar Apr 04 '23 16:04 m-mohr