
Do we still recommend including only search-relevant fields?

Open philvarner opened this issue 7 months ago • 2 comments

item-spec.md states:

Providers should include metadata fields that are relevant for users of STAC, but it is recommended to [select only those necessary for search](https://github.com/radiantearth/stac-spec/blob/master/best-practices.md#field-selection-and-metadata-linking).

Is this still what we recommend? I don't think it is, and in every implementation I've ever done, we've included far more metadata than we intend to search on.

philvarner, Apr 23 '25 13:04

Good question. I do like the recommendation to only include fields that are useful (especially for self-describing file formats where the metadata already exists within the dataset itself). I think that the main uses for item-level metadata are:

  • fields that are useful for search
  • fields that might not be included within the data file itself (I'm thinking about provenance for instance or Provider)
  • fields that enable exploration of the dataset before opening it (maybe units or something), though I'm not sure how useful this one is
  • fields that explain how to access the data (for instance the xarray extension)

It feels like if you know enough about the dimensions, there might also be a way to do lazy concatenation or mosaicking from the metadata alone, so that you don't need to access the dataset at all when constructing data cubes, but I'm not sure if that is realistic right now.

jsignell, Apr 23 '25 13:04

Good question. I think we should keep something along these lines, so that we have something to point people to when they just dump everything they have into their STAC-like JSONs without properly converting it to STAC. In practice we do more than search nowadays, but I still encourage people to provide the Item Properties in a form that is easily searchable, and to restrict them to anything search-related plus whatever is needed for the most common use cases, rather than catering to every niche use case...
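
To make that concrete, here is a rough sketch (not taken from the spec) of how a converter could keep only an allowlisted set of fields in the Item properties and point to the complete source metadata through an asset with the `metadata` role, in the spirit of the linked best-practices section. The allowlist contents, the function names `split_properties` and `build_item`, and the `application/xml` media type are all assumptions for illustration; which fields count as search-relevant is a per-catalog decision.

```python
from typing import Any

# Hypothetical allowlist: the fields this catalog treats as search-relevant
# or needed for its most common use cases.
SEARCH_FIELDS = {"datetime", "platform", "gsd", "eo:cloud_cover"}


def split_properties(
    raw: dict[str, Any], keep: set[str] = SEARCH_FIELDS
) -> tuple[dict[str, Any], dict[str, Any]]:
    """Split source metadata into kept Item properties and everything else."""
    properties = {k: v for k, v in raw.items() if k in keep}
    remainder = {k: v for k, v in raw.items() if k not in keep}
    return properties, remainder


def build_item(
    item_id: str,
    geometry: dict[str, Any],
    bbox: list[float],
    raw: dict[str, Any],
    metadata_href: str,
) -> dict[str, Any]:
    """Build a minimal STAC Item dict whose properties hold only allowlisted fields.

    The full source metadata is not copied in; it is referenced through an
    asset with the "metadata" role (the media type here is an assumption).
    """
    properties, _ = split_properties(raw)
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "geometry": geometry,
        "bbox": bbox,
        "properties": properties,
        "links": [],
        "assets": {
            "metadata": {
                "href": metadata_href,
                "type": "application/xml",
                "roles": ["metadata"],
            }
        },
    }
```

Whether the remainder is linked, moved to the Collection, or dropped entirely is a design choice; the sketch only shows the split itself.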

m-mohr, Apr 24 '25 21:04