geozarr-spec icon indicating copy to clipboard operation
geozarr-spec copied to clipboard

Define minimum requirements for qualifying a GeoZarr dataset

Open emmanuelmathot opened this issue 4 months ago • 2 comments

Background

During recent discussions, it was identified that the GeoZarr specification lacks clear definition of minimum requirements that qualify a dataset as a GeoZarr dataset. While the unified data model section provides a comprehensive framework, we need to establish a baseline for what constitutes a valid GeoZarr dataset.

Current State

From the meeting discussion:

  • A GeoZarr dataset is currently understood as a root group containing a set of data array variables
  • Implicit understanding that it cannot be an empty group
  • No clear specification of minimum required elements

Issues to Address

  1. What are the minimal required components that make a Zarr dataset qualify as a GeoZarr dataset?
  2. Should we specify minimum metadata requirements?
  3. Are there mandatory coordinate system requirements?
  4. How do we handle partial implementations of the unified data model?

Relevant Specification Sections

The unified data model section defines datasets as having:

  • Dimensions
  • Coordinate Variables
  • Data Variables
  • Attributes

However, it doesn't specify which of these are mandatory for GeoZarr compliance.

Proposal for Discussion

Consider establishing minimum requirements such as:

  1. Mandatory Components:

    • At least one data variable
    • Associated coordinate reference system information
    • Basic spatial metadata
  2. Optional but Recommended:

    • CF Convention compliance
    • STAC metadata
    • Multi-resolution overview support
  3. Clear conformance levels:

    • Core compliance (minimum requirements)
    • Extended compliance (additional features)

Questions for Discussion

  1. What should be the absolute minimum requirements for a valid GeoZarr dataset?
  2. How do we balance flexibility vs standardization?
  3. Should we define different conformance levels?
  4. How do we handle backwards compatibility?

Next Steps

  1. Gather community feedback on minimum requirements
  2. Draft specific changes to the specification
  3. Update conformance classes accordingly
  4. Provide clear examples of minimal valid GeoZarr datasets

Please share your thoughts and suggestions on these points.

cc @briannapagan, @maxrjones, @d-v-b, @negin513, @jbusecke, @vincentsarago

emmanuelmathot avatar Aug 22 '25 06:08 emmanuelmathot

Equivalent to 1) above:

a) compact representation of the xy coordinates (i.e. bbox, or affine transform, or their "zero" equivalents 0,ncol,0,nrow)

b) crs of that xy space (if describe-able, as WKT, AUTH:code, or proj(json))

/end

mdsumner avatar Aug 26 '25 05:08 mdsumner

Seems that this has been already discussed extensively here too: https://github.com/zarr-developers/geozarr-spec/issues/20

emmanuelmathot avatar Sep 01 '25 21:09 emmanuelmathot