geoparquet icon indicating copy to clipboard operation
geoparquet copied to clipboard

Correct the 'bbox' description for 3D geometries

Open jorisvandenbossche opened this issue 2 years ago • 11 comments

In https://github.com/opengeospatial/geoparquet/pull/45 we explicitly allowed 3D geometries, but in that PR I forgot to update the bbox description to reflect this.

jorisvandenbossche avatar Apr 28 '22 12:04 jorisvandenbossche

@Jesus89 do you know how the json schema needs to be updated for this? (now it harcoded 4 items, but it needs to be 4 or 6)

jorisvandenbossche avatar Apr 28 '22 12:04 jorisvandenbossche

I think we can use oneOf and allow both options.

If at some point, we include an explicit param for 3D, the If-Then-Else condition could be used too.

Jesus89 avatar Apr 28 '22 14:04 Jesus89

A proposal:

{
  "bbox": {
    "type": "array",
    "description": "Bounding Box of the geometries in the file, formatted according to RFC 7946, section 5.",
    "oneOf": [
      {
        "items": [
          {
            "type": "number",
            "description": "The westmost constant longitude line that bounds the rectangle (xmin)."
          },
          {
            "type": "number",
            "description": "The minimum constant latitude line that bounds the rectangle (ymin)."
          },
          {
            "type": "number",
            "description": "The eastmost constant longitude line that bounds the rectangle (xmax)."
          },
          {
            "type": "number",
            "description": "The maximum constant latitude line that bounds the rectangle (ymax)."
          }
        ],
        "minItems": 4,
        "additionalItems": false
      },
      {
        "items": [
          {
            "type": "number",
            "description": "The westmost constant longitude line that bounds the parallelepiped (xmin)."
          },
          {
            "type": "number",
            "description": "The minimum constant latitude line that bounds the parallelepiped (ymin)."
          },
          {
            "type": "number",
            "description": "The minimum constant altitude line that bounds the parallelepiped (zmin)."
          },
          {
            "type": "number",
            "description": "The eastmost constant longitude line that bounds the parallelepiped (xmax)."
          },
          {
            "type": "number",
            "description": "The maximum constant latitude line that bounds the parallelepiped (ymax)."
          },
          {
            "type": "number",
            "description": "The maximum constant altitude line that bounds the parallelepiped (zmax)."
          }
        ],
        "minItems": 6,
        "additionalItems": false
      }
    ]
  }
}

Jesus89 avatar Apr 28 '22 14:04 Jesus89

Note that we should review the items' descriptions because of the CRS.

Jesus89 avatar Apr 28 '22 15:04 Jesus89

A proposal:

~~It looks like you're trying to use items as a tuple. Does that work in draft 7?~~ Ah I see, your usage is correct. It changed from items to prefixItems in draft 2019-09 https://json-schema.org/understanding-json-schema/reference/array.html#tuple-validation

Given that we split up the 2D and 3D case in the schema, it probably wouldn't hurt to set maxItems: 4 and maxItems: 6 respectively? Specifically I wonder, if you had [0, 0, 0, 0, 'a', 'a'], could that match the first branch, because it has four numbers and a minimum of four elements? (This would be a good case for a future test suite 🙂 )

Note that we should review the items' descriptions because of the CRS.

+1. Also, I feel like we need to discuss or document whether this bbox is in OGC:WGS84 or in the native CRS of the data. Right now the spec says:

OPTIONAL Bounding Box of the geometries in the file, formatted according to RFC 7946, section 5.

But looking at RFC 7946, section 5, that says:

The axes order of a bbox follows the axes order of geometries.

If I interpret that correctly, if data is saved as EPSG:4326, that means the order of the bbox would be

[minLatitude, minLongitude, maxLatitude, maxLongitude]

which is contrary to the JSON schema at least. And I think most often bboxes are [minLongitude, minLatitude, maxLongitude, maxLatitude], so we should at least document what behavior we expect.

kylebarron avatar Apr 28 '22 15:04 kylebarron

Per some of the CRS related discussions, and a desire to support non CRS-aware readers, it seems important that the axis order of the bounds object remains constant regardless of CRS: [xmin, ymin, xmax, ymax] or [xmin, ymin, zmin, xmax, ymax, zmax]. So it seems the right thing to do is remove the reference to the RFC, and simply state the expected order.

brendan-ward avatar Apr 28 '22 16:04 brendan-ward

Our specification is very explicit about axis order, which is always the same regardless of the CRS you use:

https://github.com/opengeospatial/geoparquet/blob/a3c1212b59bce9ea9220897040bf38e3d6381fff/format-specs/geoparquet.md?plain=1#L124-L128

So I don't think there is any ambiguity for the bbox following "axis order of the geometries", as this is always the same.

But it would indeed be useful to clarify that the bbox is always in the same coordinate system as the data.

jorisvandenbossche avatar Apr 28 '22 16:04 jorisvandenbossche

And the reference to the GeoJSON spec is I think still useful to refer to for the handling of antimeridian, poles, etc

jorisvandenbossche avatar Apr 28 '22 16:04 jorisvandenbossche

The discussion is a bit split, but see my last comment at https://github.com/opengeospatial/geoparquet/issues/92#issuecomment-1130174010. So I am not fully sure myself this PR should be merged as is in its current state (requiring 3D bbox for 3D data).

jorisvandenbossche avatar May 24 '22 06:05 jorisvandenbossche

Ah, I was remembering that comment but then couldn't figure out where I saw it.

I think if we're not feeling sure about it then we should just hold off for now. See how 3d data in geoparquet plays out and wait for someone to make the case. I'll push out the milestone on this one.

cholmes avatar May 24 '22 14:05 cholmes

This PR creates an inconsistency, since the referenced GeoJSON specification does explicitly ask for the bbox to include all used dimensions. To prevent confusion it might be good to explicitly call out that deviation.

wichert avatar Sep 16 '22 10:09 wichert

Consensus on call 9/24 seems to be we should just align with GeoJSON and do the same thing they do. @jorisvandenbossche to update the schema.

cholmes avatar Oct 24 '22 16:10 cholmes

I think the schema can be greatly simplified as the descriptions are just annotations anyway and are not used for validation. I assume 2D and 3D is allowed. So something like:

{
	"description": "Bounding Box of the geometries in the file, formatted according to RFC 7946, section 5.",
	"type":"array",
	"items":{
		"type":"number"
	},
	"oneOf":[
		{
			"description": "2D bbox consisting of (xmin, ymin, xmax, ymax)",
			"minItems":4,
			"maxItems":4
		},
		{
			"description": "3D bbox consisting of (xmin, ymin, zmin, xmax, ymax, zmax)",
			"minItems":6,
			"maxItems":6
		}
	]
}

m-mohr avatar Oct 25 '22 15:10 m-mohr

GeoJSON does not require that geometries have only 2 or 3 dimensions. It recommends avoiding more than 3, but doesn't forbid it (see the "SHOULD NOT" in Section 3.1.1.).

In addition, the bbox member is required to have the range of values for all dimensions (not just the first 2 or 3). Is it intentional that GeoParquet adds an additional restriction on top of this (allowing only 4 or 6 item bboxes)?

Update: suggested change in #144

tschaub avatar Nov 18 '22 18:11 tschaub

In addition, the bbox member is required to have the range of values for all dimensions (not just the first 2 or 3). Is it intentional that GeoParquet adds an additional restriction on top of this (allowing only 4 or 6 item bboxes)?

We only allow 2D or 3D geometries, at the moment (although that is a bit hidden in the description of the "encoding" keyword):

https://github.com/opengeospatial/geoparquet/blob/085bb6f9fb50b31e38b4d711d867c5366598108f/format-specs/geoparquet.md?plain=1#L113

So that means in practice you can only have 4 or 6 values for the bbox?

jorisvandenbossche avatar Nov 18 '22 20:11 jorisvandenbossche

We only allow 2D or 3D geometries, at the moment (although that is a bit hidden in the description of the "encoding" keyword)

I completely missed that.

tschaub avatar Nov 18 '22 21:11 tschaub