h3 icon indicating copy to clipboard operation
h3 copied to clipboard

Max edge length in meters

Open mhconradt opened this issue 3 years ago • 6 comments
trafficstars

I'm considering using H3 polyfill to find some data within a region (Polygon). However, it's necessary to include all of the data, not just H3 indexes with centroids in the region. First and foremost, is there a recommended way of doing this? My plan is to buffer the region by the maximum edge length (in meters/km) at the H3 resolution used in my dataset to ensure all cells that could include data within that region are included. To accomplish this, I would need to know the maximum edge length at the resolution. It would be best to have a function that computes this at a given resolution, like this in S2. It would also be useful to publish this along with the other cell statistics. So far I've taken a Monte Carlo 🎲 approach (generating indexes random latitude and longitude and finding the maximum distance between the index and its neighbors, then finding the maximum of that over millions of random points) and it seems the max length is stable around 31.596 meters for resolution 11, and 83.595 for resolution 10. I'm enjoying the fact that 83.595 / 31.596 is almost identical to the square root of 7.

mhconradt avatar May 25 '22 16:05 mhconradt

We do not expose max edge length in the library (though perhaps we should), but you can see related work here: https://observablehq.com/@nrabinowitz/h3-area-stats

The min/max values here are based on knowledge of the grid projection on each face of the icosahedron - the max is found in the very center of the face, the min in the corners. This yields 0.031596 km for res 11, and 0.083595 km for res 10, so I think you've just offered an observational proof of this approach 😁. Note that these numbers assume a spherical model of the earth, and could be slightly off for a true geoid.

As far as approaches go, the two basic approaches are to either expand your polygon, or expand your list of candidate cells. If you don't care about false positives, your approach should work, but I think in some cases it could pull in cells that do not strictly intersect the polygon (especially as there are different approaches to buffering). Another option is cell-based:

  • Polyfill the shape. Call the output polyfillSet
  • Trace all edges of the shape by sampling each line segment and finding the cells for each sample point. Call the output edgeSet.
  • Add the immediate neighbors of all cells in edgeSet to edgeSet, using kRing(cell, 1)
  • Test all cells in edgeSet, including neighbors, for intersection with the original polygon (usually leveraging a 3rd-party geometry library). Add those cells that intersect to polyfillSet.

One nice thing about this approach is that you can adjust the inclusion criteria according to your use case, e.g. to include all cells with > 50% area in the polygon.

nrabinowitz avatar May 25 '22 21:05 nrabinowitz

@nrabinowitz The algorithm you describe won't work if the original polygon has regions that are too small/narrow to contain any cell centroids.

warnes avatar Jul 20 '23 16:07 warnes

@nrabinowitz The algorithm you describe won't work if the original polygon has regions that are too small/narrow to contain any cell centroids.

I don't think that's true? Tracing the edges by sampling ensures that you have cells in every region of the polygon, assuming your sampling is frequent enough.

nrabinowitz avatar Jul 21 '23 19:07 nrabinowitz

Sorry @nrabinowitz, I misread your algorithm. I had assumed you traced the edges of polyfillSet rather than tracing the edges of shape.

warnes avatar Jul 21 '23 19:07 warnes

I suppose it would take some experimentation to determine how fine to sample the points...

warnes avatar Jul 21 '23 19:07 warnes

I suppose it would take some experimentation to determine how fine to sample the points...

Generally I've used the edge length at res. We provide a function to return the average edge length, but as edge length varies across the globe it's somewhat better to take one or more cells at polygons, use originToDirectedEdges to get the edges of these cells, use edgeLength to find the lengths of these edges, then take the smallest length as your sample distance.

nrabinowitz avatar Jul 21 '23 20:07 nrabinowitz