h3
h3 copied to clipboard
Support different polyfill mode
Currently, the polyfill function only includes hexagons whose center is within the polygon. Would be good to support different polyfill needs:
- center within the input polygon
- input polygon fully contained
- only hexagons fully contained in the input polygon
input polygon fully contained
+1 this issue. Wanted to do something like ^ today. But, later realized about the below line from the documentation. Looked around for any related issue and found this one to be appropriate.
Ref: https://github.com/uber/h3-js#module_h3.polyfill
Get all hexagons with centers contained in a given polygon.

Had an interesting case at work today:
- There are two clusters, colored green and blue.
- Now, the task was to cover both these clusters with
hex8
polygons - It so happened that there was this
hex8
who's center was in the space between the two clusters 😬

Well, those two polygons do overlap. Polygons meant to be adjacent to each other should be using the exact same points on the shared sides, that's the only way the promises of this current polyfill algorithm will be kept.
For option (2):
- input polygon fully contained
I've put together some code to do this (introduces dependency to Shapely):
import json
import h3
from shapely import geometry
from shapely.geometry import Point, MultiPoint, LineString, MultiLineString, Polygon, MultiPolygon
from pyspark.sql import functions as F
def _cover_point_h3(point: Point, resolution: int):
'''
Return the set of H3 cells at the specified resolution which completely cover the input point.
'''
result_set = set()
# Hexes for point
result_set.update(h3.geo_to_h3(t[1], t[0], resolution) for t in list(point.coords))
return result_set
def _cover_line_h3(line: LineString, resolution: int):
'''
Return the set of H3 cells at the specified resolution which completely cover the input line.
'''
result_set = set()
# Hexes for endpoints
endpoint_hexes = [h3.geo_to_h3(t[1], t[0], resolution) for t in list(line.coords)]
# Hexes for line (inclusive of endpoints)
for i in range(len(endpoint_hexes)-1):
result_set.update(h3.h3_line(endpoint_hexes[i], endpoint_hexes[i+1]))
return result_set
def _cover_polygon_h3(polygon: Polygon, resolution: int):
'''
Return the set of H3 cells at the specified resolution which completely cover the input polygon.
'''
result_set = set()
# Hexes for vertices
vertex_hexes = [h3.geo_to_h3(t[1], t[0], resolution) for t in list(polygon.exterior.coords)]
# Hexes for edges (inclusive of vertices)
for i in range(len(vertex_hexes)-1):
result_set.update(h3.h3_line(vertex_hexes[i], vertex_hexes[i+1]))
# Hexes for internal area
result_set.update(list(h3.polyfill(geometry.mapping(polygon), resolution, geo_json_conformant=True)))
return result_set
def _cover_shape_h3(shape: geometry.BaseGeometry, resolution: int):
'''
Return the set of H3 cells at the specified resolution which completely cover the input shape.
NOTE: This behavior differs significantly from the default Polyfill behavior.
This function is aimed at *accuracy* – that is, to never miss a pair of potential matches
when doing operations such as a spatial join. If you are only looking for approximate matches,
the default set of functions (h3_to_geo / polyfill) will be far more efficient.
https://h3geo.org/docs/api/regions/#polyfill
'''
result_set = set()
try:
if isinstance(shape, Point):
result_set = result_set.union(_cover_point_h3(shape, resolution)) # noqa
elif isinstance(shape, LineString):
result_set = result_set.union(_cover_line_h3(shape, resolution)) # noqa
elif isinstance(shape, Polygon):
result_set = result_set.union(_cover_polygon_h3(shape, resolution)) # noqa
elif isinstance(shape, MultiPoint) or isinstance(shape, MultiLineString) or isinstance(shape, MultiPolygon):
result_set = result_set.union(*[
_cover_shape_h3(json.dumps(geometry.mapping(s)), resolution) for s in shape.geoms
])
else:
raise ValueError(f"Unsupported geometry_type {shape.geom_type}")
except Exception as e:
raise ValueError(f"Error finding indices for geometry {json.dumps(geometry.mapping(shape))}", repr(e))
return list(result_set)
If polyfill
is invoked with they hypothetical mode=2
arg, we could call _cover_shape_h3
instead of the default polyfill behavior.
This is what we are using which takes a slightly different approach - buffer the polygon by the edge length of a hex at it's centroid and desired resolution so that a hex centre will always be inside the polygon - ie, you get a covering at any resolution regardless of the size of the polygon. Then polyfill as normal. You can end up with some false positives this way but that is ok in our use and can easily be pruned if absolutely required.
This is slightly changed from our code but should work and at least shows the idea.
import statistics
import math
from h3 import h3
import pygeos
from operator import or_
from functools import reduce
def covering_cells_from_geometry(geometry, res):
"""
:param geometry: pygeos geometry
:param res: desired h3 resolution
:return: set of h3 indexes
"""
buffer_width = statistics.mean(
[
h3.exact_edge_length(e, unit="rads")
for e in h3.get_h3_unidirectional_edges_from_hexagon(
h3.geo_to_h3(
*pygeos.get_coordinates(pygeos.centroid(geometry))[0][::-1],
resolution=res
)
)
]
) * (180 / math.pi)
buffered_geometry_parts = pygeos.get_parts(
pygeos.buffer(geometry, buffer_width, join_style="mitre")
)
return reduce(
or_,
[
h3.polyfill_polygon(
pygeos.coordinates.get_coordinates(p),
res=res,
lnglat_order=True,
)
for p in buffered_geometry_parts
] or [set()],
)
That won't work with every input geometry type as it stands but the approach works as long as you can get to the 'base' geometry arrays of coordinates.
I believe @dfellis was looking at some alternatives as well and definitely has a better grasp of the math(s) (as appropriate for Australia or US ;-) involved than me but this has worked well for our particular use case so far.
The new modes (overlapping and full containment) are added in https://github.com/uber/h3/pull/796