h3 icon indicating copy to clipboard operation
h3 copied to clipboard

Support different polyfill mode

Open boyxgc opened this issue 5 years ago • 5 comments

Currently, the polyfill function only includes hexagons whose center is within the polygon. Would be good to support different polyfill needs:

  1. center within the input polygon
  2. input polygon fully contained
  3. only hexagons fully contained in the input polygon

boyxgc avatar Aug 21 '19 01:08 boyxgc

input polygon fully contained

+1 this issue. Wanted to do something like ^ today. But, later realized about the below line from the documentation. Looked around for any related issue and found this one to be appropriate.

Ref: https://github.com/uber/h3-js#module_h3.polyfill

Get all hexagons with centers contained in a given polygon.

Screenshot 2019-12-26 at 11 26 10 AM

bkowshik-rapido avatar Dec 26 '19 08:12 bkowshik-rapido

Had an interesting case at work today:

  • There are two clusters, colored green and blue.
  • Now, the task was to cover both these clusters with hex8 polygons
  • It so happened that there was this hex8 who's center was in the space between the two clusters 😬
Screenshot 2020-02-10 at 9 18 26 AM

bkowshik-rapido avatar Feb 10 '20 03:02 bkowshik-rapido

Well, those two polygons do overlap. Polygons meant to be adjacent to each other should be using the exact same points on the shared sides, that's the only way the promises of this current polyfill algorithm will be kept.

dfellis avatar Feb 10 '20 17:02 dfellis

For option (2):

  1. input polygon fully contained

I've put together some code to do this (introduces dependency to Shapely):

import json
import h3
from shapely import geometry
from shapely.geometry import Point, MultiPoint, LineString, MultiLineString, Polygon, MultiPolygon
from pyspark.sql import functions as F


def _cover_point_h3(point: Point, resolution: int):
    '''
    Return the set of H3 cells at the specified resolution which completely cover the input point.
    '''
    result_set = set()

    # Hexes for point
    result_set.update(h3.geo_to_h3(t[1], t[0], resolution) for t in list(point.coords))
    return result_set


def _cover_line_h3(line: LineString, resolution: int):
    '''
    Return the set of H3 cells at the specified resolution which completely cover the input line.
    '''
    result_set = set()
    # Hexes for endpoints
    endpoint_hexes = [h3.geo_to_h3(t[1], t[0], resolution) for t in list(line.coords)]
    # Hexes for line (inclusive of endpoints)
    for i in range(len(endpoint_hexes)-1):
        result_set.update(h3.h3_line(endpoint_hexes[i], endpoint_hexes[i+1]))
    return result_set


def _cover_polygon_h3(polygon: Polygon, resolution: int):
    '''
    Return the set of H3 cells at the specified resolution which completely cover the input polygon.
    '''
    result_set = set()
    # Hexes for vertices
    vertex_hexes = [h3.geo_to_h3(t[1], t[0], resolution) for t in list(polygon.exterior.coords)]
    # Hexes for edges (inclusive of vertices)
    for i in range(len(vertex_hexes)-1):
        result_set.update(h3.h3_line(vertex_hexes[i], vertex_hexes[i+1]))
    # Hexes for internal area
    result_set.update(list(h3.polyfill(geometry.mapping(polygon), resolution, geo_json_conformant=True)))
    return result_set


def _cover_shape_h3(shape: geometry.BaseGeometry, resolution: int):
    '''
    Return the set of H3 cells at the specified resolution which completely cover the input shape.

    NOTE: This behavior differs significantly from the default Polyfill behavior.
    This function is aimed at *accuracy* – that is, to never miss a pair of potential matches
    when doing operations such as a spatial join. If you are only looking for approximate matches,
    the default set of functions (h3_to_geo / polyfill) will be far more efficient.
    https://h3geo.org/docs/api/regions/#polyfill
    '''
    result_set = set()

    try:
        if isinstance(shape, Point):
            result_set = result_set.union(_cover_point_h3(shape, resolution))  # noqa

        elif isinstance(shape, LineString):
            result_set = result_set.union(_cover_line_h3(shape, resolution))  # noqa

        elif isinstance(shape, Polygon):
            result_set = result_set.union(_cover_polygon_h3(shape, resolution))  # noqa

        elif isinstance(shape, MultiPoint) or isinstance(shape, MultiLineString) or isinstance(shape, MultiPolygon):
            result_set = result_set.union(*[
                _cover_shape_h3(json.dumps(geometry.mapping(s)), resolution) for s in shape.geoms
            ])
        else:
            raise ValueError(f"Unsupported geometry_type {shape.geom_type}")

    except Exception as e:
        raise ValueError(f"Error finding indices for geometry {json.dumps(geometry.mapping(shape))}", repr(e))

    return list(result_set)

If polyfill is invoked with they hypothetical mode=2 arg, we could call _cover_shape_h3 instead of the default polyfill behavior.

kevinschaich avatar Nov 23 '21 17:11 kevinschaich

This is what we are using which takes a slightly different approach - buffer the polygon by the edge length of a hex at it's centroid and desired resolution so that a hex centre will always be inside the polygon - ie, you get a covering at any resolution regardless of the size of the polygon. Then polyfill as normal. You can end up with some false positives this way but that is ok in our use and can easily be pruned if absolutely required.

This is slightly changed from our code but should work and at least shows the idea.

import statistics
import math
from h3 import h3
import pygeos
from operator import or_
from functools import reduce


def covering_cells_from_geometry(geometry, res):
    """ 
    :param geometry: pygeos geometry
    :param res: desired h3 resolution
    :return: set of h3 indexes
    """
    buffer_width = statistics.mean(
        [
            h3.exact_edge_length(e, unit="rads")
            for e in h3.get_h3_unidirectional_edges_from_hexagon(
                h3.geo_to_h3(
                    *pygeos.get_coordinates(pygeos.centroid(geometry))[0][::-1],
                    resolution=res
                )
            )
        ]
    ) * (180 / math.pi)

    buffered_geometry_parts = pygeos.get_parts(
        pygeos.buffer(geometry, buffer_width, join_style="mitre")
    )

    return reduce(
        or_,
        [
            h3.polyfill_polygon(
                pygeos.coordinates.get_coordinates(p),
                res=res,
                lnglat_order=True,
            )
            for p in buffered_geometry_parts
        ] or [set()],
    )

That won't work with every input geometry type as it stands but the approach works as long as you can get to the 'base' geometry arrays of coordinates.

I believe @dfellis was looking at some alternatives as well and definitely has a better grasp of the math(s) (as appropriate for Australia or US ;-) involved than me but this has worked well for our particular use case so far.

pfw avatar Nov 24 '21 00:11 pfw

The new modes (overlapping and full containment) are added in https://github.com/uber/h3/pull/796

nrabinowitz avatar Nov 27 '23 20:11 nrabinowitz