h3 icon indicating copy to clipboard operation
h3 copied to clipboard

OGC-DGGRS-API feedback. New function getGeometricSubZones

Open jsorel opened this issue 1 month ago • 8 comments

Hello,

There is an ongoing pilot project at the OGC called : AI-DGGS for Disaster Management Pilot. As part of this pilot, multiple DGGRS (H3, S2, Healpix and many more) are used handle by client and server for the new OGC DGGRS API.

During this effort we discovered a missing functionality in H3 which prevented it from being a proper DGGRS (as in the OGC definition). The ECERE and Geomatys companies developed this missing feature, which we now give to the H3 project. Authors : Jérôme Saint-Louis (ECERE) and Johann Sorel (Geomatys)

The new function is called : getGeometricSubZones(ancestor_zone, relative_depth)

It's purpose is to return all zones at a given depth that intersect a parent zone. The zone ids order is also constant, which allow to store only two values (the parent zone and the depth) instead of a long list of zone ids. Unlike the getChildren function, there is no gap on the border, and no fractal effect.

Here is a visual example :

Image

To compare the result with the existing getChildren function :

Image

Python version :

#!/usr/bin/python3
import h3
import math
import time
from multiprocessing import Pool, cpu_count

def slerp(start_lat, start_lng, end_lat, end_lng, t):
    """
    Performs spherical linear interpolation between two geographic points
    with optimizations to compute trigonometric values only once.
    Assumes a perfect sphere model for simplicity.
    """
    # Convert to radians once
    start_lat_rad = math.radians(start_lat)
    start_lng_rad = math.radians(start_lng)
    end_lat_rad = math.radians(end_lat)
    end_lng_rad = math.radians(end_lng)

    # Pre-calculate trigonometric values
    cos_start_lat = math.cos(start_lat_rad)
    sin_start_lat = math.sin(start_lat_rad)
    cos_end_lat = math.cos(end_lat_rad)
    sin_end_lat = math.sin(end_lat_rad)
    delta_lng = end_lng_rad - start_lng_rad

    # Calculate omega (angular distance)
    cos_omega = sin_start_lat * sin_end_lat + cos_start_lat * cos_end_lat * math.cos(delta_lng)

    # Clamp cos_omega to [-1, 1] due to floating point inaccuracies
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)

    # Handle the case where the two points are the same
    if omega == 0.0:
        return (start_lat, start_lng)

    # Calculate sin(omega) only once
    sin_omega = math.sin(omega)

    # Calculate the blend factors
    a = math.sin((1.0 - t) * omega) / sin_omega
    b = math.sin(t * omega) / sin_omega

    # Calculate interpolated coordinates
    x = a * cos_start_lat * math.cos(start_lng_rad) + b * cos_end_lat * math.cos(end_lng_rad)
    y = a * cos_start_lat * math.sin(start_lng_rad) + b * cos_end_lat * math.sin(end_lng_rad)
    z = a * sin_start_lat + b * sin_end_lat

    lat = math.atan2(z, math.sqrt(x*x + y*y))
    lng = math.atan2(y, x)

    return (math.degrees(lat), math.degrees(lng))

def zoneHasSubZone(ancestor_zone, candidate_subzone):
    """
    Checks if a candidate H3 index is geometrically contained within an
    ancestral H3 index using a vertex-based boundary check with SLERP.
    """
    if h3.get_resolution(candidate_subzone) <= h3.get_resolution(ancestor_zone):
        return False

    ancestor_resolution = h3.get_resolution(ancestor_zone)
    centroid_lat, centroid_lng = h3.cell_to_latlng(candidate_subzone)
    vertex_indices = h3.cell_to_vertexes(candidate_subzone)

    for vertex_index in vertex_indices:
        if vertex_index == 0: continue

        vertex_lat, vertex_lng = h3.vertex_to_latlng(vertex_index)

        interpolated_lat, interpolated_lng = slerp(
            vertex_lat, vertex_lng, centroid_lat, centroid_lng, 0.01
        )

        coarser_zone = h3.latlng_to_cell(interpolated_lat, interpolated_lng, ancestor_resolution)

        if coarser_zone == ancestor_zone:
            return True

    return False

def _process_neighbor(args):
    """
    Helper function for multiprocessing. It processes a single neighbor
    and returns its valid descendants.
    """
    ancestor_zone, neighbor, target_resolution = args
    valid_descendants = []
    descendants = h3.cell_to_children(neighbor, target_resolution)
    for descendant in descendants:
        if zoneHasSubZone(ancestor_zone, descendant):
            valid_descendants.append(descendant)
    return valid_descendants


def getGeometricSubZones(ancestor_zone, relative_depth):
    """
    Calculates all geometrically contained sub-zones within an ancestral zone
    at a specific relative depth, with parallel processing for rings.
    """
    ancestor_resolution = h3.get_resolution(ancestor_zone)
    target_resolution = ancestor_resolution + relative_depth

    if not 0 <= target_resolution <= 15:
        raise ValueError("Relative depth results in an invalid resolution.")

    # Get the centroid child
    centroid_lat, centroid_lng = h3.cell_to_latlng(ancestor_zone)
    centroid_child = h3.latlng_to_cell(centroid_lat, centroid_lng, ancestor_resolution + 1)

    candidate_cells = []

    # 1. Add all logical descendants of the centroid child (sequential)
    if centroid_child is not None:
        centroid_descendants = h3.cell_to_children(centroid_child, target_resolution)
        candidate_cells.extend(centroid_descendants)

    # 2. Get ring 1 neighbors of the centroid child
    ring1_neighbors = list(h3.grid_ring(centroid_child, 1))

    # 3. Get ring 2 neighbors of the centroid child
    ring2_neighbors = list(h3.grid_ring(centroid_child, 2))

    # Prepare arguments for parallel processing
    ring_neighbors_to_process = []
    for neighbor in ring1_neighbors:
        ring_neighbors_to_process.append((ancestor_zone, neighbor, target_resolution))
    for neighbor in ring2_neighbors:
        ring_neighbors_to_process.append((ancestor_zone, neighbor, target_resolution))

    # Parallelize processing of the rings
    # Use 'map' to maintain the order of the results
    if ring_neighbors_to_process:
        with Pool(cpu_count()) as pool:
            # The 'map' function ensures the results list has the same order
            # as the input list, preserving the ring order.
            parallel_results = pool.map(_process_neighbor, ring_neighbors_to_process)

        # Collect results in the correct order
        for result_list in parallel_results:
            candidate_cells.extend(result_list)

    return candidate_cells

# --- Expanded Test Suite ---
if __name__ == "__main__":
    relative_depths = [1, 2, 3, 4, 5, 6]

    # Select one specific pentagon from the list for testing
    pentagon_res3 = list(h3.get_pentagons(3))
    pentagon_res2 = list(h3.get_pentagons(2))

    test_cases = [
        ("Hexagonal_NYC_Res5", h3.latlng_to_cell(40.7128, -74.0060, 5)),
        ("Hexagonal_London_Res4", h3.latlng_to_cell(51.5074, -0.1278, 4)),
        ("Pentagonal_SouthPole_Res3", pentagon_res3[0]),
        ("Pentagonal_Atlantic_Res2", pentagon_res2[0]),
        ("Hexagonal_HighRes_Res10", h3.latlng_to_cell(34.0522, -118.2437, 10)),
        ("Hexagonal_Res3_at_-80_0", h3.latlng_to_cell(-80.0, 0.0, 3)),
    ]

    print("Starting comprehensive test suite...\n")
    for name, parent_h3 in test_cases:
        parent_resolution = h3.get_resolution(parent_h3)
        parent_type = "Pentagonal" if h3.is_pentagon(parent_h3) else "Hexagonal"

        print(f"--- Testing {name} ({parent_type}, Res={parent_resolution}, H3={parent_h3}) ---")

        for relative_depth in relative_depths:
            target_resolution = parent_resolution + relative_depth

            if target_resolution > 15:
                print(f"  Skipping depth {relative_depth}: Target resolution {target_resolution} exceeds max (15).")
                continue

            try:
                start_time = time.perf_counter()
                geometric_subzones = getGeometricSubZones(parent_h3, relative_depth)
                end_time = time.perf_counter()
                duration = end_time - start_time
                print(f"  Found {len(geometric_subzones)} geometric sub-zones at resolution {target_resolution} (Depth {relative_depth}). Took {duration:.4f} seconds.")
            except ValueError as e:
                print(f"  Error testing depth {relative_depth}: {e}")

        print("-" * 30 + "\n")

Java version :

/**
 * Python code provided by Jerome Saint-Louis to ensure H3 subchild results match DGGRS requirements.
 *
 *
 * @author Jerome Saint-Louis (Ecere), original code in python
 * @author Johann Sorel (Geomatys), java port
 */
public final class H3Ext {

    /**
     * Performs spherical linear interpolation between two geographic points.
     * Assumes a perfect sphere model for simplicity.
     */
    public static LatLng slerp(LatLng start, LatLng end, double ratio) {
        final double startLatRad = Math.toRadians(start.lat);
        final double startLngRad = Math.toRadians(start.lng);
        final double endLatRad   = Math.toRadians(end.lat);
        final double endLngRad   = Math.toRadians(end.lng);

        final double sinStartLatRad = Math.sin(startLatRad);
        final double cosStartLatRad = Math.cos(startLatRad);
        final double sinEndLatRad   = Math.sin(endLatRad);
        final double cosEndLatRad   = Math.cos(endLatRad);

        final double cosOmega = sinStartLatRad * sinEndLatRad
                              + cosStartLatRad * cosEndLatRad * Math.cos(endLngRad - startLngRad);
        final double omega = Math.acos(cosOmega);

        if (omega == 0.0) {
            return new LatLng(start.lat, start.lng);
        }

        final double sinOmega = Math.sin(omega);
        final double a = Math.sin((1.0 - ratio) * omega) / sinOmega;
        final double b = Math.sin(ratio * omega) / sinOmega;

        final double x = a * cosStartLatRad * Math.cos(startLngRad)
                       + b * cosEndLatRad   * Math.cos(endLngRad);
        final double y = a * cosStartLatRad * Math.sin(startLngRad)
                       + b * cosEndLatRad   * Math.sin(endLngRad);
        final double z = a * sinStartLatRad + b * sinEndLatRad;

        final double lat = Math.atan2(z, Math.sqrt(x*x + y*y));
        final double lng = Math.atan2(y, x);

        return new LatLng(Math.toDegrees(lat), Math.toDegrees(lng));
    }

    /**
     * Checks if a candidate H3 index is geometrically contained within an
     * ancestral H3 index using a vertex-based boundary check with SLERP.
     */
    public static boolean zoneHasSubZone(long ancestorZone, long candidateSubzone) {
        final int candidateRes = H3Index.getResolution(candidateSubzone);
        final int ancestorRes  = H3Index.getResolution(ancestorZone);
        if (candidateRes <= ancestorRes) {
            return false;
        }

        final LatLng centroid = H3Dggrs.H3.cellToLatLng(candidateSubzone);
        final List<Long> vertexIndices = H3Dggrs.H3.cellToVertexes(candidateSubzone);

        for (Long vertexIndex : vertexIndices) {
            if (vertexIndex == 0) continue;

            final LatLng vertex = H3Dggrs.H3.vertexToLatLng(vertexIndex);
            final LatLng interpolated = slerp(vertex, centroid, 0.01);
            final long coarser_zone = H3Dggrs.H3.latLngToCell(interpolated.lat, interpolated.lng, ancestorRes);

            if (coarser_zone == ancestorZone) {
                return true;
            }
        }

        return false;
    }

    /**
     * Calculates all geometrically contained sub-zones within an ancestral zone
     * at a specific relative depth, using a rigorous topological search and
     * the specified optimization.
     */
    public static LongStream getGeometricSubZones(long ancestorZone, int relativeDepth) {
        final int ancestorRes = H3Index.getResolution(ancestorZone);
        final int targetRes = ancestorRes + relativeDepth;
        if (targetRes < 0 || targetRes > 15) {
            throw new IllegalArgumentException("Relative depth results in an invalid resolution.");
        }

        final long centerChild = H3Dggrs.H3.cellToCenterChild(ancestorZone, ancestorRes+1);
        final List<Long> rings = new ArrayList<>();
        rings.addAll(H3Dggrs.H3.gridRing(centerChild, 1));
        rings.addAll(H3Dggrs.H3.gridRing(centerChild, 2));

        LongStream stream = LongStream.empty();
        stream = LongStream.concat(stream, H3Dggrs.H3.cellToChildren(centerChild, targetRes).stream().mapToLong(Long::longValue));
        for (Long r : rings) {
            LongStream s = H3Dggrs.H3.cellToChildren(r, targetRes)
                    .parallelStream()
                    .filter((Long l) -> zoneHasSubZone(ancestorZone, l))
                    .mapToLong(Long::longValue);
            stream = LongStream.concat(stream, s);
        }

        return stream;
    }

}

jsorel avatar Nov 13 '25 08:11 jsorel

This could also potentially be implemented more efficiently by remaining entirely in the projected space, rather than involving the forward/inverse projection.

Instead of doing a SLERP 99% of the distance on the authalic sphere to check whether a candidate zone is a sub-zone, this movement could be done in the icosahedral space, avoiding the need to convert zone IDs to geographic coordinates and convert geographic coordinates back to zone IDs.

However, this movement in the icosahedral space would need to be able to cross icosahedral interruptions when necessary, and I'm not sure whether the internal coordinate systems of H3 are suitable for this purpose.

We are working on a paper describing the convenience of the 5x6 rhombic/icosahedral space (Figure B.1) facilitating such operations.

There are also alternative / more efficient ways to perform this sub-zone listing. In DGGAL, we are able to directly iterate through the sub-zones without having to build a candidate super set and then having to additionally check and discard outliers, with the added advantage of ordering sub-zones in a scanline-based deterministic order that facilitates kernel convolution within a larger ancestral zone.

This current approach is the easiest way of producing the correct result using the existing public API.

jerstlouis avatar Nov 13 '25 19:11 jerstlouis

Interesting.

I'm wondering if there is not an alternative way to compute this using the existing public API 🤔

Naively I would convert the ancestor zone into a geometry, and then tile it with hexagons of the target size (ancestor resolution + relative depth). The output is visually similar to what you obtain (not sure if it's 100% equivalent though).

Image

Generated with h3o-cli(CLI exposing H3 API for quick prototyping/processing pipeline):

  • h3o-cli cellToPolygon -i 81457ffffffffff for the ancestor zone outlines
  • h3o-cli cellToPolygon -i 81457ffffffffff | h3o-cli geomToCells --resolution 4 --mode covers to generate the coverage
    • add | h3o-cli cellToBoundary -f geojson to get the boundaries as shown in the screenshot

Where:

  • 81457ffffffffff is the ancestor zone (resolution 1)
  • relative_depth is 3 (so target resolution is 1 + 3 => 4)

If it's the same, I'm wondering which approach is actually faster (would need to benchmark your approach). In my Rust implementation I've done significant work to speedup the tiling algorithm (H3 used to way slower, but that might be less true today with the new experimental coverage algorithm).

grim7reaper avatar Nov 13 '25 21:11 grim7reaper

@grim7reaper

I believe that @jsorel had initially made attempts using polygonToCellsExperimental() which did not return the correct results. The documentation for polygonToCells() says:

Containment is determined by centroids of the cells

In OGC API - DGGS, a sub-zone is defined as:

zone at a greater refinement level than a parent zone whose geometry is at least partially contained within the geometry of the parent zone

(in DE9-IM terminology, the ancestor and sub-zone overlap)

In your picture above, you have at least some mismatches at the vertices which only "touch" and are not considered sub-zones, and there are a few others nearby which are fully outside/only touching (it is inconsistent for the different edges). It is a bit odd because the centroid of some of those cells are definitely not within the ancestral cell boundary? Is the documentation not accurate about what the library is doing? (EDIT: I see you used geomToCells(), which I can't find in the online library API documentation).

Image

The proper 7H sub-zones follow the distinctive mod 5 pattern for odd relative-depth. At odd relative depths, 7H/H3 sub-zones are either fully within the ancestral zone, or along the edges there is exactly 11/12th, 13/18th, 1/2, 5/18th, or 1/12th overlap with the ancestral zone. At a relative depth 3 like your example there should be exactly 379 sub-zones (the formula for hexagonal ancestor is 7^depth + 5 * 7^((depth-1)/2) + 1). For even relative depths, the possibilities along the zone edges are only the full zone or 1/2 (and the count of sub-zones for a hexagonal ancestor is 7^depth + 7^(depth/2) - 1).

Something that is important to consider with the geometric approach is that what the actual geometry of H3 zones consists of 1 or 2 great circle arcs on the internal authalic sphere for each edge.

As far as I know there is currently no H3 function to retrieve a "refined" geometry (cellToBoundary() returns only the minimal number of vertices depending on the 1 or 2 great circle arcs -- 2 is for edges crossing icosahedral faces). Most geometric relation engines would assume a direct cartesian line between two points, so naively using this would normally not produce the correct results. Your image above does not seem to suffer from that issue though, perhaps because it is on a large scale, or for some other reason?

There are other related considerations regarding the fact that H3 works on an authalic sphere, but WGS84 geodetic coordinates are on an ellipsoid. In engines working with WGS84 coordinates the "great circles" would be those of the WGS84 ellipsoid rather than the ones of the authalic sphere which H3 assumes internally, yielding yet another disconnect. This is also an ambiguity with regards to whether geographic coordinates passed to / returned by H3 are geodetic latitude which is taken verbatim as latitude on the authalic sphere, or whether a conversion from geodetic latitude to authalic latitude should be performed. We probably need to open a separate issue about this which we've been discussing intensively in the pilot and for the registration of an H3 DGGRS with OGC (or possibly two -- one implying the geodetic <-> authalic conversion, and one not).

I believe the geometric approach will almost certainly not perform as well as the approach for the code included above, and if we were able to improve this to work directly in the icosehedral space it would perform even better. Better yet is to simply iterate in the icosahedral space without considering outliers at all.

The list also needs to be very precise, and the order deterministic, because this is what enables the use of a local index or a simple ordering of data values for encoding raster or vector data quantized to sub-zones, instead of using global zone identifiers, for encodings such as DGGS-JSON and DGGS-JSON-FG.

Some benchmarking of the Java implementation from @jsorel using 6 CPU cores of an Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz from 2018.

--- Testing Hexagonal_NYC_Res5 Hexagonal, Res=5, H3=599718752904282111 ---
  Found 55 geometric sub-zones at resolution 7 (Depth 2). in 4ms
  Found 379 geometric sub-zones at resolution 8 (Depth 3). in 12ms
  Found 2449 geometric sub-zones at resolution 9 (Depth 4). in 35ms
  Found 17053 geometric sub-zones at resolution 10 (Depth 5). in 142ms
  Found 117991 geometric sub-zones at resolution 11 (Depth 6). in 831ms
--- Testing Hexagonal_London_Res4 Hexagonal, Res=4, H3=594920100834836479 ---
  Found 55 geometric sub-zones at resolution 6 (Depth 2). in 2ms
  Found 379 geometric sub-zones at resolution 7 (Depth 3). in 4ms
  Found 2449 geometric sub-zones at resolution 8 (Depth 4). in 19ms
  Found 17053 geometric sub-zones at resolution 9 (Depth 5). in 113ms
  Found 117991 geometric sub-zones at resolution 10 (Depth 6). in 708ms
--- Testing Pentagonal_SouthPole_Res3 Pentagonal, Res=3, H3=590112357393367039 ---
  Found 46 geometric sub-zones at resolution 5 (Depth 2). in 1ms
  Found 316 geometric sub-zones at resolution 6 (Depth 3). in 3ms
  Found 2041 geometric sub-zones at resolution 7 (Depth 4). in 17ms
  Found 14211 geometric sub-zones at resolution 8 (Depth 5). in 98ms
  Found 98326 geometric sub-zones at resolution 9 (Depth 6). in 674ms
--- Testing Pentagonal_Atlantic_Res2 Pentagonal, Res=2, H3=585609238802333695 ---
  Found 46 geometric sub-zones at resolution 4 (Depth 2). in 1ms
  Found 316 geometric sub-zones at resolution 5 (Depth 3). in 3ms
  Found 2041 geometric sub-zones at resolution 6 (Depth 4). in 16ms
  Found 14211 geometric sub-zones at resolution 7 (Depth 5). in 98ms
  Found 98326 geometric sub-zones at resolution 8 (Depth 6). in 609ms
--- Testing Hexagonal_HighRes_Res10 Hexagonal, Res=10, H3=622229149845946367 ---
  Found 55 geometric sub-zones at resolution 12 (Depth 2). in 3ms
  Found 379 geometric sub-zones at resolution 13 (Depth 3). in 4ms
  Found 2449 geometric sub-zones at resolution 14 (Depth 4). in 22ms
  Found 17053 geometric sub-zones at resolution 15 (Depth 5). in 139ms
  Skipping depth 6: Target resolution 16 exceeds max (15).
--- Testing -80, 0 Hexagonal, Res=3, H3=594185223340556287 ---
  Found 55 geometric sub-zones at resolution 5 (Depth 2). in 1ms
  Found 379 geometric sub-zones at resolution 6 (Depth 3). in 3ms
  Found 2449 geometric sub-zones at resolution 7 (Depth 4). in 15ms
  Found 17053 geometric sub-zones at resolution 8 (Depth 5). in 90ms
  Found 117991 geometric sub-zones at resolution 9 (Depth 6). in 634ms

If we were to skip the forward/inverse projection we could achieve much better performance. Similar results of DGGAL for IVEA7H which has a very similar 7H refinement strategy to H3, here using a single CPU core:

Testing with ancestral zone: C0-29-A on DGGRS IVEA7H

--- Running getSubZoneWGS84Centroids() tests ---
Depth 1: Found 13 centroids. Took 0.0000 seconds.
Depth 2: Found 55 centroids. Took 0.0000 seconds.
Depth 3: Found 379 centroids. Took 0.0001 seconds.
Depth 4: Found 2449 centroids. Took 0.0005 seconds.
Depth 5: Found 17053 centroids. Took 0.0034 seconds.
Depth 6: Found 117991 centroids. Took 0.0225 seconds.
Depth 7: Found 825259 centroids. Took 0.1508 seconds.
Depth 8: Found 5767201 centroids. Took 1.0432 seconds.
Depth 9: Found 40365613 centroids. Took 7.2877 seconds.

========================================

--- Running getSubZones() tests ---
Depth 1: Found 13 sub-zones. Took 0.0000 seconds.
Depth 2: Found 55 sub-zones. Took 0.0000 seconds.
Depth 3: Found 379 sub-zones. Took 0.0004 seconds.
Depth 4: Found 2449 sub-zones. Took 0.0001 seconds.
Depth 5: Found 17053 sub-zones. Took 0.0124 seconds.
Depth 6: Found 117991 sub-zones. Took 0.0051 seconds.
Depth 7: Found 825259 sub-zones. Took 0.5976 seconds.
Depth 8: Found 5767201 sub-zones. Took 0.2349 seconds.
Depth 9: Found 40365613 sub-zones. Took 29.7921 seconds.

which is roughly 750 times faster than the Java + H3 version if we consider 5 ms vs. 634 ms at depth 6, with 1 vs 6 cores

This is the scanline iteration method in the 5x6 space.

jerstlouis avatar Nov 13 '25 22:11 jerstlouis

I believe that @jsorel had initially made attempts using polygonToCellsExperimental() which did not return the correct results.

Could you expand on what the issue was? I would expect the overlapping containment mode may work as desired? Was the issue cartesian vs non-cartesian containment (if so, would #1052 address that)?

isaacbrodsky avatar Nov 14 '25 00:11 isaacbrodsky

@isaacbrodsky #1052 / geodesic polyfill is certainly a great improvement!

@jsorel could you please explain what the issue was with polygonToCellsExperimental(), and perhaps give it another try with a version including that PR?

Regardless, I am highly confident that the approach suggested in this issue's description is much faster than any kind of geometric approach, especially if it the movement can be improved to remain in the projected / icosahedral space as opposed to going back to geographic coordinates.

That is because:

  • All of the descendants of the centroid child are automatically "all in" -- no geometric or any kind of containment check is necessary; it is entirely based on the logical indexing
  • The other sub-zones requiring checks would need a simple linear movement in the projected space (plus any kind of icosahedron face jump when necessary, possibly for each vertex), and a projected space -> zone ID conversion (which should be very optimal), avoiding any kind of inverse projection if remaining in the projected space (and removing the need for the SLERP).

This approach also guarantees a deterministic and meaningful sub-zone order, which would be very difficult with any kind of geometric approach (unless applying an additional step like a sort by H3 64-bit integer identifier).

jerstlouis avatar Nov 14 '25 04:11 jerstlouis

@jsorel could you please explain what the issue was with polygonToCellsExperimental(), and perhaps give it another try with a version including that PR?

I had 3 different behaviours :

  • some extra zones (like the above case)
  • infinite (or at least verry long queries) that I had to stop the thread.
  • crash (OutOfMemory error)

Note : I was using the java jni h3 binding.

jsorel avatar Nov 14 '25 07:11 jsorel

Regardless, I am highly confident that the approach suggested in this issue's description is much faster than any kind of geometric approach, especially if it the movement can be improved to remain in the projected / icosahedral space as opposed to going back to geographic coordinates.

I wouldn't be so sure of that. Using the geometrical approach I get the following timing (using h3o):

Starting comprehensive test suite...
--- Testing Hexagonal_NYC_Res5 (Hexagonal, Res=5, H3=852a1073fffffff) ---
  Found 19 geometric sub-zones at resolution 6 (Depth 1). Took 0 ms.
  Found 70 geometric sub-zones at resolution 7 (Depth 2). Took 1 ms.
  Found 391 geometric sub-zones at resolution 8 (Depth 3). Took 2 ms.
  Found 2504 geometric sub-zones at resolution 9 (Depth 4). Took 6 ms.
  Found 17106 geometric sub-zones at resolution 10 (Depth 5). Took 14 ms.
  Found 118340 geometric sub-zones at resolution 11 (Depth 6). Took 40 ms.
------------------------------
--- Testing Hexagonal_London_Res4 (Hexagonal, Res=4, H3=84194adffffffff) ---
  Found 15 geometric sub-zones at resolution 5 (Depth 1). Took 0 ms.
  Found 69 geometric sub-zones at resolution 6 (Depth 2). Took 0 ms.
  Found 387 geometric sub-zones at resolution 7 (Depth 3). Took 1 ms.
  Found 2504 geometric sub-zones at resolution 8 (Depth 4). Took 3 ms.
  Found 17102 geometric sub-zones at resolution 9 (Depth 5). Took 9 ms.
  Found 118338 geometric sub-zones at resolution 10 (Depth 6). Took 34 ms.
------------------------------
--- Testing Pentagonal_SouthPole_Res3 (Pentagonal, Res=3, H3=830800fffffffff) ---
  Found 15 geometric sub-zones at resolution 4 (Depth 1). Took 0 ms.
  Found 53 geometric sub-zones at resolution 5 (Depth 2). Took 0 ms.
  Found 322 geometric sub-zones at resolution 6 (Depth 3). Took 0 ms.
  Found 2081 geometric sub-zones at resolution 7 (Depth 4). Took 2 ms.
  Found 14250 geometric sub-zones at resolution 8 (Depth 5). Took 7 ms.
  Found 98597 geometric sub-zones at resolution 9 (Depth 6). Took 30 ms.
------------------------------
--- Testing Pentagonal_Atlantic_Res2 (Pentagonal, Res=2, H3=820807fffffffff) ---
  Found 15 geometric sub-zones at resolution 3 (Depth 1). Took 0 ms.
  Found 57 geometric sub-zones at resolution 4 (Depth 2). Took 0 ms.
  Found 325 geometric sub-zones at resolution 5 (Depth 3). Took 0 ms.
  Found 2094 geometric sub-zones at resolution 6 (Depth 4). Took 2 ms.
  Found 14262 geometric sub-zones at resolution 7 (Depth 5). Took 7 ms.
  Found 98663 geometric sub-zones at resolution 8 (Depth 6). Took 29 ms.
------------------------------
--- Testing Hexagonal_HighRes_Res10 (Hexagonal, Res=10, H3=8a29a1d7575ffff) ---
  Found 18 geometric sub-zones at resolution 11 (Depth 1). Took 0 ms.
  Found 70 geometric sub-zones at resolution 12 (Depth 2). Took 0 ms.
  Found 388 geometric sub-zones at resolution 13 (Depth 3). Took 1 ms.
  Found 2503 geometric sub-zones at resolution 14 (Depth 4). Took 3 ms.
  Found 17105 geometric sub-zones at resolution 15 (Depth 5). Took 9 ms.
  Skipping depth 6: Target resolution 16 exceeds max (15).
------------------------------
--- Testing Hexagonal_Res3_at_-80_0 (Hexagonal, Res=3, H3=83ef84fffffffff) ---
  Found 19 geometric sub-zones at resolution 4 (Depth 1). Took 0 ms.
  Found 67 geometric sub-zones at resolution 5 (Depth 2). Took 0 ms.
  Found 388 geometric sub-zones at resolution 6 (Depth 3). Took 0 ms.
  Found 2502 geometric sub-zones at resolution 7 (Depth 4). Took 2 ms.
  Found 17103 geometric sub-zones at resolution 8 (Depth 5). Took 9 ms.
  Found 118322 geometric sub-zones at resolution 9 (Depth 6). Took 34 ms.

Which seems 10x faster, and it's not multithreaded. Would need to run the Java bench on my machine for this to be comparable to the numbers you provided though. And staying in the icosahedral space might allow to bridge the gap, indeed.

In any case this doesn't change the fact that the output is not what you are expecting (deterministic ordering, cartesian vs non-cartesian, touches vs intersects, etc.). So even if your approach is slower, it still more correct than this one anyway. This was just to show that carefully optimized geo algorithm can be faster than expected.

grim7reaper avatar Nov 14 '25 22:11 grim7reaper

Thanks for testing that @grim7reaper .

Granted that this is faster than the Java/Python code in the description which uses the H3 APIs and needs to perform the inverse projection.

But the optimal approach for our 7H implementation (on the same CPU type as the Java benchmark, but remaining in the icosahedral space, and in that case using an iteration approach rather than the outlier discarding test) in a single thread computes the depth 6 in 5 ms compared to the 34-40 ms.

So I still strongly believe that there is a more optimal approach to the optimized geometry insersection, even if it could be theoretically be done accurately. Noting also that the "correct" geometric intersection, which would need to use the great circles if done in the geographic space, possibly as implemented by #1052, may be more computationally intensive.

I did not review #1052, so I'm not sure whether this is implemented by using great circles (essentially SLERP like our approach in the description). In my opinion, the correct way to do the geodesic polyfills is actually to remain in the icosahedral space, where great circles are straight lines for the gnomonic projections (within a single icosahedral space), avoiding the need for those SLERPs and inverse projections (the clipping polygon can simply be projected to the gnomonic projection instead). Then the polyfill / geometric intersection test simplifies to a 2D polygon / polygon intersection test. But still, the iteration approach remains faster, because it is essentially a simple single vector addition for each zone, rather than a polygon/polygon intersection test (which remains quite costly).

Thanks again very much for considering this use case / functionality!

jerstlouis avatar Nov 14 '25 23:11 jerstlouis