h3 icon indicating copy to clipboard operation
h3 copied to clipboard

[PR Proposal] Add support for min_resolution in compactCells

Open JosephChotard opened this issue 1 year ago • 3 comments

Hi, I'm proposing to add an optional minimum resolution to the compactcells function. Currently the function always tries to compact up to resolution 0. For database joins where range queries are inefficient we need to join on equality at an intermediary resolution then post filter for exact min/max checks at a higher resolution.

For example in this databricks article they do a join using parent resolution 8 but this requires that no cells got compacted to a lower resolution than 8.

SELECT
FROM 
  trip_h3 as t,
  (SELECT
      h3_toparent(c_cell, 8) AS cell_8, *
      FROM taxi_zone_h3c_explode) AS tz
WHERE
  (tz.h3_res = 8 AND t.pickup_cell_8 = tz.cell_8) OR
  (t.pickup_cell_8 = tz.cell_8 AND h3_ischildof(t.pickup_cell, tz.c_cell)) 

(sql query from the article)

If we want to "limit" the minimum resolution of the compacted cells currently we'd have to do something like

cells = polygonToCells(poly, xxxx)
compactedCells = compactCells(cells)
compacteCellsWithMinResolution = flatMap(compactedCells, lambda cell : if (getResolution(cell) < 8) {cellToChildren(cell, 8) else {cell})

(excuse my pseudocode :) )

I think it would be a pretty minor change as the compact Cells function just hardcodes 0 as the min resolution right now. We could allow end users to specify their required min resolition.

Happy to contribute this myself as my first PR if you're open the feature.

JosephChotard avatar Jan 15 '25 20:01 JosephChotard

Here's the python code snippet instead of my pseudocode

import itertools
import h3.api.numpy_int as h3

...

tiled = h3.h3shape_to_cells(shape, max_resolution).tolist()
compacted = h3.compact_cells(tiled)
compacted_min_resolution = list(itertools.chain.from_iterable(
    [h3.cell_to_children(cell, min_resolution) if h3.get_resolution(cell) < min_resolution else [cell] for cell in compacted]
))

JosephChotard avatar Jan 15 '25 20:01 JosephChotard

I was hoping @isaacbrodsky was going to chime in on this. My personal opinion is that it's a good idea, but it should be a new function so we don't have to major-version-bump the C library again. Perhaps partialCompactCells(cell_set, compacted_set, num_cells, min_resolution)

dfellis avatar Jan 19 '25 16:01 dfellis

This makes sense to me. I get the utility of not needing to chain a bunch of operations together at the end.

I like the suggestion from @dfellis to add this as a new function. The existing function can naturally be implemented in terms of the new function.

isaacbrodsky avatar Jan 19 '25 17:01 isaacbrodsky