h3-py icon indicating copy to clipboard operation
h3-py copied to clipboard

Allow dict subclasses in function input

Open kylebarron opened this issue 4 years ago • 2 comments

Occasionally it can be helpful to subclass the built-in dict. The geojson package, for example, has a Polygon class that subclasses dict and provides helpers like validating that the contained geometry is valid geojson. However this doesn't work with h3.polyfill:

import h3
from geojson import Polygon

polygon_coords = [[[-9.092559814453125,38.794500078219826],[-9.164314270019531,38.793429729760994],[-9.217529296875,38.76666579487878],[-9.216842651367188,38.68792166352608],[-9.12139892578125,38.70399894245585],[-9.0911865234375,38.74551518488265],[-9.092559814453125,38.794500078219826]]]
polygon = Polygon(polygon_coords)

isinstance(polygon, dict) # True
h3.polyfill(polygon, res=8, geo_json_conformant=True)
# ---------------------------------------------------------------------------
# TypeError                                 Traceback (most recent call last)
# /var/folders/zl/4x9vw27j60n7h41pd6ntb_pc0000gp/T/ipykernel_33807/4145491903.py in <module>
# ----> 1 h3.polyfill(polygon, res=8, geo_json_conformant=True)
#       2 
# 
# ~/.pyenv/versions/3.8.10/lib/python3.8/site-packages/h3/api/_api_template.py in polyfill(geojson, res, geo_json_conformant)
#     522         unordered collection of H3Cell
#     523         """
# --> 524         mv = _cy.polyfill(geojson, res, geo_json_conformant=geo_json_conformant)
#     525 
#     526         return _out_unordered(mv)
# 
# TypeError: Argument 'geojson' has incorrect type (expected dict, got Polygon)

Originally I figured h3-py was doing an assert type(polygon) == dict check, that could easily be switched to isinstance(polygon, dict). But upon closer inspection it looks like this error is actually coming from Cython.

https://github.com/uber/h3-py/blob/c1bee7dff01363895f716ca1add92eb252c1f146/src/h3/_cy/geo.pyx#L195

So it looks like Cython requires the input to be a literal dict instead of a subclass.

The only prior work I could find relating to this is: https://github.com/fastavro/fastavro/issues/139

Possible resolutions

  • Mark as wontfix; users need to coerce to a literal dict.

  • Coerce input to dict inside h3-py but before touching Cython. I think we could just change https://github.com/uber/h3-py/blob/c1bee7dff01363895f716ca1add92eb252c1f146/src/h3/api/_api_template.py#L525 to

    cy.polyfill(dict(geojson), res, geo_json_conformant=geo_json_conformant)
    
  • Change cython function signature. I.e. change https://github.com/uber/h3-py/blob/c1bee7dff01363895f716ca1add92eb252c1f146/src/h3/_cy/geo.pyx#L195 to

    ef polyfill(geojson, int res, bool geo_json_conformant=False):
    

    and then add an assert isinstance(geojson, dict) line. Would removing the dict before geojson affect performance at all?

kylebarron avatar Oct 26 '21 19:10 kylebarron

Options 2 and 3 could be benchmarked versus baseline to see if the impact is significant either way.

Option 3 feels like the best option, since it wouldn't require a new object construction and Cython is likely doing those assert type(arg) === typename assertions under the hood already so swapping with an isinstance-based assertion ought to be immeasurably different (the vast majority of the time should be in the C code doing the polyfill operation).

Option 2 could also be tested as it's trivial to implement, but I personally wouldn't prioritize that one? You can get weird things like:

dict([]) # {}
dict([1, 2, 3, 4]) # TypeError: cannot convert dictionary update sequence element #0 to a sequence
dict([(1, 2), (3, 4)]) # {1: 2, 3: 4}

Just becomes a very unpredictable API if you try to convert the input to a dictionary.

dfellis avatar Oct 26 '21 19:10 dfellis