libpysal icon indicating copy to clipboard operation
libpysal copied to clipboard

alpha shape multipolygon - is it expected behavior to have filled in holes?

Open cornhundred opened this issue 2 years ago • 10 comments

Hi, I would like to know if it is expected behavior for the alpha shape to be returned with filled in holes? This example colab notebook illustrates this

from libpysal.cg import alpha_shape, alpha_shape_auto
from shapely.geometry import MultiPolygon, Polygon
import numpy as np
import pandas as pd

# generate random matrix
num_cols = 2
num_rows = 1000
np.random.seed(seed=100)
pts = np.random.rand(num_rows, num_cols) * 100

inv_alpha = 3.5
poly = alpha_shape(pts, 1/inv_alpha)

multi_poly = MultiPolygon(poly.values)
multi_poly
Screen Shot 2022-01-29 at 11 38 23 PM

The alpha shape polygon geoseries appears to include polygons that intersect the holes in the alpha shape. Is this expected and if so can we easily remove these in order to combine these polygons into a MultiPolygon?

  • Platform information:
>>> import os; print(os.name, os.sys.platform);print(os.uname())

posix linux
posix.uname_result(sysname='Linux', nodename='b49e8e01b8be', release='5.4.144+', version='#1 SMP Tue Dec 7 09:58:10 PST 2021', machine='x86_64')
  • Python version:
>>> import sys; print(sys.version)

3.7.12 (default, Sep 10 2021, 00:21:48) 
[GCC 7.5.0]
  • SciPy version:
>>> import scipy; print(scipy.__version__)

1.4.1
  • NumPy version:
>>> import numpy; print(numpy.__version__)

1.19.5

cornhundred avatar Jan 30 '22 04:01 cornhundred

Hey, yes the "extra" polygons are holes:

image

This results in some confusing behavior if you assemble a multipolygon using all the alpha shapes as regular polygons... some of the polygons need to be treated as holes, and others as rings.

We don't indicate which polygons are "holes,"but for a specific alpha, it's entirely possible for there to be holes in the alpha hull. So, I think I think the "correct" behavior should be to encode the alpha shape as a (multi)polygon with holes, rather than encode each part as a separate simple polygon...

We'll investigate a fix for this. @darribas, there may be a performance penalty, since we need to orient the rings to figure out which are holes.

ljwolf avatar Feb 01 '22 11:02 ljwolf

Thanks @ljwolf for confirming the expected behavior. We have an approach to remove holes from our alpha shapes that involves 1) looping through the polygons from largest to smallest , 2) checking which smaller polygons are contained within the larger polygon and saving those to a list of 'tentative' holes (using shapely to check for one polygon containing another, but we'll switch to geopandas query_bulk to speed things up), 3) checking if any tentative holes are contained within a larger hole (if a hole is contained within another hole, then it is a new polygon, like an island within a lake) and only keeping 'real' holes. I can share a notebook demonstrating this later.

Also, on a slightly related note. We have been experimenting with setting the triangle filtering criteria (set by the alpha parameter) based on the maximal side length of the triangles instead of the inverse of the radius of the circumcircle. I think it is a little more intuitive to think in terms of this distance - increasing alpha results in larger triangles and a more coarse grained alpha shape. Would you all be interested in allowing the option for setting the alpha parameter in this manner (e.g., alpha_type: "circumcircle" or "max_side_length")? I could try to set up a pull request if there is interest?

cornhundred avatar Feb 01 '22 15:02 cornhundred

to remove holes that involves...

Great! Yes, the test-case I've just cooked up for this is:

numpy.random.seed(100)
inner_r = numpy.random.uniform(low=1, high=2, size=1000)
outer_r = numpy.random.uniform(low=4, high=5, size=1000)
inner_theta = numpy.random.uniform(0,2*numpy.pi, size=1000)
outer_theta = inner_theta = numpy.random.uniform(0,2*numpy.pi, size=1000)

inner = numpy.column_stack((inner_r, inner_theta))
outer = numpy.column_stack((outer_r, outer_theta))

rings_polar = numpy.row_stack((outer, inner))
rings = numpy.column_stack((
    rings[:,0]*numpy.cos(rings[:,1]),
    rings[:,0]*numpy.sin(rings[:,1])
))

For an alpha of 1, you get four geometries:

Figure_1

I think you can arbitrarily adjust the "widths" of the rings to confuse any area-based strategy for sorting the initial search. This is a problem I've attacked elsewhere... I don't think there's a way around query_bulk(..., predicate='contains'), but you can look at the .interiors attribute of a shapely polygon to figure out if there are holes. (e.g. the polygons.iloc[[0]] in your collab notebook has its holes and you can find that immediately by picking geometries with nonempty interiors)

On a slightly related note...

Yes, I can definitely understand that. I preferred specifying it using the radius (not inverse) of the bounding circumcircle... I noticed our implementation uses alpha differently from Edselbrunner when writing construct_bounding_circles(), but we're stuck with it 😄 I would definitely be interested in supporting something where either a keyword alpha or a radius/length must be supplied, but not both.

ljwolf avatar Feb 01 '22 15:02 ljwolf

to remove holes

Great thanks @ljwolf. In our case our alpha shape code (from stack overflow) returns a series of polygons that do not have holes (so the equivalent of polygon boundaries I think) and it is up to us to determine which ones are holes and which ones are 'real' polygons. Since we're using these boundary polygons only then that should guarantee that any hole will be smaller than it's 'parent' polygon (as far as I can tell). But since we are given polygons with holes from PySAL (thanks for the example code) it sounds like we can have a look at the .interiors to try and sort it out. I'll experiment with it on my end using the example 'target' code you shared and post a Colab notebook.

On a slightly related note...

That would be great, I can try to set up a pull request that allows a keyword of radius or length to be provided instead of the default alpha.

cornhundred avatar Feb 01 '22 16:02 cornhundred

hey, thanks very much for this conversation!

I'm not sure if it'd be the most efficient way arount it but, would a spatial join between the polygons and points (or between all polygons with a contains predicate?) resolve this as an ex-post step? Since that'd go through GEOS, I think that should be relatively performant. For the performance hit, maybe this could be an argument that defaults to apply it?

darribas avatar Feb 04 '22 17:02 darribas

Hi @darribas, I think this is the approach @ljwolf is taking here https://github.com/pysal/libpysal/pull/457/files on this pull request.

cornhundred avatar Feb 04 '22 18:02 cornhundred

Sorry, disregard my previous comment, had missed the PR. Looking into it now, and looks fantastic, very clever trick. Will responde there!

darribas avatar Feb 04 '22 18:02 darribas

The 4.6.2 release has fixed this - see updated Google Colab notebook

cornhundred avatar Mar 03 '22 18:03 cornhundred

Hi,

I was testing some other parameters for the alpha shape and it appears to not be behaving as expected. I'm seeing some holes in the multipolygon that appear to be filled in with polygons. these holes end up being filled in if I apply a buffer of zero

see https://colab.research.google.com/drive/1KCCnbwTvg-omi8Fz9UHq0bn1NSfTPl90?usp=sharing

Screen Shot 2022-04-19 at 5 54 15 PM

I'm using libpysal==4.6.2

It almost seems like it is missing holes between two polygons that are touching. It is correctly removing holes within a single polygon.

cornhundred avatar Apr 19 '22 21:04 cornhundred

I think I have a solution that is working to remove these polygons which represent holes made by more than one touching polygons (I'll call them hole-polygons). I calculate which polygons are touching each other and calculate their intersection length with all touched polygons. The hole-polygons should intersect real polygons along their entire length (the sum of the intersection length over all touching polygons). Also, the hole-polygons can also have holes of their own which are actually real polygons - so when I find a hole-polygon I make it's holes real polygons. See the updated notebook for an example.

Screen Shot 2022-04-20 at 4 36 30 PM
def calc_curated_poly_list(gser_poly):
  
  inside, outside = gser_poly.sindex.query_bulk(gser_poly, predicate='touches') 
  touch_mat = sparse.csc_matrix((np.ones_like(inside), (inside, outside))).todense()
  touch_df = pd.DataFrame(touch_mat, index=gser_poly.index.tolist(), columns=gser_poly.index.tolist())

  intersection_dict = {}
  curated_poly_list = []
  fake_poly_list = []
  for inst_poly in gser_poly.index.tolist():

    poly_length = gser_poly[inst_poly].exterior.length

    inst_ser = touch_df.loc[inst_poly]
    inst_touch_list = inst_ser[inst_ser > 0].index.tolist()

    shared_border_length = 0
    for inst_touch in inst_touch_list:

      if inst_touch not in fake_poly_list:

        intersection_name = '-'.join(sorted([str(inst_poly), str(inst_touch)]))

        # prevent re-calc of intersection length
        if intersection_name in intersection_dict:
          touch_length = intersection_dict[intersection_name]
        else:
          inst_length = gser_poly[inst_poly].intersection(gser_poly[inst_touch]).length
          intersection_dict[intersection_name] = inst_length
          touch_length = inst_length

        shared_border_length = shared_border_length + touch_length

    # if shared border length is equal to the border length then the polygon
    # is not real
    fake_poly = np.isclose(poly_length, shared_border_length)
    
    
    if fake_poly == False:

      # add real polygon
      curated_poly_list.append(gser_poly[inst_poly])

    else:
      fake_poly_list.append(inst_poly)

      # add interiors of fake polygon
      for inst_interior in gser_poly[inst_poly].interiors:
        curated_poly_list.append(Polygon(inst_interior))

  return curated_poly_list
  

cornhundred avatar Apr 20 '22 20:04 cornhundred