topojson
topojson copied to clipboard
Conversion to Typology object causes overlaps
Hello and thank you for your work on this project.
I was noticing an issue with a multipolygon layer I have which results from a conversion from a raster layer. There are five different strata in this example and there are no overlaps with the initial data. I am interested in creating a topojson and then simplifying the boundaries.
After conversion to a Topology
object, there are a number of overlapping areas which result. Unless I misunderstand what this library should do, the resulting layer should not have any overlaps.
This is illustrated below when plotting the strata with a bit of transparency. In the most obvious case, in central area you can see a brown where stratum 2 overlaps with strata 3; this area should be entirely strata 3. There are other areas of overlap as well.
import topojson as tp
import geopandas as gpd
import matplotlib.pyplot as plt
fl_strata = "strata.geojson"
strata_data = gpd.read_file(fl_strata)
topo = tp.Topology(strata_data, prequantize=False)
topo_gdf = topo.to_gdf()
topo_gdf.plot(categorical=True, cmap='tab10', legend=True, column='StrataID', alpha=0.6)
plt.show()
Example data is attached. example.zip
Thanks again for all your help, Dave
You are right. I can reproduce this with solely StrataID
is 2
as well:
strata_data = strata_data[strata_data.StrataID.isin(["2"])]
strata_data.plot(categorical=True, cmap='tab10', legend=True, column='StrataID', alpha=0.6)
and
(
tp.Topology(strata_data, prequantize=False)
.to_gdf()
.plot(categorical=True, cmap='tab10', legend=True, column='StrataID', alpha=0.6)
)
The good thing though, is that it serialise OK into altair:
tp.Topology(strata_data, prequantize=False).to_alt(color='properties.StrataID:N')
Meaning that the creation of topojson is going good, but the serialization into a GeoJSON not, relevant code here [https://github.com/mattijn/topojson/blob/master/topojson/utils.py#L478](https://github.com/mattijn/topojson/blob/master/topojson/utils.py#L478)
Simplified example to continue testing:
from shapely import geometry
import topojson as tp
mp = geometry.shape({
"type": "MultiPolygon",
"coordinates": [
[
[[0, 0], [20, 0], [10, 20], [0, 0]], # CCW
[[3, 2], [10, 16], [17, 2], [3, 2]], # CW
],
[[[6, 4], [14, 4], [10, 12], [6, 4]]], # CCW
]
})
topo = tp.Topology(mp, prequantize=False)
topo.to_alt(color='type:N') # this goes well
topo.to_gdf().plot(alpha=0.5) # this not
topo.to_gdf().iloc[0].geometry.wkt # newly introduced ')' and '('s
'MULTIPOLYGON (((0 0, 20 0, 10 20, 0 0)), ((3 2, 17 2, 10 16, 3 2)), ((6 4, 14 4, 10 12, 6 4)))'
mp.wkt # original wkt
'MULTIPOLYGON (((0 0, 20 0, 10 20, 0 0), (3 2, 10 16, 17 2, 3 2)), ((6 4, 14 4, 10 12, 6 4)))'
I've added a test here and have been trying to understand what happens or what should happen around here in the Hashmap
class:
arcs = []
for geom in geoms:
arcs_in_geom = self._data[bk_objects][geom] # bk_objects = "bookkeeping_geoms"
for idx_arc, arc_ref in enumerate(arcs_in_geom):
arc_ids = self._data[bk_element][arc_ref] # bk_element = "bookkeeping_arcs"
if len(arc_ids) > 1 and key != "coordinates":
self._inner = True if idx_arc > 0 else False
arc_ids = self._backward_arcs(arc_ids)
arcs.append(arc_ids)
return arcs
After Dedup
the object looks as follow:
from topojson.core.dedup import Dedup
from topojson.core.hashmap import Hashmap
Dedup(
{'bbox': (0.0, 0.0, 20.0, 20.0),
'bookkeeping_arcs': [[0], [1], [2]], <<<<<<<<<
'bookkeeping_coords': [],
'bookkeeping_duplicates': [],
'bookkeeping_geoms': [[0, 1], [2]], <<<<<<<<<
'bookkeeping_shared_arcs': [],
'coordinates': [],
'junctions': [<shapely.geometry.point.Point object at 0x7fb94fe7e520>,
<shapely.geometry.point.Point object at 0x7fb953843df0>,
<shapely.geometry.point.Point object at 0x7fb953843e80>],
'linestrings': [array([[ 0., 0.],
[20., 0.],
[10., 20.],
[ 0., 0.]]),
array([[ 3., 2.],
[10., 16.],
[17., 2.],
[ 3., 2.]]),
array([[ 6., 4.],
[14., 4.],
[10., 12.],
[ 6., 4.]])],
'objects': {0: {'arcs': [0, 1], 'type': 'MultiPolygon'}},
'type': 'Topology'}
)
This still seems to be okay. bookkeeping_geoms
is [[0, 1], [2]]
, so len()
is two and bookkeeping_arcs
is [[0], [1], [2]]
meaning that the linestrings do not contain shared segments.
But after Hashmap
it looks as follow:
Hashmap(mp)
Hashmap(
{'bbox': (0.0, 0.0, 20.0, 20.0),
'coordinates': [],
'linestrings': [array([[ 0., 0.],
[20., 0.],
[10., 20.],
[ 0., 0.]]),
array([[ 3., 2.],
[10., 16.],
[17., 2.],
[ 3., 2.]]),
array([[ 6., 4.],
[14., 4.],
[10., 12.],
[ 6., 4.]])],
'objects': {'data': {'geometries': [{'arcs': [[[0]], [[1]], [[2]]], <<<<<<<<<
'type': 'MultiPolygon'}],
'type': 'GeometryCollection'}},
'type': 'Topology'}
)
len
of the arcs
should stay 2, but has become 3. And the arc
does not reflect the geom
anymore. The nested hole has become a unique polygon in the multipolygon instead of a hole in the first polygon...
It is a bit complicated, help welcome 🙏
This issue is fixed by PR #168 🥳. Result will become:
This took quite some time unfortunately.
Wow, thank you so much for your work. This is excellent! Many thanks!