arcgis-python-api
arcgis-python-api copied to clipboard
FeatureSet to_geojson misformats MultiPolygons
Describe the bug
MultiPolygons are not formatted correctly upon calling to_geojson on a FeatureSet object. Using the GeoJSON string elsewhere results in those multipolygon shapes being empty. It appears to be missing a level of brackets.
To Reproduce Steps to reproduce the behavior:
import geopandas as gpd
import arcgis
gis = arcgis.gis.GIS()
# Load a public city neighborhoods feature service
city_neighborhoods_item = gis.content.get("b2a2da54401a41e78dd71563f01f5273")
# Query and convert to geojson
hoods_gjson = city_neighborhoods_item.layers[0].query().to_geojson
# Convert to GeoDataFrame OR save to file and load in other software
hoods_gdf = gpd.read_file(hoods_gjson, driver="GeoJSON", crs="EPSG:3734")
error: See image. Missing geometry for MultiPolygon.
Screenshots If applicable, add screenshots to help explain your problem.
The Spatially Enabled Dataframe printed via FeatureSet.sdf. Notice the geometry is intact for all rows.
A resulting Geodataframe from GeoJSON
If you enter into GeoJSON validator, it states MultiPolygon coordinates array should have 4 levels of nested arrays and the resulting file is missing one and only at 3 levels. If you manually add a set of brackets, everything works fine.
Expected behavior MultiPolygons are represented with 4 depth brackets in resulting GeoJSON string and load properly in other software.
Platform (please complete the following information):
- OS: WSL2 Ubuntu
- Browser [e.g. chrome, safari]
- Python API Version 2.2.0
To add: I tried using the WKT method on the geoseries accessor for spatially enabled dataframes. And passing that value as to any other software. I'm sure it's the lower-level function being called for to_geojson, so I'm not surprised that also doesn't work.
It is 100% a problem for multipolygons with actual non-contiguous islands. Would really apprecaite any workarounds for this until this is resolved in this product. @nanaeaubry I ask because this is a breaking bug for a lot of our applications if we can't effectively query any island-like geometry.
@nanaeaubry I wrote this and confirms my suspicion that invalid WKT is being produced in this library
Running shapely.validation.make_valid on every shape produced from WKT, then re-reading into other software properly maps the non-contiguous island multipolygons
import arcgis
import geopandas as gpd
# Living Atlas layer for block groups
blockgroup_census = gis.content.get('ebeb65deb5c14f4d8849fd68944b7ee6')
bg_fs = blockgroup_census.query(where="County = 'Cuyahoga County'", return_all_records=True)
bg_gjson = bg_fs.to_geojson
bg_gdf = gpd.read_file(bg_gjson, driver="GeoJSON", crs="EPSG:3857").set_crs(epsg=3857, allow_override=True)
# Produces broken map, the arcgis library isn't formatting island MultiPolygons correctly
bg_gdf.plot()
# Produces full map after asking shapely to fix the bad formatting from `arcgis` package
bg_df = bg_fs.df
shapes = list(map(lambda geom: geom.WKT if geom else None, bg_df['SHAPE']))
new_shapes = [shapely.validation.make_valid(shapely.from_wkt(shape)) for shape in shapes]
bg_df['wkt'] = new_shapes
gpd.GeoDataFrame(bg_df, geometry='wkt').plot()
@dnsohrabian Thank you for all the detail. I am looking into this and working on a workaround. I will keep you posted.
Do you have arcpy in your environment?
We use two libraries to handle these geometries: arcpy and geomet. When a user does not have arcpy it will fall back on geomet so this can help us narrow down where the issue is occurring
Thank you @nanaeaubry for looking into it. No arcpy in my environment.
closing as fix is submitted.
I am still running into this on 2.3.1. Can you explain further what you mean by the fix being submitted? Which version will it be fixed in? What should I look for? @ManushiM
Also running into exact same issue in version 2.4.1. @dnsohrabian Did you ever find a solution?
@ManushiM Echoing @dnsohrabian's comment earlier, which release was this issue supposedly fixed?
Manually fixing the missing brackets as suggested in the OP does seem to be a working solution.
The issue present in both local & cloud development environments (ArcGIS Online Notebooks) where only arcgis is used, not arcpy.
Also running into exact same issue in version 2.4.1. @dnsohrabian Did you ever find a solution?
@ManushiM Echoing @dnsohrabian's comment earlier, which release was this issue supposedly fixed?
Manually fixing the missing brackets as suggested in the OP does seem to be a working solution.
The issue present in both local & cloud development environments (ArcGIS Online Notebooks) where only arcgis is used, not arcpy.
I haven't revisited this in months since I found a workaround. It sounds like the underlying issue persists. I was never explained why this issue was closed and what was meant by "fixed"... I would rather directly be told it's not important enough.
- Use
.sdf()method on aFeatureSet - Create a geopandas GeoDataframe like so:
gpd.GeoDataFrame(sdf.drop('SHAPE', axis=1),geometry=sdf.SHAPE, crs= fset.spatial_reference.get('latestWkid')) - Make a new GeoDataframe via the geopandas
.make_valid()method to clean up geometrygdf.geometry =gdf.geometry.make_valid()
That seems to fix it for me. I suspect it has to do with conflicting topology rules between Esri geometry and OGC standards.
Also running into exact same issue in version 2.4.1. @dnsohrabian Did you ever find a solution?
@ManushiM Echoing @dnsohrabian's comment earlier, which release was this issue supposedly fixed?
Manually fixing the missing brackets as suggested in the OP does seem to be a working solution.
The issue present in both local & cloud development environments (ArcGIS Online Notebooks) where only arcgis is used, not arcpy.
I haven't revisited this in months since I found a workaround. It sounds like the underlying issue persists. I was never explained why this issue was closed and what was meant by "fixed"... I would rather directly be told it's not important enough.
- Use
.sdf()method on aFeatureSet- Create a geopandas GeoDataframe like so:
gpd.GeoDataFrame(sdf.drop('SHAPE', axis=1),geometry=sdf.SHAPE, crs= fset.spatial_reference.get('latestWkid'))- Make a new GeoDataframe via the geopandas
.make_valid()method to clean up geometrygdf.geometry =gdf.geometry.make_valid()That seems to fix it for me. I suspect it has to do with conflicting topology rules between Esri geometry and OGC standards.
Awesome, thanks for the info--I'll try that. Hopefully we get clarification down the line.
@Epoxider Not sure what version Manushi was referring to but normally at the latest release (2.4.1) we were no longer seeing this issue. This version of the API will be in AGOL Notebooks in end of June or beginning of July.
If you are still seeing this please log an issue with support, much appreciated and sorry for the late response!