Feature Request: read ESRI shape files
Overview
A common format for geo-related public sector data is the ESRI shape file (shp). These files are usually distributed as ZIP archives that contain the geometry and an accompanying attribute table. Often little is known about the content of the attribute table. This is where Frictionless could help. If you want to create statistical evaluations, you often do not need a full GIS program and could treat the shp files like tables.
Using GeoPandas it is simple to access the data contained in a shape file. Here is an example that uses a file with data on traffic lights:
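To see what such an archive contains before loading it, the standard library is enough. The following is a minimal sketch: the function name `list_shapefile_parts` is made up for illustration, and it assumes the usual convention that all parts of one layer share a base name (`.shp` for the geometry, `.dbf` for the attribute table, `.shx` for the index).

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def list_shapefile_parts(archive):
    """Group the members of a zipped shape file by their base name.

    `archive` is a path or a binary file-like object.
    """
    parts = defaultdict(list)
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            path = PurePosixPath(name)
            parts[path.stem].append(path.suffix.lower())
    # Sort the extensions so the result does not depend on archive order.
    return {layer: sorted(suffixes) for layer, suffixes in parts.items()}
```

Calling `list_shapefile_parts("Lichtsignalanlagen.zip")` would then return one entry per base name with the extensions found for it.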
import geopandas as gpd

shapefile_path = "Lichtsignalanlagen.zip"
# The zip:// prefix lets GeoPandas read the shape file inside the archive.
gdf = gpd.read_file(f"zip://{shapefile_path}")
print(gdf)
This is the output:
      FID  NR                                         BEZEICHNUN                        geometry
0   34016  50              Feldstraße (Kath. Kirche) Beselerstr.  POINT (543146.869 5956685.986)
1   33986  33  Westerstraße (B431), Reichenstraße / Vormstege...  POINT (543236.506 5955852.612)
2   34030  64                               Steindamm / Gooskamp  POINT (543789.474 5955973.802)
3   33948  12                     Köllner Chaussee / Krückaupark  POINT (544497.901 5956416.905)
4   34008  46       Friedensallee / Friedenstraße / Amandastraße  POINT (543764.121 5956905.022)
..    ...  ..                                                ...                             ...
62  33980  30                 Hamburger Straße / Hainholzer Damm  POINT (544255.692 5955584.940)
63  33926   1  Berliner Straße / Schauenburgerstraße / Probst...  POINT (543382.916 5956266.351)
64  34004  43      Hainholzer Damm / Wasserstraße / Fröbelstraße  POINT (544190.755 5955020.948)
65  73942  78  Hamburger Straße / Feuerwache Süd (Ausfahrt Fe...  POINT (543747.192 5955752.459)
66  78293  91                      Gärtnerstraße Höhe Hs.-Nr. 31  POINT (542949.420 5956757.481)
[67 rows x 4 columns]
Frictionless already understands the WKT in the geometry column.
I did a little research. Here is code that lists the included layers and the data types of the attribute table.
import fiona
import geopandas

shapefile_path = "Lichtsignalanlagen.zip"

# List every layer contained in the zipped shape file.
layer_names = fiona.listlayers(f"zip://{shapefile_path}")
for layer_name in layer_names:
    print(layer_name)
    gdf = geopandas.read_file(f"zip://{shapefile_path}", layer=layer_name)
    # Collect (column name, dtype name) pairs for this layer.
    # Series.iteritems() was removed in pandas 2.0; items() replaces it.
    column_info = []
    for column_name, data_type in gdf.dtypes.items():
        column_info.append((column_name, str(data_type)))
    for name, data_type in column_info:
        print(f"  name: {name}, type: {data_type}")
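The attribute table of a shape file is a dBASE (`.dbf`) file, so the declared column types can also be read straight from its header without GeoPandas. Here is a rough sketch under that assumption. `read_dbf_fields` and `read_zipped_dbf_fields` are hypothetical helpers, not part of any library, and they only handle the classic 32-byte field descriptors.

```python
import struct
import zipfile


def read_dbf_fields(dbf_bytes):
    """Return (name, type_code, length, decimals) for each field in a .dbf header."""
    # Bytes 8-9 of the file header hold the total header length (little-endian).
    (header_length,) = struct.unpack_from("<H", dbf_bytes, 8)
    fields = []
    offset = 32  # field descriptors start after the 32-byte file header
    # The descriptor list ends with a 0x0D terminator byte.
    while offset + 32 <= header_length and dbf_bytes[offset] != 0x0D:
        raw_name, type_code, length, decimals = struct.unpack_from(
            "<11sc4xBB", dbf_bytes, offset
        )
        name = raw_name.split(b"\x00", 1)[0].decode("ascii", errors="replace")
        fields.append((name, type_code.decode("ascii"), length, decimals))
        offset += 32
    return fields


def read_zipped_dbf_fields(archive):
    """Map each .dbf member of a zipped shape file to its field list."""
    with zipfile.ZipFile(archive) as zf:
        return {
            name: read_dbf_fields(zf.read(name))
            for name in zf.namelist()
            if name.lower().endswith(".dbf")
        }
```

The type codes are single letters such as `C` (character), `N` (numeric), `D` (date) and `L` (logical). Field names in this format are limited to 10 characters, which would explain the truncated column name `BEZEICHNUN` in the output above.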
Here is an example of a shape file that contains more layers and data types: nag_fach_100.zip
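For the requested feature, the dtype names printed by the loop above would have to be translated into Table Schema field types. One possible mapping, as a sketch only: the dictionary `DTYPE_TO_FIELD_TYPE` and the function `schema_from_dtypes` are hypothetical and not part of Frictionless, and describing the geometry column as a `string` holding WKT is an assumption.

```python
# Hypothetical mapping from pandas/GeoPandas dtype names to Table Schema types.
DTYPE_TO_FIELD_TYPE = {
    "int32": "integer",
    "int64": "integer",
    "float64": "number",
    "bool": "boolean",
    "object": "string",
    "datetime64[ns]": "datetime",
    "geometry": "string",  # assumption: geometry serialized as WKT text
}


def schema_from_dtypes(column_info):
    """Build a Table Schema style descriptor from (name, dtype_name) pairs.

    Unknown dtype names fall back to the generic "any" type.
    """
    return {
        "fields": [
            {"name": name, "type": DTYPE_TO_FIELD_TYPE.get(dtype_name, "any")}
            for name, dtype_name in column_info
        ]
    }


# Column names are from the traffic light example; the dtype names are assumed.
example = schema_from_dtypes(
    [
        ("FID", "int64"),
        ("NR", "int64"),
        ("BEZEICHNUN", "object"),
        ("geometry", "geometry"),
    ]
)
print(example)
```

Falling back to `any` keeps the sketch from failing on dtypes it does not know, at the cost of a less informative schema for those columns.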