pandapower
Why not using GeoPandas?
I was using GeoPandas for some geospatial analysis and I liked it a lot! I was wondering why pandapower is based on pure pandas and not (also) on GeoPandas. Instead of having extra DataFrames for geocoordinates (like net.bus_geodata), a GeoDataFrame could contain both the geocoordinates and the technical data!
We also use geopandas for our work. I believe we wanted to minimize dependencies, which is why we didn't include geopandas: it is not easy to install on Windows. At the time it wasn't even installable through Anaconda, which made it very difficult to get. Now that it is installable through Anaconda, we could think about including it, at least as an option...
In case you want to create geopandas.GeoSeries from bus_geodata:
import pygeos
import geopandas as gpd
import numpy as np
def bus_geoseries(net):
    # stack x and y into an (n, 2) array of point coordinates
    points = pygeos.points(np.array([net.bus_geodata.x.values, net.bus_geodata.y.values]).T)
    return gpd.GeoSeries(points, index=net.bus_geodata.index.values).rename('geometry')
From line geodata:
import shapely
import geopandas as gpd
def line_geoseries(net):
    # each entry of line_geodata.coords is a list of (x, y) pairs
    lines = [shapely.geometry.LineString(coords) for coords in net.line_geodata.coords.values]
    return gpd.GeoSeries(lines, index=net.line_geodata.index.values).rename('geometry')
In case you only have bus_geodata, and not line_geodata, you can create linestrings by connecting the points:
import geopandas as gpd
import pandas as pd
import numpy as np
import pygeos
def line_geoseries(net):
    # look up the coordinates of the start and end bus of every line
    from_point = pd.merge(net.line.from_bus, net.bus_geodata, left_on='from_bus', right_index=True, how='left')[['x', 'y']]
    to_point = pd.merge(net.line.to_bus, net.bus_geodata, left_on='to_bus', right_index=True, how='left')[['x', 'y']]
    # one row per line, two columns: start and end coordinate
    x = np.array([from_point.x.values, to_point.x.values]).T
    y = np.array([from_point.y.values, to_point.y.values]).T
    lines = pygeos.linestrings(x, y)
    return gpd.GeoSeries(lines, index=from_point.index.values).rename('geometry')
@jkisse and I recently talked about the suboptimal fact that several tests in test_auxiliary.py are skipped because geopandas is not included as a dependency -> one more point in favor of adding geopandas to the requirements.
Actually, we once planned to remove the tables "bus_geodata" and "line_geodata" entirely.
Instead, we would introduce a "geo" column in bus and line (and maybe other elements, too). This geo column would be of type object (string) and contain GeoJSON objects.
If needed, helper functions would translate to and from GeoDataFrames.
Advantages:
- easily serializable
- dependency on geopandas only if you want to use geometric operations
- elements can be represented as any geographic object
- the geojson standard defines the order of lat lon
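To make the proposed "geo" column more concrete, here is a minimal sketch that uses only the standard library. It is not the Fraunhofer implementation mentioned in this thread; the helper names point_to_geojson, line_to_geojson and geojson_to_coords are hypothetical, and the coordinates are made-up sample values.

```python
import json


def point_to_geojson(lon, lat):
    """Serialize a bus position as a GeoJSON Point string (longitude first)."""
    return json.dumps({"type": "Point", "coordinates": [lon, lat]})


def line_to_geojson(coords):
    """Serialize a sequence of (lon, lat) pairs as a GeoJSON LineString string."""
    return json.dumps({"type": "LineString", "coordinates": [list(c) for c in coords]})


def geojson_to_coords(geo):
    """Parse a GeoJSON string back into (geometry type, coordinates)."""
    obj = json.loads(geo)
    return obj["type"], obj["coordinates"]


# Values like these could be stored as plain strings in a "geo" column.
bus_geo = point_to_geojson(7.85, 48.0)
line_geo = line_to_geojson([(7.85, 48.0), (7.90, 48.05)])
```

Because the stored values are plain strings, they serialize like any other object column, and geopandas would only be needed once the strings are turned into geometries.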
The implementation of this already exists here at Fraunhofer (we use this already in some projects - and it works fine).
We should come up with a plan for the transition into pandapower. I think the reason this is not done yet is mainly because it touches a lot of code.
I've added geopandas to the GitHub Actions build (manually) (test logs here). Now, fewer tests are skipped and it works fine, so I will merge it soon. Apparently there are some issues with Python 3.10 which might be connected to geopandas.
I'd suggest that we don't make geopandas a "hard" requirement but rather introduce a new extra requirement (e.g. "all" -> pip install pandapower[all]) which would include the comprehensive dependency list, including geopandas and all the plotting & testing dependencies.
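As a rough sketch of how such an extra could be assembled: the group names other than "all" and the package lists below are assumptions for illustration, not the project's actual requirements. The resulting dictionary is the kind of value that setuptools accepts as extras_require.

```python
# Hypothetical optional dependency groups; the real lists would live in setup.py.
extras = {
    "plotting": ["matplotlib", "plotly"],
    "test": ["pytest"],
    "geo": ["geopandas"],
}

# "all" is the union of every optional group, so geopandas stays optional
# for a plain `pip install pandapower`.
extras["all"] = sorted({pkg for group in extras.values() for pkg in group})

# The dictionary would then be passed on as setup(..., extras_require=extras).
```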
Like Leon said, we decided to avoid geopandas as a dependency because it can be a pain to install the whole necessary stack,
so please do not include it as a (hard) dependency.
So as I see it now, it would be best to remove bus_geodata and line_geodata and introduce GeoJSON (first value is longitude, second value is latitude) as the new standard. Convenience functions would allow conversion to GeoDataFrames and should be open source.
Necessary steps:
- change all plotting functions (matplotlib and plotly based) to the new standard
- publish convenience functions for converting into GeoDataFrames
- adapt convert_format so that older networks can still be read
- write tutorials for plotting with geojson
- adapt tests to the new standard
- adapt documentation and communicate the new standard
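The conversion step for older networks might look roughly like the sketch below. It uses plain dictionaries as simplified stand-ins for the bus_geodata and line_geodata tables, the function name migrate_geodata is hypothetical, and it assumes that the legacy x value is the longitude and y the latitude, which would have to be checked for each network.

```python
import json


def migrate_geodata(bus_geodata, line_geodata):
    """Turn legacy coordinate tables into per-element GeoJSON strings.

    bus_geodata:  {bus index: (x, y)}
    line_geodata: {line index: [(x, y), ...]}
    Assumption: x is longitude and y is latitude.
    """
    bus_geo = {
        idx: json.dumps({"type": "Point", "coordinates": [x, y]})
        for idx, (x, y) in bus_geodata.items()
    }
    line_geo = {
        idx: json.dumps({"type": "LineString", "coordinates": [list(c) for c in coords]})
        for idx, coords in line_geodata.items()
    }
    return bus_geo, line_geo


# Made-up sample data: two buses and one line connecting them.
bus_geo, line_geo = migrate_geodata(
    {0: (7.85, 48.0), 1: (7.90, 48.05)},
    {0: [(7.85, 48.0), (7.90, 48.05)]},
)
```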
I know this is a rather old issue, but just in case someone comes across this and would like to know how to convert geodata to GIS:
To answer the first question in a more compact way, there are functions to convert geodata to geopandas GeoDataFrames and back (example code, not tested, but it should be correct).
import pandapower.networks
import pandapower.plotting.geo as geo
net = pandapower.networks.mv_oberrhein()
geo.convert_geodata_to_gis(net)
# now net.line_geodata and net.bus_geodata are GeoDataFrames
geo.convert_gis_to_geodata(net)
# now they have been converted back
On the topic of GeoJSON: there is currently a PR (#1731) that supports exporting as GeoJSON. It adds all attributes of the 'bus' and 'line' tables to the properties of the GeoJSON features. This is useful for visualizing/editing pandapower networks in QGIS (see pandapower-qgis). It may be possible to use this as a basis for a converter to update old networks if pandapower changes the way it handles geodata.
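For readers unfamiliar with the format, the idea of carrying table attributes in the feature properties can be sketched as follows. This is not the code from PR #1731; the function names bus_feature and feature_collection are hypothetical and the bus values are made-up samples.

```python
import json


def bus_feature(index, lon, lat, attributes):
    """Build one GeoJSON Feature; the table attributes go into "properties"."""
    return {
        "type": "Feature",
        "id": index,
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": dict(attributes),
    }


def feature_collection(features):
    """Wrap features into a FeatureCollection and serialize it to text."""
    return json.dumps({"type": "FeatureCollection", "features": list(features)})


geojson_text = feature_collection([
    bus_feature(0, 7.85, 48.0, {"name": "Bus 0", "vn_kv": 20.0}),
])
```

A file with this content can be opened as a vector layer in GIS tools, with each property showing up as an attribute of the feature.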
I am closing this issue thanks to the helpful comment from @KS-HTK and the merged PR #1731.