unexpected behavior when using bounding boxes

Open rohanaras opened this issue 2 years ago • 0 comments

Describe the bug Following this pandana example with an OSM reader object created using a bounding box raises a ValueError on the osm.to_graph step.

To Reproduce Steps to reproduce the behavior:

from pyrosm import OSM, get_data

# bbox is defined beforehand; see the note below the snippet
fp = get_data("washington", directory="../data/pyrosm")
print("Data was downloaded to:", fp)
osm = OSM(fp, bounding_box=bbox)

nodes, edges = osm.get_network("walking", nodes=True)

G = osm.to_graph(nodes, edges, graph_type="pandana")

where bbox is a bounding box that cuts through an edge.

Expected behavior The osm.to_graph function should produce a graph without additional tinkering on either the nodes or edges dataframes.

Environment:

  • OS: macOS 11.1
  • Python package source: PyPI
  • Python 3.8.13
  • pyrosm==0.6.1

Additional context Here's the error log from the code

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/Users/rohanaras/Documents/repos/.../scripts/pyrosm_test.ipynb Cell 8' in <cell line: 1>()
----> 1 G = osm.to_graph(nodes, edges, graph_type="pandana")

File ~/.local/share/virtualenvs/.../lib/python3.8/site-packages/pyrosm/pyrosm.py:822, in OSM.to_graph(self, nodes, edges, graph_type, direction, from_id_col, to_id_col, edge_id_col, node_id_col, force_bidirectional, network_type, retain_all, osmnx_compatible, pandana_weights)
    808     return to_networkx(
    809         nodes,
    810         edges,
   (...)
    819         osmnx_compatible,
    820     )
    821 elif graph_type == "pandana":
--> 822     return to_pandana(
    823         nodes,
    824         edges,
    825         direction,
    826         from_id_col,
    827         to_id_col,
    828         node_id_col,
    829         force_bidirectional,
    830         network_type,
    831         retain_all,
    832         pandana_weights,
    833     )

File ~/.local/share/virtualenvs/.../lib/python3.8/site-packages/pyrosm/graphs.py:305, in to_pandana(nodes, edges, direction, from_id_col, to_id_col, node_id_col, force_bidirectional, network_type, retain_all, weight_cols)
    302 nodes = nodes.set_index("id", drop=False)
    303 nodes = nodes.rename_axis([None])
--> 305 return _create_pdgraph(nodes, edges, from_id_col, to_id_col, weight_cols)

File ~/.local/share/virtualenvs/.../lib/python3.8/site-packages/pyrosm/graph_export.pyx:168, in pyrosm.graph_export._create_pdgraph()

File ~/.local/share/virtualenvs/.../lib/python3.8/site-packages/pyrosm/graph_export.pyx:181, in pyrosm.graph_export._create_pdgraph()

File ~/.local/share/virtualenvs/.../lib/python3.8/site-packages/pandana/network.py:93, in Network.__init__(self, node_x, node_y, edge_from, edge_to, edge_weights, twoway)
     87 self.node_idx = pd.Series(np.arange(len(nodes_df), dtype="int"),
     88                           index=nodes_df.index)
     90 edges = pd.concat([self._node_indexes(edges_df["from"]),
     91                    self._node_indexes(edges_df["to"])], axis=1)
---> 93 self.net = cyaccess(self.node_idx.values,
     94                     nodes_df.astype('double').values,
     95                     edges.values,
     96                     edges_df[edge_weights.columns].transpose()
     97                                                   .astype('double')
     98                                                   .values,
     99                     twoway)
    101 self._twoway = twoway
    103 self.kdtree = KDTree(nodes_df.values)

File src/cyaccess.pyx:61, in pandana.cyaccess.cyaccess.__cinit__()

ValueError: Buffer dtype mismatch, expected 'long' but got 'double'

After reading the relevant issues in the pandana project (see here), I figured out that this error occurs because nodes.id does not contain all of the ids that appear in edges.u or edges.v. Plotting the affected edges suggests that this is likely due to how the bounding box is applied when the OSM object is created. The following code:

missing_nodes = (set(edges.u) | set(edges.v)) - set(nodes.id)
edges.loc[(edges.u.isin(missing_nodes)) | (edges.v.isin(missing_nodes)), :].plot()

produces: [image: output]

Adding a line of code to clean up the edges gdf fixes the problem, of course, but it would be nice to not have to remember to do that.

I imagine that this could be fixed by adding the following line of code around here:

edges = edges.loc[edges.u.isin(nodes.id) & edges.v.isin(nodes.id)]
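For anyone hitting this in the meantime, here is a minimal sketch of that filter as a user-side workaround, run on toy data so it works without pyrosm or pandana installed. The helper name drop_dangling_edges and the toy frames are made up for illustration; only the column names (id, u, v) come from the frames shown above, and this is not pyrosm's own code.

```python
import pandas as pd


def drop_dangling_edges(nodes, edges, node_id_col="id", from_col="u", to_col="v"):
    """Keep only the edges whose two endpoints both appear in the nodes table.

    Hypothetical helper for illustration; not part of pyrosm.
    """
    known_ids = set(nodes[node_id_col])
    keep = edges[from_col].isin(known_ids) & edges[to_col].isin(known_ids)
    return edges.loc[keep].copy()


# Toy stand-ins for the frames returned by osm.get_network(..., nodes=True).
# Node 4 is absent, as if it had been cut off by the bounding box.
nodes = pd.DataFrame({"id": [1, 2, 3]})
edges = pd.DataFrame(
    {"u": [1, 2, 3], "v": [2, 3, 4], "length": [10.0, 12.5, 7.0]}
)

clean_edges = drop_dangling_edges(nodes, edges)

# Same check as in the diagnostic snippet above: nothing should be missing now.
missing_nodes = (set(clean_edges.u) | set(clean_edges.v)) - set(nodes.id)
```

With the real data, the cleaned edges frame would be passed to osm.to_graph in place of the original one. Note that this drops the edges crossing the bounding box entirely rather than clipping them.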

Some alternate ideas:

  • pyrosm.frames.prepare_geodataframe could be modified to include the "lost" nodes or exclude the "extra" edges
  • A note on this behavior could be included in the documentation for pyrosm.pyrosm.OSM.get_network and pyrosm.networks.get_network_data

rohanaras, Jun 20 '22 23:06