pandana
pandana copied to clipboard
[Question] Find nearest POI to POI from another category
Hello - Thanks for the great work on this package!
I just spent a couple of hours digging through the code of pandana to find an answer to this question, but couldn't find any solutions. I'd appreciate any help!
The problem that I have is what follow: I have a number of buildings (node POIs), and a number of schools (node POIs). I want to find the nearest school to each building, using the network from OSM.
It's possible to accomplish this through a nearest neighbour look-ups using scipy's KDTree
or scikit-learn's BallTree
, but those provide euclidian or great circle distances, and what I want is the network distance. Is there a suggested way to go about this using pandana?
In my digging, I found the find_all_nearest_pois
function, but that gives me the nearest POI for the network's nodes, not the nodes from a category that I specify.
Many thanks!
Hi @majdal,
My guess is that the brute force approach of calculating the distance from every building to every school -- and then selecting the closest one -- would actually work pretty well here.
Since the recent v0.5 update, point-to-point network distances are super fast to calculate. Section 2 of this notebook has some code examples: Pandana-demo.ipynb
It's on our wish list to expose direct functionality like you're describing, but right now I think it's buried deep inside the graph traversal algorithms.
Something here might be helpful too: shortest-paths.ipynb. This notebook uses NetworkX instead of Pandana to calculate distances, which guarantees you're getting exactly the shortest path but is much slower. The last cell has an example of selecting the closest destination for each origin.
One final suggestion -- if there are so many buildings (millions, say) that the pure brute-force approach is too slow, you could add a preprocessing step where you use euclidean distances to identify a handful of candidate schools for each building, and then use Pandana to figure out which has the shortest network distance.
Hope this helps, and best of luck!
Actually, it might be easier and/or faster to just calculate the closest school to each network node, and then match the buildings that you're interested in to the relevant nodes.
The function you linked to is actually in the underlying C++ code -- here's the corresponding Python function: http://udst.github.io/pandana/network.html#pandana.network.Network.nearest_pois
But find_all_nearest_pois https://github.com/UDST/pandana/blob/dev/src/cyaccess.pyx#L101 gives you the distance to one of your categories (say schools) for all nodes in the network, so you have the entire surface of nearest POIs for the whole space of the network. So you map the buildings to nodes (using get_node_ids) and just do a loc on the DataFrame returned by find_all_nearest_pois right?
I don't think the method Sam mentions gives you any more precision since it will still be node-to-node paths. The advantage of Sam's method is you will get the paths not just which school it is and the distance. Like Sam says, I'd just run shortest path queries between buildings and schools if you need the paths.
On Tue, Sep 8, 2020 at 10:31 AM Sam Maurer [email protected] wrote:
Hi @majdal https://github.com/majdal,
My guess is that the brute force approach of calculating the distance from every building to every school -- and then selecting the closest one -- would actually work pretty well here.
Since the recent v0.5 update, point-to-point network distances are super fast to calculate. Section 2 of this notebook has some code examples: Pandana-demo.ipynb https://github.com/UDST/pandana/blob/dev/examples/Pandana-demo.ipynb
It's on our wish list to expose direct functionality like you're describing, but right now I think it's buried deep inside the graph traversal algorithms.
Something here might be helpful too: shortest-paths.ipynb https://github.com/smmaurer/cp255/blob/master/07-street-networks/shortest-paths.ipynb. This notebook uses NetworkX instead of Pandana to calculate distances, which guarantees you're getting exactly the shortest path but is much slower. The last cell has an example of selecting the closest destination for each origin.
One final suggestion -- if there are so many buildings (millions, say) that the pure brute-force approach is too slow, you could add a preprocessing step where you use euclidean distances to identify a handful of candidate schools for each building, and then use Pandana to figure out which has the shortest network distance.
Hope this helps, and best of luck!
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/UDST/pandana/issues/146#issuecomment-689028101, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAOHNICXJNSXNZWC2R2SKMTSEZS6HANCNFSM4Q7KPQRQ .
Thanks @smmaurer and @fscottfoti for your very quick replies!
I like Sam's suggestion, of finding the nearest school to each node, then matching the buildings with nodes. I'm still not sure what would be the best way of connecting the building to a node.
One (awkward) way of doing so would be to also run the nearest_pois
but with category = buildings
, then joining the two two dataframes by node, and removing duplicate buildings. But this seems overly cumbersome.
@majdal
Hey friend, I got a similar problem and solved using OSMnx to get the "real distances" based on the street network and got the nearest using KDtree.
I had to find the nearest schools to a set of polygons centroids. To understand better check this notebook maps .
basically the algorithm will stretch the graph of the urban network.
nA = np.array(list(zip(polygon.geometry.x, polygon.geometry.y)) )
nB = np.array(list(zip(schools_preEscolar.geometry.x, schools_preEscolar.geometry.y)) )
btree = cKDTree(nB)
dist, idx = btree.query(nA, k=1)
polygon['place'] = idx
polygon['dist'] = dist
Hello. Is this feature on a development roadmap, please?