rustworkx icon indicating copy to clipboard operation
rustworkx copied to clipboard

Dataframe output of all_pairs_dijkstra_shortest_paths?

Open thomasaarholt opened this issue 1 year ago • 1 comments

What is the expected enhancement?

I experience that it is very slow to convert the dictionary returned by all_pairs_dijkstra_shortest_paths on a large graph into a dataframe. Unnesting the dict into three lists (of 1.6 Million rows) works well, but when I read this into a dataframe as follows, it takes an extremely long time and can crash my system.

import polars as pl
mapping = all_pairs_dijkstra_shortest_paths(...)
left, right, path = nested_dict_to_df(mapping)
df_distances = pl.DataFrame(
    [
        pl.Series(name="left", values=left, dtype=pl.Int32),
        pl.Series(name="right", values=right, dtype=pl.Int32),
        pl.Series(name="path", values=path, dtype=pl.List(Float64)),
    ]
)

I believe this is due to the dictionary entries not being contiguous in memory, or some similar effect. It would be very nice to be able to export rustworkx output directly to a dataframe for further processing.

thomasaarholt avatar Feb 27 '24 15:02 thomasaarholt

I think this is aligned with #1033, I don't think it will be easy to implement but it is our most requested integration.

In the meantime, maybe you can use .explode() with .apply() to see if it performs slightly better? Something along these lines of Pandas code to avoid Python loops which are known to be slow. It might be able to be ported to Polars as well.

import rustworkx as rx
import pandas as pd

graph = rx.generators.heavy_square_graph(5)
all_paths = rx.all_pairs_dijkstra_shortest_paths(graph, lambda _: 1.0)

df = pd.DataFrame(
    {
        "U": all_paths.keys(),
        "V": all_paths.values(),
    }
)

df = df.explode("V")
df["U->V"] = df.apply(lambda x: all_paths[x["U"]][x["V"]], axis=1)

If you print(df.tail()):

     U   V                                               U->V
64  64   1  (64, 44, 60, 38, 56, 13, 55, 12, 54, 32, 50, 2...
64  64   5  (64, 44, 60, 18, 59, 17, 58, 16, 57, 35, 53, 1...
64  64  25  (64, 44, 60, 38, 56, 13, 55, 12, 54, 32, 50, 6...
64  64  45  (64, 44, 60, 38, 56, 13, 55, 12, 54, 32, 50, 2...
64  64   0  (64, 44, 60, 38, 56, 13, 55, 12, 54, 32, 50, 2...

IvanIsCoding avatar Feb 28 '24 03:02 IvanIsCoding