LightGraphs.jl icon indicating copy to clipboard operation
LightGraphs.jl copied to clipboard

Performance of diameter

Open bkamins opened this issue 5 years ago • 8 comments

I have a gh graph on 37700 vertices and 289003 edges.

Calculation of its diameter like this:

maximum(maximum(gdistances(gh, i)) for i in vertices(gh))

takes around 250 seconds (probably this is not an optimal algorithm for the task but at least it works).

However, when I run diameter(gh) the process takes so long that I did not wait till it finished.

bkamins avatar Jan 23 '21 14:01 bkamins

This is not a bug but performance issue, but a bug label was automatically added.

bkamins avatar Jan 23 '21 14:01 bkamins

I think this might be, because diameter works for the general case where an additional distance matrix is used. Internally it then uses dijkstra_shortest_paths instead of gdistances. Maybe we should have a special case for when no distance matrix is used.

simonschoelly avatar Jan 23 '21 21:01 simonschoelly

In general using gdistances is also not optimal I think. igraph is much faster on this graph in calculation of diameter. I think that in general either e.g. Floyd-Warshall or Johnson's algorithm should be used (depending on the sparsity of the graph), but unfortunately currently I have too much work to get DataFrames.jl to 1.0 release to dig into LightGraphs.jl sources - sorry for this. (or maybe there are some special algorithms that are custom made for diameter - I have not checked)

bkamins avatar Jan 23 '21 21:01 bkamins

I am not sure if Floyd-Warshall is much more performant unless one finds a way to avoid allocating the temporary |V| x |V| distance matrix. Best would probably to just do some benchmark with different algorithms & graphs, if someone got time for that.

There might also be the possiblity for a parallel diameter algorithm

simonschoelly avatar Jan 24 '21 13:01 simonschoelly

All-source bfs should be faster than FW for unweighted graphs. For weighted I think dijkstra will still beat FW but tests should confirm.

And we should definitely specialize distance metrics for unweighted.

sbromberger avatar Jan 24 '21 14:01 sbromberger

igraph does BFS. F-W allocates but if graph is dense this should not be a problem as then |V| cannot be large anyway.

bkamins avatar Jan 24 '21 16:01 bkamins

Would something like https://who.rocq.inria.fr/Laurent.Viennot/road/papers/ifub.pdf be overkill for our purposes?

Also, when you talk about distance metrics to specialize, what do you have in mind besides the diameter?

gdalle avatar Feb 20 '21 11:02 gdalle

Also, when you talk about distance metrics to specialize, what do you have in mind besides the diameter?

Sorry for the delayed response. In general, if you have unweighted graphs, where we're currently using Dijkstra, we can simply substitute BFS. There's no need to track weights in an unweighted graph, and BFS is more performant specifically because we don't need to do these comparisons at each traversal.

sbromberger avatar Jun 04 '21 11:06 sbromberger