Graphs.jl
Graphs.jl copied to clipboard
Working with edge indices
In many settings, it is useful to have direct access to the index of an edge (u,v), i.e. its rank in the iterator edges(g).
For mutable graphs this is complicated to maintain, but when graphs no longer evolve it can bring significant speedups. Anyway, I'm opening the discussion with Graphs.jl 2.0 in mind
Maybe this could also be implemented without breaking changes, by having a function that calculates the edge rank for an edge. So maybe we do not need to wait for v2.0. Multigraphs might need some separate consideration.
Indeed for SimpleGraphs it is reasonably easy to do.
My only fear is that if we add this to the API and start using it elsewhere (for instance in algorithms), we will need a fallback for custom graph types. Unfortunately, the only fallback I can think of is something like
function edge_rank(g, s, d)
for (k, e) in enumerate(edges(g))
if src(e) == s && dst(e) == d
return k
end
end
end
which is unbearably slow.
I think that we should avoid working with edge indices as much as possible. They are not inherent to the graph and should be treated as an implementation detail. Any graph algorithm should produce the same result under permutation of these numbers. I take this property “invariant under permutation of the edge list” as a defining characteristic of a graph algorithm.
Giving users access to this information will lead to code that relies on that ordering, which prohibits a large amount of flexibility in implementation and performance optimization. Lazily defined graphs and distributed graphs get a lot harder if you have this additional constraint.
But sometimes you do need that ordering. For instance, in my current research, I use edge weights that are computed by a neural network. That means they are naturally given as a vector rather than a sparse matrix. The same goes for flow capacity constraints or costs.
I'm not saying every Graphs.jl algorithm should use this functionality, maybe none should. But it would be nice to expose it.
I agree with @jpfairbanks, there is not so much use to add it in the API. In most cases, users can just do collect(edges(g)), and build themselves the indices. Moreover, in this proposition of @simonschoelly, isless would give a lexicographical ordering for edges (which is of more general usefulness), and hash would allow to index with edges in a dictionary, which basically do what you propose. Even if its less performant than integer indices, its generally easy to implement, even for lazy graphs.
The IndexedGraphs package looks relevant here :smile:
EDIT: here's the link https://github.com/stecrotti/IndexedGraphs.jl
Closing this, the upcoming GraphsBase.jl will alleviate such needs