ocamlgraph icon indicating copy to clipboard operation
ocamlgraph copied to clipboard

Please add an algorithm to calculate the condensation of a directed cyclic graph

Open josch opened this issue 9 years ago • 0 comments

It would be nice if ocamlgraph could provide an implementation that calculates the condensation of a cyclic graph. This would allow representing all strongly connected components of a graph as a single vertex and would make the graph acyclic. A callback function could make this generic.

I would prepare a pull request but I'm unsure which algorithm best fits the datastructures used by ocamlgraph.

I tried the following two approaches which do not seem to differ significantly in their speed. They are non-generic and I only want to use them to show the approaches I have tried. My graph has two vertex kinds, PkgV.Pkg and PkgV.SCC where the former only contains a single item and the latter a set of them.

Number 1:

List.iter (function | [] | [_] -> () | scc ->
    let pset = List.fold_left (fun acc v -> match v with
        | PkgV.SCC _ -> fatal "cannot condense graph with SSC vertices"
        | PkgV.Pkg p -> Set.add p acc) Set.empty scc
    in
    let sccv = PkgV.SCC pset in
    let vmemofscc = function
      | PkgV.SCC _ -> false
      | PkgV.Pkg p -> Set.mem p pset
    in
    List.iter (fun v ->
        G.iter_pred (fun pred ->
          G.remove_edge g pred v;
          if not (vmemofscc pred) then G.add_edge g pred sccv) g v;
        G.iter_succ (fun succ ->
          G.remove_edge g v succ;
          if not (vmemofscc succ) then G.add_edge g sccv succ) g v;
        G.remove_vertex g v;
      ) scc;
) (C.scc_list g);
g

Number 2:

let to_scc_vert = function
  | [] -> fatal "scc cannot be empty"
  | [v] -> v
  | scc ->
    let pset = List.fold_left (fun acc v -> match v with
        | PkgV.SCC _ -> fatal "cannot condense graph with SSC vertices"
        | PkgV.Pkg p -> Set.add p acc) Set.empty scc
    in PkgV.SCC pset
in
let sccs = Array.of_list (List.map to_scc_vert (C.scc_list g)) in
let cg = G.create () in
let mapping = Hashtbl.create (G.nb_vertex g) in
Array.iteri (fun i v ->
    match v with
    | PkgV.Pkg _ -> Hashtbl.add mapping v i
    | PkgV.SCC s -> Set.iter (fun p -> Hashtbl.add mapping (PkgV.Pkg p) i) s
  ) sccs;
G.iter_edges (fun v1 v2 ->
  let scc1 = Hashtbl.find mapping v1 in
  let scc2 = Hashtbl.find mapping v2 in
  if scc1 <> scc2 then
    G.add_edge cg sccs.(scc1) sccs.(scc2)
) g;
cg

Using my input graphs (Debian dependency graphs with 51674 vertices and 286301 edges) the first algorithm takes about 43 seconds to execute and the second takes about 38 seconds.

This seems a bit much because I'm building this graph in only 17 seconds and I'm traversing all its vertices in only 6 seconds. Computing C.scc_list only takes 2 seconds so the rest is spent in either moving edges around (in the first version) or building a new graph from a mapping (in the second version).

I wonder if there is an algorithm that would perform better?

josch avatar Apr 09 '15 11:04 josch