graphology icon indicating copy to clipboard operation
graphology copied to clipboard

Enable topological traversal of a DAG starting with any node

Open dmurvihill opened this issue 1 year ago • 8 comments

I'm implementing a priority queue with lazy comparisons. To keep my DAG balanced, I'll be comparing the root nodes that have the lowest height. Want to get your thoughts before I start coding.

Does this look like a good outline to you?

dmurvihill avatar Feb 19 '24 21:02 dmurvihill

@dmurvihill I might be mistaken, but isn't height of the DAG (or of a node in the DAG) and a DAG generation the same concept? In which case, you can leverage the generation as exposed by the forEachNodeInTopologicalOrder function's callback?

Yomguithereal avatar Feb 19 '24 21:02 Yomguithereal

Yes, I'll probably use code from there. However, my use case requires operating on subgraphs so I will need to expose a helper that accepts a root node as an argument.

dmurvihill avatar Feb 20 '24 03:02 dmurvihill

Added the change to forEachTopologicalOrder. I don't particularly like adding the root argument after the callback, but it's the only backwards-compatible way I could think of to do it.

It might seem somewhat wasteful to scan the entire DAG before iterating over a potentially much smaller subgraph, but I chose to preserve the cycle-detection logic rather than performance.

dmurvihill avatar Feb 20 '24 05:02 dmurvihill

Just to be sure I understand here: you are trying to find two DAGs in a DAG forest having the lowest height and being represented by an arbitrary root node?

Yomguithereal avatar Feb 20 '24 12:02 Yomguithereal

I'm trying to find the two lowest root nodes in the DAG forest. They might have descendants in common.

dmurvihill avatar Feb 20 '24 19:02 dmurvihill

Here's my implementation. For partial DAG traversals the neighbor in-degree has to be tracked lazily, which makes cycle detection impossible and increases asymptotic complexity. For those reasons, I kept the existing behavior for full-graph searches.

dmurvihill avatar Feb 20 '24 20:02 dmurvihill

Ok, sorry I am a little slow to understand your case here. If you need to be able to specify a node from which to start the traversal, instead of finding all roots, what do you think of graphology-traversal design. Check the code for the BFS for instance here. You have an "abstract" version of the function to avoid duplication then both specialized bfs and bfsFromNode functions are exported.

I agree that forEachNodeInTopologicalOrderFromRoot is a bit of a mouthful however. Will you raise if given node hasn't a in degree of 0? Or do you intend to allow the traversal to start from any arbitrary node?

Yomguithereal avatar Feb 20 '24 20:02 Yomguithereal

BFS is close, but because it doesn't revisit nodes it will give the shortest path to a leaf whereas I need the longest path. It passes all of the tests in my test suite except this one:

function height(graph, node) {
  let maxDepth = 0;
  (
  node === undefined
  ? bfs.bind(null, graph)
    : bfsFromNode.bind(null, graph, node)
)(
    (_, __, depth) => {
      maxDepth = depth + 1 > maxDepth ? depth + 1 : maxDepth;
    }
  )
  return maxDepth;
}

describe('height')(() => {
    it('should use the greater height for skip-level generations', () => {
      const graph = new DirectedGraph();
      graph.mergeEdge(0, 1);
      graph.mergeEdge(1, 2);
      graph.mergeEdge(0, 2);
      assert.deepStrictEqual(height(graph), 3);
    });
});

However, the change I just pushed to this branch ensures that topologicalGenerations(graph, node).length can be used to find the height.

This algorithm ignores in-neighbors that are not elements of the target subgraph (otherwise it would only work on trees). That's also the reason cycle detection is impossible; there might be a cycle that reaches the ignored in-edge.

Let me know whether you feel this code is appropriate to merge, and whether you'd like to see changes first.

dmurvihill avatar Feb 20 '24 21:02 dmurvihill