Use a formal Topological Sort algorithm for the audio dependency graph
Resolving the audio dependency graph into a static, ordered visit-list is fundamentally a dependency resolution challenge with some interesting audio-specific nuances.
Constraints
The key constraints are:
- Nodes have dependencies
- Some nodes can have self-recursive connections, where the node's output becomes its own a parent node's input on the next frame
- We want a static, deterministic order of node processing
- Every node must have all its input dependencies resolved before it can process
Goal
The goal is a linearized list where:
- Every node appears exactly once
- All node dependencies are satisfied before the node is processed
- Circular/recursive dependencies are handled gracefully
Approach
The most natural approach here is to use a Topological Sort algorithm, specifically adapted for handling potential recursive dependencies. A classic Depth-First Search (DFS)-based topological sort won't work perfectly because of the self-referential connections.
Some things to keep in mind:
- Treat self-recursive connections as a special case of circular dependency
- Potentially introduce "delay" or "state" nodes that can break pure cyclic dependencies
- Ensure that nodes with circular dependencies are processed in a way that allows reasonable signal propagation
Possible solution
One potential algorithm structure might look like:
- Start with nodes that have no incoming dependencies
- Progressively mark nodes as "visited" or "processed"
- For nodes with self-recursion, ensure their state can be initialized and that they have a well-defined first-frame behavior
- Potentially use a "staging" mechanism where nodes with circular dependencies get special processing order
In the end, circular dependencies might best be flattened by having two separate buffers for every node being rendered ("previous" and "current"), where current is moved to previous at the start of rendering, and a new, empty buffer to be written to is allocated and set in its place. This will of course lead to latency or feedback, but since it's impossible to have true zero-delay circular feedback, this is likely inevitable.
Was brainstorming how this problem relates to other domains. ChatGPT's response seems helpful for investigating more general solutions to this problem than a pure, topological sort algorithm. Putting the output here for future consideration:
This problem—finding a traversal order through a directed, cyclic graph where each node can depend on outputs from other nodes—is a variation of dataflow execution, and resembles issues in compilers, signal processing, spreadsheet evaluation, and reactive systems. Below are several methods and perspectives to help you tackle this problem, including compromises and domain analogies.
🧠 Problem Restatement
You're given a directed graph:
-
Nodes represent computation units.
-
Edges represent data dependencies (output to input).
-
Cycles are allowed.
The goal is to determine an order of execution such that:
-
A node is run only after (or assuming
Noneif in a cycle) its dependencies are computed. -
On the first run, cycles are broken by allowing
Nonefor missing inputs, but on subsequent runs, outputs should be available.
🔧 Possible Approaches
1. Topological Sort with Cycle Detection + Relaxation
-
Normally, topological sort is used to order nodes in a DAG based on dependencies.
-
Since you have cycles, a pure topological sort fails.
Modified Strategy:
-
Identify strongly connected components (SCCs) using Tarjan's or Kosaraju’s algorithm.
-
Treat each SCC as a "super node".
-
If SCC is acyclic (a single node or linear), it can be topologically ordered.
-
If SCC is cyclic, all nodes in that component are mutually dependent.
-
-
Run SCCs in topological order.
-
Within a cyclic SCC, pick an arbitrary order, or iterate until outputs stabilize (fixed-point execution).
Domain Analogy: Compilers (e.g., dependency resolution in build systems like make), static analysis in dataflow graphs.
Pros:
-
Clean separation between acyclic and cyclic parts.
-
Efficient and grounded in theory.
Cons:
-
Cycles may introduce non-determinism unless managed carefully.
2. Dataflow Scheduling / Worklist Algorithm
-
Maintain a work queue of nodes that are "ready" to run (i.e., their inputs are available).
-
On first run, if some inputs are missing (due to cycles), allow
None. -
Once a node produces output, enqueue its dependents.
-
Repeat until no changes occur (fixed point).
This is similar to how spreadsheets evaluate cells: you can start evaluating independent cells, then re-evaluate dependent cells as data becomes available.
Pros:
-
Natural fit for real-time or lazy evaluation.
-
Can handle asynchronous or dynamic computation.
Cons:
-
Harder to analyze statically.
-
Less predictable order of execution.
3. Fixed-Point Iteration / Stabilization Loop
-
Execute all nodes in some order, allowing
Noneinitially for unresolved inputs. -
After each iteration, check if any outputs changed.
-
Repeat until outputs stabilize.
This is like propagating changes through a circuit or evaluating a spreadsheet with circular references.
Pros:
-
Very general, works even when cycles are complex.
-
Deterministic if node processing is deterministic.
Cons:
-
Potentially inefficient (may require many iterations).
-
Needs change detection to know when to stop.
4. Hybrid Approach: Static Ordering + Runtime Resolution
-
Use best-effort topological sorting, where you:
-
Traverse as much of the graph in dependency order as possible.
-
When encountering cycles, mark them.
-
-
At runtime, execute nodes in the sorted order.
-
For cyclic parts, resolve missing data as
None, then possibly re-run until stable.
-
Pros:
-
Optimized execution for acyclic portions.
-
Still handles cycles gracefully.
Compromise: Accept slightly suboptimal execution order for simpler implementation.
🔄 Compromises / Simplifications
-
Cycle Breaking: You can allow user-specified or heuristically determined "cycle breakers" — inputs that are allowed to be
Noneon the first run. -
One-Pass Approximation: Accept a single-pass execution where some nodes in cycles just get
Nonevalues and require external retriggering. -
Static Execution Order: Precompute a static schedule and rerun all nodes in that order repeatedly.
-
Prioritized Scheduling: Assign priorities to nodes based on some heuristic (e.g., fan-in count), then process in that order.
🔗 Related Domains and Concepts
| Domain | Analogy |
|---|---|
| Compilers | Instruction scheduling, dependency graphs |
| Spreadsheet Engines | Cell dependency and circular reference handling |
| Build Systems (e.g., make) | Dependency trees with cycle detection |
| Functional Reactive Programming (FRP) | Signal graph evaluation with feedback loops |
| Dataflow Programming | Node execution based on input availability |
| Digital Circuits / Simulators | Fixed-point iterations over feedback paths |
| Petri Nets | Token-based models of data readiness |
🧩 Example Workflow
-
Parse graph and build adjacency list.
-
Compute SCCs to identify cycles.
-
Topologically sort SCCs.
-
Execute nodes in:
-
Topological order for acyclic parts.
-
Repeated loop (fixed-point) for cyclic SCCs.
-
✅ Recommendations
If you're prioritizing correctness and determinism, go with the SCC-based + fixed-point approach.
If you're prioritizing simplicity, go with a worklist approach where unresolved inputs are None, and rerun as needed.
If you need performance, precompute as much as possible (topological order + cycle info), and optimize execution using memoization or dirty flags.
Would you like to see a code-level example of one of these strategies?