distributed-dataset
Graph-based execution
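Currently, every task in a stage has to finish before any task in the next stage can start. It would be better to have a proper dependency graph whose nodes are individual tasks, so that a task can be scheduled as soon as the tasks it depends on are done, rather than waiting for the whole preceding stage.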
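To make the difference concrete, here is a minimal sketch in Python that compares the two scheduling strategies on a toy workload. It is not the library's implementation: the task names, durations, dependencies, and both helper functions are hypothetical, and the model assumes an unlimited number of workers and no scheduling overhead.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workload: each task maps to (duration, tasks it depends on).
# Here every "reduce" task depends on only one "map" task.
tasks = {
    "map-0": (1, set()),
    "map-1": (5, set()),
    "reduce-0": (6, {"map-0"}),
    "reduce-1": (2, {"map-1"}),
}
# The same tasks grouped into stages, as in stage-by-stage execution.
stages = [["map-0", "map-1"], ["reduce-0", "reduce-1"]]


def stage_barrier_finish_times(tasks, stages):
    """Every task in a stage starts only after the whole previous stage is done."""
    finish = {}
    stage_start = 0
    for stage in stages:
        for name in stage:
            finish[name] = stage_start + tasks[name][0]
        # The next stage waits for the slowest task of this one.
        stage_start = max(finish[name] for name in stage)
    return finish


def graph_finish_times(tasks):
    """Each task starts as soon as its own dependencies are done."""
    graph = {name: deps for name, (_, deps) in tasks.items()}
    finish = {}
    # static_order() yields each task after all of its dependencies.
    for name in TopologicalSorter(graph).static_order():
        duration, deps = tasks[name]
        start = max((finish[d] for d in deps), default=0)
        finish[name] = start + duration
    return finish


barrier = stage_barrier_finish_times(tasks, stages)
graph = graph_finish_times(tasks)
print("stage barrier:", max(barrier.values()))  # 11
print("task graph:   ", max(graph.values()))    # 7
```

With the stage barrier, `reduce-0` cannot start until the slow `map-1` finishes at time 5, even though it only needs `map-0`, so the job ends at time 11. With task-level dependencies, `reduce-0` starts at time 1 and the job ends at time 7.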