Dagger.jl
Perform staging in the scheduler
Currently, operations on DArrays are staged into thunks before being passed to the scheduler, which discards all information about the structure of the original operation. For some operations, such as a matmul, knowing that structure would let us substitute an optimized implementation or a hardware-specific communication pattern, or even call out to an external library (like Elemental). Users might also want to define their own operations and, together with #166, provide more efficient ways to compute them. In short, we should move operation staging into the scheduler. Once #166 is implemented, we can pass the raw operation to the user's scheduler implementation and let it choose whether to stage to thunks or to do something special.
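
To make the idea concrete, here is a rough, self-contained sketch of what "pass the raw operation to the scheduler" could look like. None of these names exist in Dagger.jl: `RawOp`, `SketchScheduler`, `ThunkScheduler`, `MatmulAwareScheduler`, `stage_to_thunks`, and `execute` are all hypothetical, and a zero-argument closure stands in for a thunk. The only real API used is `mul!` from the `LinearAlgebra` standard library, which here plays the role of an "optimized implementation".

```julia
using LinearAlgebra

# HYPOTHETICAL: an unstaged operation that still remembers which function
# it applies and to which arguments. Not a Dagger.jl type.
struct RawOp{F,A<:Tuple}
    f::F
    args::A
end

# HYPOTHETICAL scheduler hierarchy, for illustration only.
abstract type SketchScheduler end
struct ThunkScheduler <: SketchScheduler end
struct MatmulAwareScheduler <: SketchScheduler end

# Generic staging: wrap the call in a zero-argument closure (a stand-in for a thunk).
stage_to_thunks(op::RawOp) = [() -> op.f(op.args...)]

# Fallback: any scheduler may stage to thunks and run them in order.
function execute(::SketchScheduler, op::RawOp)
    thunks = stage_to_thunks(op)
    results = map(t -> t(), thunks)
    return (path = :thunks, value = last(results))
end

# Specialization: this scheduler sees that the raw operation is `*` and
# substitutes an in-place multiply instead of staging to thunks.
function execute(::MatmulAwareScheduler, op::RawOp{typeof(*)})
    A, B = op.args
    C = similar(A, size(A, 1), size(B, 2))
    mul!(C, A, B)
    return (path = :specialized, value = C)
end

A = [1.0 2.0; 3.0 4.0]
B = [5.0 6.0; 7.0 8.0]

generic = execute(ThunkScheduler(), RawOp(*, (A, B)))
special = execute(MatmulAwareScheduler(), RawOp(*, (A, B)))
other   = execute(MatmulAwareScheduler(), RawOp(+, (A, B)))
```

In this sketch the matmul-aware scheduler takes the specialized path only for `*` and falls back to thunk staging for everything else (such as `+`), which is the kind of choice the scheduler cannot make today because the operation has already been lowered to thunks by the time it arrives.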