gleam
Running function on all executors after completion of flow
Is there a way to register a callback function that runs on all executors in distributed mode? For example, I would like to flush a batched pipeline on all executors after the entire flow has finished. Currently I'm calling `OutputRow` in my flow, but that runs only once rather than on every executor. `Do` behaves similarly: it runs only on the driver, not on the executors. How can I force a function to run on all executors/agents, not just the driver?
This is not supported yet.
It could be done by extending Mapper and Reducer so that they return a few initialization/cleanup functions, and then calling those functions in gio/mapper.go and gio/reducer.go.
I will try to get to this.
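To make the proposed extension concrete, here is a minimal, self-contained sketch of a mapper that carries initialization and cleanup hooks. It does not use gleam's actual API: `Mapper`'s signature, `LifecycleMapper`, `runMapper`, and `newBatchingMapper` are all hypothetical names chosen for illustration, and `runMapper` only stands in for whatever loop the executor side would run.

```go
package main

import "fmt"

// Mapper is an assumed per-row mapper signature, used only for this sketch.
type Mapper func(row []interface{}) error

// LifecycleMapper is a hypothetical extension that bundles a mapper with
// optional hooks run once per executor, before and after all rows.
type LifecycleMapper struct {
	Init    func() error
	Map     Mapper
	Cleanup func() error
}

// runMapper is a hypothetical stand-in for the executor-side loop: it calls
// Init once, Map for every row, and Cleanup once when the input is exhausted.
func runMapper(m LifecycleMapper, rows [][]interface{}) (err error) {
	if m.Init != nil {
		if err = m.Init(); err != nil {
			return err
		}
	}
	if m.Cleanup != nil {
		// Cleanup runs even if Map fails; the first error wins.
		defer func() {
			if cerr := m.Cleanup(); err == nil {
				err = cerr
			}
		}()
	}
	for _, row := range rows {
		if err = m.Map(row); err != nil {
			return err
		}
	}
	return nil
}

// newBatchingMapper buffers row values and flushes every `size` values.
// The Cleanup hook flushes whatever is left when the input ends.
func newBatchingMapper(size int, flush func(batch []interface{}) error) LifecycleMapper {
	var batch []interface{}
	emit := func() error {
		if len(batch) == 0 {
			return nil
		}
		out := batch
		batch = nil
		return flush(out)
	}
	return LifecycleMapper{
		Init: func() error {
			batch = make([]interface{}, 0, size)
			return nil
		},
		Map: func(row []interface{}) error {
			batch = append(batch, row...)
			if len(batch) >= size {
				return emit()
			}
			return nil
		},
		Cleanup: emit,
	}
}

func main() {
	m := newBatchingMapper(2, func(batch []interface{}) error {
		fmt.Println("flushed", batch)
		return nil
	})
	rows := [][]interface{}{{1}, {2}, {3}}
	if err := runMapper(m, rows); err != nil {
		fmt.Println("error:", err)
	}
}
```

In this sketch the trailing partial batch is flushed by the `Cleanup` hook, which is the behavior asked for above. Since each executor would run its own copy of the loop, the hook would fire once per executor rather than once on the driver.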