
Running function on all executors after completion of flow

Open svmhdvn opened this issue 6 years ago • 1 comments

Is there a way to register a callback function that runs on all executors in distributed mode? For example, I would like to flush a batched pipeline on every executor after the entire flow has finished. Currently I call OutputRow in my flow, but that runs only once rather than on all of my executors. Calling Do has the same limitation: it runs only on the driver, not on the executors. How can I force a function to run on all executors/agents, not just the driver?

svmhdvn avatar Nov 05 '18 23:11 svmhdvn

This is not supported yet.

It could be done by extending Mapper and Reducer so that they return a few initialization/cleanup functions, and then calling those functions in gio/mapper.go and gio/reducer.go.

I will try to get to this.

chrislusf avatar Nov 06 '18 01:11 chrislusf