Add parallelism control

Open • je-ik opened this issue 7 years ago • 1 comment

After removing explicit partitioning, we currently have no explicit control over the parallelism of executing operators. This affects both batch and stream processing. There must be a way to hint to the translator that a certain operation should be parallelized more or less than its input. Options are:

  • add a method to set the parallelism of an operator on the executor, e.g.
       Executor executor = ...;
       executor.withParallelism("OPERATOR_NAME", 100).submit(flow);
    
  • add a downstream parallelism hint to shuffle operators, e.g.
      ReduceByKey.of(...)
          .keyBy(...)
          ....
          .withHint(Parallelism.of(100));
    
  • some other option?
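
To make the difference between the first two options concrete, here is a minimal standalone sketch. None of these types are the actual Euphoria API: `ExecutorConfig`, `OperatorSpec` and `Parallelism` are hypothetical stand-ins that only model where the setting would live (on the executor, keyed by operator name, versus on the operator itself). In both cases the operator falls back to the input parallelism when nothing is set.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only: these classes are invented for illustration
// and do not exist in Euphoria.
public class ParallelismOptions {

  /** Option 1: parallelism is configured on the executor, keyed by operator name. */
  static final class ExecutorConfig {
    private final Map<String, Integer> byOperatorName = new HashMap<>();

    ExecutorConfig withParallelism(String operatorName, int parallelism) {
      if (parallelism < 1) {
        throw new IllegalArgumentException("parallelism must be >= 1");
      }
      byOperatorName.put(operatorName, parallelism);
      return this;
    }

    /** Returns the configured value, or the input parallelism if none was set. */
    int parallelismOf(String operatorName, int inputParallelism) {
      return byOperatorName.getOrDefault(operatorName, inputParallelism);
    }
  }

  /** Option 2: the hint travels with the operator definition. */
  static final class Parallelism {
    final int value;

    private Parallelism(int value) {
      if (value < 1) {
        throw new IllegalArgumentException("parallelism must be >= 1");
      }
      this.value = value;
    }

    static Parallelism of(int value) {
      return new Parallelism(value);
    }
  }

  static final class OperatorSpec {
    final String name;
    private Parallelism hint; // null means "inherit from the input"

    OperatorSpec(String name) {
      this.name = name;
    }

    OperatorSpec withHint(Parallelism hint) {
      this.hint = hint;
      return this;
    }

    int resolve(int inputParallelism) {
      return hint == null ? inputParallelism : hint.value;
    }
  }

  public static void main(String[] args) {
    ExecutorConfig config = new ExecutorConfig().withParallelism("OPERATOR_NAME", 100);
    System.out.println(config.parallelismOf("OPERATOR_NAME", 8));

    OperatorSpec reduce = new OperatorSpec("reduce").withHint(Parallelism.of(100));
    System.out.println(reduce.resolve(8));
  }
}
```

The trade-off this is meant to show: with the executor variant the flow definition stays free of tuning values but depends on stable operator names, while the hint variant keeps the value next to the operator at the cost of putting tuning into the flow.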

je-ik, Dec 21 '17 09:12

I think we should never set explicit parallelism. Instead, we should hint the operator with a percentage estimate of the increase or decrease in data size, so that the parallelism can be derived from the input data.
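
A rough sketch of how such a relative hint could be turned into a parallelism value, assuming the translator knows the input parallelism and some upper bound. The class name, the method `deriveParallelism`, the `changePercent` parameter and the clamping to `maxParallelism` are all assumptions made for illustration, not an existing API.

```java
// Hypothetical sketch only: derives an operator's parallelism from the input
// parallelism and an estimated change in data size, given in percent.
public class SizeChangeHint {

  /**
   * @param inputParallelism parallelism of the operator's input, at least 1
   * @param changePercent estimated change in data size, e.g. -90 for a
   *     reduction to one tenth, 100 for a doubling; must be greater than -100
   * @param maxParallelism upper bound the result is clamped to, at least 1
   */
  static int deriveParallelism(int inputParallelism, int changePercent, int maxParallelism) {
    if (inputParallelism < 1 || maxParallelism < 1) {
      throw new IllegalArgumentException("parallelism values must be >= 1");
    }
    if (changePercent <= -100) {
      throw new IllegalArgumentException("changePercent must be > -100");
    }
    // Scale in integer arithmetic and round up, so a small positive
    // estimate never yields zero partitions.
    long scaledTimes100 = (long) inputParallelism * (100L + changePercent);
    long scaled = (scaledTimes100 + 99) / 100;
    return (int) Math.max(1L, Math.min((long) maxParallelism, scaled));
  }

  public static void main(String[] args) {
    System.out.println(deriveParallelism(100, -90, 1000)); // prints 10
    System.out.println(deriveParallelism(8, 50, 1000));    // prints 12
    System.out.println(deriveParallelism(8, 100, 12));     // prints 12 (clamped)
  }
}
```

With this shape the same flow adapts when the input grows or shrinks, which is the point of preferring a relative estimate over a fixed number.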

dmvk, Dec 21 '17 12:12