
Add a possibility to restrict the maximum batch size

Open zxqfd555 opened this issue 9 months ago • 0 comments

Is your feature request related to a problem? Please describe.
Right now, batch sizes are controlled by the autocommit_duration_ms parameter, which is specified in milliseconds. Because this limit is time-based, batch sizes are imbalanced between data that is read in real time and an initial chunk of data that is read at once.

Describe the solution you'd like
The solution has two parts, and they are connected:

  • Provide a way to limit the maximum batch size. It should be possible if the batch size is restricted within a single Pathway worker.
  • The solution above may lead to accelerated backlog growth: the limit can increase the number of batches, which would slow down the overall computation. Therefore, there is also a need for a feature that suspends input reading when the backlog size exceeds a certain value.
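
To make the two points above concrete, here is a minimal sketch in plain Python. It does not use the Pathway API: the names `split_into_batches`, `BoundedBacklog`, `run_pipeline`, `max_batch_size`, and `max_backlog_batches` are all hypothetical and chosen only for illustration. It assumes a single worker and a single-threaded loop in which the reader and the consumer take turns.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List, Tuple


def split_into_batches(rows: Iterable[int], max_batch_size: int) -> Iterator[List[int]]:
    """Yield consecutive batches holding at most max_batch_size rows."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be positive")
    batch: List[int] = []
    for row in rows:
        batch.append(row)
        if len(batch) == max_batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller, batch


class BoundedBacklog:
    """Pending batches; the reader is expected to pause while it is full."""

    def __init__(self, max_backlog_batches: int) -> None:
        if max_backlog_batches < 1:
            raise ValueError("max_backlog_batches must be positive")
        self.max_backlog_batches = max_backlog_batches
        self.pending: Deque[List[int]] = deque()

    def can_read(self) -> bool:
        return len(self.pending) < self.max_backlog_batches

    def push(self, batch: List[int]) -> None:
        if not self.can_read():
            raise RuntimeError("backlog is full; reading should be suspended")
        self.pending.append(batch)

    def pop(self) -> List[int]:
        return self.pending.popleft()


def run_pipeline(
    rows: Iterable[int], max_batch_size: int, max_backlog_batches: int
) -> Tuple[List[List[int]], int]:
    """Return the processed batches and the peak backlog size observed."""
    backlog = BoundedBacklog(max_backlog_batches)
    batches = split_into_batches(rows, max_batch_size)
    processed: List[List[int]] = []
    peak = 0
    exhausted = False
    while not exhausted or backlog.pending:
        # Reader: keep reading only while the backlog is below its limit.
        while not exhausted and backlog.can_read():
            batch = next(batches, None)
            if batch is None:
                exhausted = True
            else:
                backlog.push(batch)
        peak = max(peak, len(backlog.pending))
        # Consumer: process one batch, which frees room for the reader.
        if backlog.pending:
            processed.append(backlog.pop())
    return processed, peak
```

For example, `run_pipeline(range(10), max_batch_size=3, max_backlog_batches=2)` splits the ten rows into four batches of at most three rows each, and the backlog never holds more than two batches at a time. A real implementation would also have to decide how the size limit interacts with the existing time-based commit, which this sketch leaves out.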

zxqfd555 · Apr 04 '25 13:04