numaflow
Ordered Processing
Summary
Numaflow improves throughput by pushing the next unit of "work" to any available processing unit, so it inherently processes messages out of order (similar to other stream processing platforms). However, there are a few cases where the "work" has to be ordered, e.g., the create-update-read-delete flow: you cannot update an item before you have created it. Hence, it would be good to support some kind of partitioned FIFO where, within a partition, the work is ordered.
E.g., for a given partition A, the Nth message should be processed only after the (N-1)th is done.
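To make the partitioned-FIFO idea concrete, here is a minimal sketch in Python. It is not Numaflow code: `process_partitioned`, `num_workers`, and the `(key, payload)` message shape are all hypothetical names chosen for illustration. It assumes each message carries a partition key, hashes that key to one worker, and lets each worker drain its own FIFO queue, so order holds within a key while different keys can proceed in parallel.

```python
import queue
import threading
import zlib
from collections import defaultdict


def process_partitioned(messages, num_workers=4):
    """Process (key, payload) pairs, preserving order within each key.

    Hypothetical illustration only; this is not the Numaflow API.
    """
    queues = [queue.Queue() for _ in range(num_workers)]
    results = defaultdict(list)  # key -> payloads in the order they were processed
    lock = threading.Lock()

    def worker(q):
        # Each worker drains its own FIFO queue one item at a time,
        # so the Nth item of a key is handled only after the (N-1)th.
        while True:
            item = q.get()
            if item is None:  # sentinel: no more work
                return
            key, payload = item
            with lock:
                results[key].append(payload)

    threads = [threading.Thread(target=worker, args=(q,)) for q in queues]
    for t in threads:
        t.start()

    for key, payload in messages:
        # A stable hash sends every message of a key to the same worker.
        idx = zlib.crc32(key.encode("utf-8")) % num_workers
        queues[idx].put((key, payload))

    for q in queues:
        q.put(None)
    for t in threads:
        t.join()
    return dict(results)
```

For example, calling it with `[("A", "create"), ("B", "create"), ("A", "update"), ("A", "delete")]` keeps the three operations of `A` in their original order, while nothing is guaranteed about how `A` and `B` interleave with each other.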
Caveats
- Auto-scaling
Use Cases
- create-update-read-delete workflows
Message from the maintainers:
If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.
### Tasks
- [ ] SDK should do ordered response
We should still be able to do readBatchSize>1 with ordered processing. Just make sure that the batch read from source/ISB is processed & written in order.
If we do a readBatchSize>1, we still have to make sure that the UDF invocations are sequential and that the writes to the ISB are sequential as well, which essentially makes it behave like readBatchSize == 1.
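The point about batches can be sketched as follows. This is a simplified model, not Numaflow's implementation: `run_ordered`, `source`, `udf`, and `sink` are hypothetical stand-ins for the read from source/ISB, the user-defined function, and the write to the ISB. The read is batched, but each message in the batch is passed to the UDF and written out strictly one after another.

```python
def run_ordered(source, udf, sink, read_batch_size=3):
    """Read in batches, but invoke the UDF and write strictly in order.

    Hypothetical illustration only; `source` is a list of messages,
    `udf` returns a list of outputs per message, `sink` is a list.
    """
    for start in range(0, len(source), read_batch_size):
        # One read fetches up to read_batch_size messages.
        batch = source[start:start + read_batch_size]
        for msg in batch:
            # Sequential UDF invocation: message N+1 waits for message N.
            outputs = udf(msg)
            # Sequential write: outputs land in the sink in read order.
            for out in outputs:
                sink.append(out)
    return sink
```

Under these assumptions the output is identical for any `read_batch_size`; the batch size only changes how many messages a single read fetches, which is why the processing itself behaves as if the batch size were 1.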