streamz icon indicating copy to clipboard operation
streamz copied to clipboard

How to keep some sort of ordering between joined streams?

Open limx0 opened this issue 7 years ago • 3 comments

Say I have two source streams on kafka, one updating at 1 message per second, and one updating at 5 messages per second. I am using a combine_latest to join the streams. What would be the most appropriate way to not allow the streams to get more than ~10s out of sync? I effectively want to push futures back up one stream, blocking until more data arrives on the second.

Or in another scenario, suppose one of the sources dies temporarily. I really would like to be able to hault any more emits until the source comes back to life.

limx0 avatar Mar 12 '18 03:03 limx0

Is there some code which says that the source died? If so then you could maybe push all the data into a buffer while waiting for the source to come back.

Similarly with the first scenario, you could have a conditional which depending on the time out of sync either pushes data through the normal route, or to a buffer until new data comes in.

Maybe?

CJ-Wright avatar Mar 12 '18 03:03 CJ-Wright

Yeah I like your second idea @CJ-Wright. I will whip something up - I wonder whether others might find it useful enough for a PR.

I also just wanted to double check there wasn't some way to do something like this already.

limx0 avatar Mar 12 '18 03:03 limx0

I've used a combination of joining nodes, filter, and pluck to essentially gate certain paths. Maybe two gates would work for pushing data directly into the processing or into a buffer.

CJ-Wright avatar Mar 12 '18 13:03 CJ-Wright