hydroflow icon indicating copy to clipboard operation
hydroflow copied to clipboard

design: `poll_futures` is non-ergonomic in that it drives ticks from the middle of a stream

Open jhellerstein opened this issue 1 year ago • 3 comments

a linear pipeline starting with source_iter should probably only run on the first tick. However if it has a poll_futures in it, it may run on subsequent ticks.

This suggests that poll_futures is probably better modeled as a sink (0-output) and source (0-input) operator combination, rather than an inline operator.

We should decide and resolve.

jhellerstein avatar May 06 '24 16:05 jhellerstein

Note that in the new tick formalization where there are no global ticks, this changes even further. Things like futures become extra weird because operators are not allowed to spontaneously generate outputs when not provided new inputs (they are deterministic). So separating into a source/sink with a key to pair them together will help.

shadaj avatar May 07 '24 03:05 shadaj

To add on, I also think that we can absolutely sacrifice ergonomics at the Hydroflow IR level. For example with futures, we can do the source/sink split while preserving a nice inline API at the Hydroflow+ level. With the new scope ticks, something like:

hydroflow_plus::loop(|cur_tick| { // creates a scoped tick context
  let futures_stream = // ...
  let future_results = futures_stream.poll_futures();
  // we can't directly use `future_results` because its values are generated at the tick scope outside of this, so
  future_results.batched().for_each(...)
})

This is nice because it also makes it more explicit that the results of the futures are re-batched / re-ordered.

shadaj avatar May 07 '24 03:05 shadaj

I'm not sure I understand enough about the new tick system to understand the last comment, but otherwise I totally agree. I think .poll_futures makes sense as a Hydroflow+ operator in the same way that .send_to does (implied sink -> source).

When pair programming with Mingwei, we briefly discussed implementing it this way, but he suggested that there wasn't a simple/obvious way to structure this implementation in the existing Hydroflow source, as we basically want an operator split across two subgraphs and it wasn't immediately obvious how to pass future data between them. Thus, we went with the current approach to get an mvp up and running quickly.

RyanAlameddine avatar May 08 '24 18:05 RyanAlameddine

Decision is to remove poll_futures from Hydroflow, and defer to Hydroflow+.

jhellerstein avatar May 20 '24 16:05 jhellerstein

Hmm I am confused. This can't be implemented in purely Hydroflow+, we definitely need a low level operator because it is closely tied to the scheduler.

shadaj avatar May 20 '24 16:05 shadaj

Seems like a design/planning meeting is in order on this topic. Let's agree on the design ASAP -- if we can't get a decision/implementation this week for the release let's remove and reinstitute in the next release.

jhellerstein avatar May 20 '24 16:05 jhellerstein

Could be implemented with an external channel (tokio channel) though maybe against the spirit of keeping things in hydroflow

MingweiSamuel avatar May 20 '24 19:05 MingweiSamuel

Also interesting - if we split into a sink_futures and source_stream and the two operators are in the same stratum, then it may be weird if a future gets sent out and received in the same tick/stratum run

MingweiSamuel avatar May 20 '24 20:05 MingweiSamuel

I'd assume the future handling is idempotent, so i don't think the length of the delay (including 0!) should matter? Worst case we send the future twice and it gets filtered out by idempotence enforcement?

On Mon, May 20, 2024 at 1:08 PM Mingwei Samuel @.***> wrote:

Also interesting - if we split into a sink_futures and source_stream and the two operators are in the same stratum, then it may be weird if a future gets sent out and received in the same tick/stratum run

— Reply to this email directly, view it on GitHub https://github.com/hydro-project/hydroflow/issues/1183#issuecomment-2121124452, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAC54QJSLVJ5D65C2IMHNBDZDJJ3ZAVCNFSM6AAAAABHJMHLRGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCMRRGEZDINBVGI . You are receiving this because you were assigned.Message ID: @.***>

jhellerstein avatar May 20 '24 20:05 jhellerstein

Current plan is to remove from 0.8 release and redesign for 0.9

jhellerstein avatar Jul 15 '24 16:07 jhellerstein