Joe Hellerstein

33 comments by Joe Hellerstein

We should at minimum flag the ones that are not automated, and add them to a release checklist.

Our existing pipeline hashjoin implements Pipelined Semi-Naive (PSN, Loo, '06) ... there is no wasted work in recursive (relational) joins. It would be worth double-checking that PSN works for multisets...
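For readers unfamiliar with the idea, here is a minimal sketch of ordinary (batch) semi-naive evaluation for transitive closure. It is not the pipelined variant and not Hydroflow's hashjoin; the function name `transitive_closure` and the `(u32, u32)` edge type are assumptions made for illustration. It uses set semantics, so it says nothing about the multiset question raised above.

```rust
use std::collections::HashSet;

/// Semi-naive transitive closure over a set of directed edges (set semantics).
/// Each round joins only the facts discovered in the previous round (the delta)
/// against the base edges, rather than re-joining every known fact.
pub fn transitive_closure(edges: &[(u32, u32)]) -> HashSet<(u32, u32)> {
    // All facts derived so far, seeded with the base edges.
    let mut all: HashSet<(u32, u32)> = edges.iter().copied().collect();
    // Facts that were new in the most recent round.
    let mut delta: Vec<(u32, u32)> = all.iter().copied().collect();

    while !delta.is_empty() {
        let mut next = Vec::new();
        for &(a, b) in &delta {
            for &(c, d) in edges {
                // Join delta(a, b) with edge(b, d); `insert` returns true
                // only when (a, d) was not already known.
                if b == c && all.insert((a, d)) {
                    next.push((a, d));
                }
            }
        }
        delta = next;
    }
    all
}
```

The nested loop stands in for a real hash join; the point is only that old facts are never fed back into the join once they have been processed.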

Very few operators have params other than code blocks or file names. Maybe just implement `source_dynamic_interval` or something?

Consider whether it releases in its own crate or not.

See #929 -- same issue

Another example: Sort is preceded by Handoff, but we could have a HeapHandoff or something that would do most of Sort, and save rebuffering.
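A rough sketch of what such a buffer could look like, assuming a handoff is just something the upstream side pushes into and the downstream side drains. `HeapHandoff`, `give`, and `take_sorted` are hypothetical names for illustration, not existing Hydroflow APIs.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical handoff buffer that keeps items in a min-heap as they arrive,
/// so the downstream side can read them in sorted order without rebuffering.
pub struct HeapHandoff<T: Ord> {
    heap: BinaryHeap<Reverse<T>>,
}

impl<T: Ord> HeapHandoff<T> {
    pub fn new() -> Self {
        Self { heap: BinaryHeap::new() }
    }

    /// Upstream side: buffer one item (O(log n) per push).
    pub fn give(&mut self, item: T) {
        self.heap.push(Reverse(item));
    }

    /// Downstream side: drain everything buffered so far in ascending order.
    pub fn take_sorted(&mut self) -> Vec<T> {
        let mut out = Vec::with_capacity(self.heap.len());
        while let Some(Reverse(item)) = self.heap.pop() {
            out.push(item);
        }
        out
    }
}
```

The sorting cost is paid incrementally on the way in, which is the "most of Sort" part; whether that beats buffering into a `Vec` and sorting once would need measuring.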

Required:
- StatelessHandoff doesn't own state, it participates in the state of its surrounding components

Future Optimization Rule:
- Rewrite rules to identify when we can replace Handoff with StatelessHandoff

Arguably our philosophy could be that the Hydroflow runtime itself does not provide intrinsic mechanisms for efficient checkpointing. Instead we defer checkpointing to a system either beneath Hydroflow, or built...

Seems like we need to carefully think over what metrics we want. Then we can decide how to implement/surface these. Could be internal to ops, could be plumb-able in the...

Generally would be nice to have an operator API for DEBUG/LOG stream out of an operator. As regards "work is done", this seems operator-specific. One possible API: ``` chain =...