streamz icon indicating copy to clipboard operation
streamz copied to clipboard

dask-streamz based on the actor interface

Open martindurant opened this issue 4 years ago • 3 comments

The stateful actors of Dask closely match the design of streamz nodes.

If checkpoint/restart is part of the actor code, then a major deficit of the current actors model (loss of data when a worker goes down) would be circumvented.

martindurant avatar Oct 28 '20 13:10 martindurant

@CJ-Wright , I bet you find this idea fascinating...

martindurant avatar Oct 28 '20 14:10 martindurant

Are actors stable enough to use? I've been thinking about making something that would continuously maintain the state for some pre-defined computation as new data comes in. Something like differential dataflow, but in python and less head-expolding.

Also, what happens to an actor if the client exits/crashes? Is the memory on workers released?

roveo avatar Dec 21 '20 17:12 roveo

They are not, but could be

  • https://github.com/dask/distributed/pull/4232
  • https://github.com/dask/distributed/pull/4287

if the client exits/crashes? Is the memory on workers released?

If the client stops, then yes, futures would be released, and actors should be cleaned up.

martindurant avatar Dec 21 '20 17:12 martindurant