streamz
streamz copied to clipboard
dask-streamz based on the actor interface
The stateful actors of Dask closely match the design of streamz nodes.
If checkpoint/restart is part of the actor code, then a major deficit of the current actors model (loss of data when a worker goes down) would be circumvented.
@CJ-Wright , I bet you find this idea fascinating...
Are actors stable enough to use? I've been thinking about making something that would continuously maintain the state for some pre-defined computation as new data comes in. Something like differential dataflow, but in python and less head-expolding.
Also, what happens to an actor if the client exits/crashes? Is the memory on workers released?
They are not, but could be
- https://github.com/dask/distributed/pull/4232
- https://github.com/dask/distributed/pull/4287
if the client exits/crashes? Is the memory on workers released?
If the client stops, then yes, futures would be released, and actors should be cleaned up.