
Warn (and eventually raise) when client.scatter is used with Active Memory Manager enabled

Open phofl opened this issue 1 year ago • 4 comments

Using scatter is generally not a good idea anymore and doesn't have any effect if the active memory manager is enabled. People frequently run into this or use scatter anyway, which is confusing and bad UX.

We should raise a warning if scatter is used and point people to delayed, and probably raise an error at some point later or get rid of the method completely.

phofl avatar Nov 04 '24 20:11 phofl

> Using scatter is generally not a good idea anymore and doesn't have any effect if the active memory manager is enabled.

That's only partially true. What doesn't have an effect any more is scatter(..., broadcast=True).

fjetter avatar Nov 05 '24 16:11 fjetter

Good point, I mixed that up....

Is delayed generally better or is that incorrect?

phofl avatar Nov 05 '24 20:11 phofl

> Is delayed generally better or is that incorrect?

In 9 out of 10 cases it is better. The difference between the two approaches is that scatter can take a direct path to the worker instead of proxying through the scheduler, at least if the network configuration allows it.

Even if scatter proxies over the scheduler, the scheduler just forwards the data directly and doesn't store a copy. This matters if the data is actually large, since with delayed the scheduler has to hold the task, data included, in memory until it is completed. This also means that delayed is more robust to failures, of course.

In the end it's a tradeoff between slightly better performance (scatter) and resilience plus higher memory usage on the scheduler (delayed).

The safe but slightly more costly approach is delayed. Most end users will likely not be able to tell the two apart and judge the risks/costs properly, so the recommendation to use delayed (or client.submit) is certainly good.
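
For illustration, here is a minimal sketch of the two approaches compared above. It is not code from dask itself. The helper names (total, run_with_scatter, run_with_delayed), the toy payload and the in-process Client are all assumptions made for the example. The comments restate the behaviour described in this comment rather than anything measured here.

```python
from dask import delayed
from dask.distributed import Client


def total(values):
    """Toy computation standing in for real work on a large object."""
    return sum(values)


def run_with_scatter(client, payload):
    # scatter moves the data to a worker up front and returns a Future.
    # As described above, the scheduler only forwards it and keeps no copy.
    remote = client.scatter(payload)
    return client.submit(total, remote).result()


def run_with_delayed(client, payload):
    # delayed makes the data part of the task graph, so it goes through
    # the scheduler, which holds it until the task has completed.
    wrapped = delayed(payload)
    return client.compute(delayed(total)(wrapped)).result()


if __name__ == "__main__":
    # In-process cluster, used here only to keep the example self-contained.
    with Client(processes=False, dashboard_address=None) as client:
        payload = bytes(range(256)) * 1000
        print(run_with_scatter(client, payload))
        print(run_with_delayed(client, payload))
```

Both functions return the same result. What differs is the path the payload takes to the worker and who keeps a copy in the meantime.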

fjetter avatar Nov 06 '24 10:11 fjetter

Hi! I guess that this at least should be documented somewhere. The last message from @fjetter really explains things.

There are many places in the docs where scatter is suggested, but I understand that delaying an object is better and safer in most cases, right?

https://distributed.dask.org/en/stable/locality.html?highlight=scatter
https://distributed.dask.org/en/stable/api.html?highlight=scatter#distributed.Client.scatter
https://distributed.dask.org/en/stable/memory.html?highlight=scatter
https://distributed.dask.org/en/stable/resilience.html?highlight=scatter

guillaumeeb avatar Apr 18 '25 10:04 guillaumeeb