pangeo-cloud-federation
Long wait time to get dask workers
I have been noticing very long wait times to get dask workers to come online lately.
It just took me ~30 min to get any workers on the Pangeo Google Cloud deployment.
Is there a way to resolve this? @rabernat suggested that "the cluster is maxed out".
For completeness, this is what I do in my notebook (pretty much the recommended code):
from dask_gateway import GatewayCluster
cluster = GatewayCluster()
# cluster.adapt(minimum=4, maximum=40) # or to a fixed size.
cluster.scale(10)
cluster
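To put a number on the wait rather than watching the widget, something like the following rough sketch could help. It polls a worker count until a target is reached or a timeout expires. The helper `wait_for_workers` and its parameters are hypothetical names made up for illustration, not part of dask_gateway; `count_workers` is any callable that returns the current number of workers.

```python
import time


def wait_for_workers(count_workers, target, timeout=600.0, poll_interval=5.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until at least `target` workers are up, or raise TimeoutError.

    Hypothetical helper: `count_workers` is a caller-supplied callable
    returning the current worker count. `sleep` and `clock` are injectable
    so the loop can be exercised without real waiting.
    """
    deadline = clock() + timeout
    while True:
        n = count_workers()
        if n >= target:
            return n
        if clock() >= deadline:
            raise TimeoutError(
                f"only {n} of {target} workers after {timeout} seconds"
            )
        sleep(poll_interval)
```

If a `distributed.Client` is connected to the cluster, its built-in `client.wait_for_workers(n_workers=10, timeout=600)` performs a similar blocking wait, so the helper above is mainly useful when you want custom logging or a different source for the count.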
Apparently we have a 100 vCPU limit on the cluster, and today we were at that limit.
I just bumped it to 200. (For those with access, the page is here: https://console.cloud.google.com/kubernetes/clusters/details/us-central1-b/pangeo-uscentral1b/details?project=pangeo-181919)
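As a back-of-the-envelope check on what the limit means for worker requests, here is a small sketch. The function name and the per-worker vCPU figure are assumptions for illustration only; the actual worker size on this deployment is not stated here, and node overhead is ignored.

```python
def max_additional_workers(vcpu_limit, vcpus_in_use, vcpus_per_worker):
    """Estimate how many more workers fit under a cluster-wide vCPU limit.

    All inputs are illustrative assumptions; the real per-worker request
    depends on the deployment's configuration.
    """
    if vcpus_per_worker <= 0:
        raise ValueError("vcpus_per_worker must be positive")
    free = max(vcpu_limit - vcpus_in_use, 0)
    return int(free // vcpus_per_worker)


# At the old limit with the cluster full, no new worker can be scheduled.
print(max_additional_workers(100, 100, 2))
# After raising the limit, the same load leaves room for more workers
# (assuming 2 vCPUs per worker, which is a made-up figure).
print(max_additional_workers(200, 100, 2))
```

When the first number is zero, a `cluster.scale(10)` request simply stays pending until other users release workers, which would match the long waits described above.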
Did that resolve the issue?
I was eventually able to get workers even before raising this issue, but I'll keep an eye out in the upcoming days.
Quick update: Right now I am getting dask workers quickly! Thanks for the adjustment.