Distributed coach stalls if number of workers is greater than number of available vCPUs.

Open geranim0 opened this issue 5 years ago • 2 comments

To keep k8s from scheduling all pods on the same node, I added a CPU resource request to the pods with this code in kubernetes_orchestrator.py:

resources=k8sclient.V1ResourceRequirements(requests={'cpu': '1'}),

It works if I set num_workers < vCPUs, but stalls otherwise because some pods stay pending and are never created. Is this by design with the worker locks? What's the recommended approach?
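
For reference, here is a rough sketch of why some pods would stay pending under a 1-CPU request. It is a simplified first-fit model, not the real Kubernetes scheduler and not Coach code; `parse_cpu` and `count_pending` are hypothetical helper names, and the node sizes are made up for illustration. It assumes each node advertises some allocatable CPU and that a pod is only placed if its full request fits.

```python
def parse_cpu(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as '1' or '500m' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def count_pending(num_workers: int, node_allocatable: list, cpu_request: str = "1") -> int:
    """Return how many worker pods cannot be placed (simplified first-fit model).

    node_allocatable holds one CPU quantity per node. This is an assumption-laden
    illustration: the real scheduler also considers memory, affinity, taints, etc.
    """
    request = parse_cpu(cpu_request)
    free = [parse_cpu(q) for q in node_allocatable]
    pending = 0
    for _ in range(num_workers):
        for i, cpu in enumerate(free):
            if cpu >= request:
                free[i] -= request  # reserve the requested CPU on this node
                break
        else:
            pending += 1  # no node has enough unreserved CPU left
    return pending


# Two hypothetical 2-vCPU nodes with all CPU allocatable.
print(count_pending(3, ["2", "2"]))
print(count_pending(5, ["2", "2"]))
# Same nodes, but with part of each node's CPU already reserved (e.g. by system pods).
print(count_pending(4, ["1900m", "1900m"]))
```

In this model, any worker beyond the CPU that is still unreserved stays pending, and if other pods already reserve part of a node, fewer workers fit than the raw vCPU count suggests. That may be why only num_workers < vCPUs works here, though I have not confirmed it on the cluster.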

Apr 30 '19 17:04 geranim0
