Unable to create cluster with single T4 GPU, 2 cores, 12 GB RAM in GKE

Open swapnil-lader opened this issue 2 years ago • 1 comments

I am trying to deploy the Triton server to a GKE cluster. I created my node pool with the following command:

gcloud container node-pools create test \
  --project project-id \
  --zone us-central1-c \
  --cluster test-cluster \
  --num-nodes 1 \
  --accelerator type=nvidia-tesla-t4,count=1 \
  --enable-autoscaling --min-nodes 1 --max-nodes 3 \
  --machine-type=custom-2-12288 \
  --disk-size=100 \
  --scopes cloud-platform \
  --verbosity error

After configuring everything and deploying the Triton server, I ran kubectl describe pods triton-server and got the following:

Events:
  Type     Reason            Age  From                                   Message
  Warning  FailedScheduling  21s  gke.io/optimize-utilization-scheduler  0/2 nodes are available: 1 Insufficient cpu, 1 Insufficient nvidia.com/gpu, 2 Insufficient memory.

This error prevents the pod from being scheduled. Is n1-standard-4 the mandatory minimum node pool spec for this scenario?

Or am I missing something when creating the node pool inside my cluster?

Sep 01 '22 10:09 swapnil-lader

Hi @swapnil-lader, unfortunately this is a bit outside of our scope, as it isn't actually a Triton issue. Off the top of my head, one thing you can try is running the nvidia-smi command to check your GPU utilization. I'm not sure if this helps, but I'm seeing a similar issue that might be helpful for your case.
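Since the scheduler message above reports insufficient CPU, memory, and nvidia.com/gpu, one way to narrow it down is to compare what each node advertises as allocatable with what the pod requests. The following is only a rough sketch, not a confirmed fix for this issue: the function names are made up for illustration, the pod name triton-server is taken from the question, and it assumes kubectl is already pointed at the cluster. Keep in mind that Kubernetes reserves part of each node for system components, so a 2-vCPU node offers somewhat less than 2 full cores to pods.

```shell
# Convert a Kubernetes CPU quantity to millicores so values can be compared.
# Handles only whole cores ("2") or a millicore suffix ("940m"), not "0.5".
cpu_to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  echo $(( $1 * 1000 )) ;;
  esac
}

# Show what each node can offer to pods (allocatable, not raw capacity).
# The backslash escapes the dot inside the nvidia.com/gpu resource name.
show_allocatable() {
  kubectl get nodes -o custom-columns='NODE:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory,GPU:.status.allocatable.nvidia\.com/gpu'
}

# Show the resource requests of every container in the given pod.
show_requests() {
  kubectl get pod "$1" -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.resources.requests}{"\n"}{end}'
}
```

Running show_allocatable and then show_requests triton-server should make it visible which request exceeds what a node can give. If the GPU column shows no value for the T4 node, the node is not advertising a GPU to the scheduler yet, which may point to the NVIDIA driver or device plugin not being ready on that node rather than to the machine type.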

Sep 07 '22 23:09 krishung5

Closing issue due to lack of activity. Please re-open the issue if you would like to follow up with this issue.

Oct 13 '22 00:10 krishung5
