Generative_Deep_Learning_2nd_Edition
Suggestion: use Kubernetes with GKE Autopilot instead of VMs to run book examples on a cloud GPU
This repo provides instructions for setting up a GCP VM instance with a GPU to run the examples. I'd recommend taking this further and using GKE Autopilot for GPU workloads instead of VMs. Some benefits:
- GKE Autopilot's pay-per-use model keeps costs down: deploying a workload with `kubectl apply` is simple, and deleting the pod when it's idle is effortless.
- Leverage service-based load balancing to expose Jupyter Lab, eliminating the need for port forwarding.
- Maintenance/upgrades are managed seamlessly by GKE Autopilot, freeing users from routine system upkeep.
- Adopting Kubernetes, a scalable and industry-standard platform, gives readers practical experience, putting them ahead of a `docker compose`-on-a-VM setup.
This is how I deployed the examples to GKE Autopilot:
- Build and push the Docker image:

```sh
IMAGE=<your_image>  # you can also skip this step and use bulankou/gdl2:20230715, which I built
docker build -f ./docker/Dockerfile.gpu -t $IMAGE .
docker push $IMAGE
```
- Create GKE Autopilot cluster with all default settings.
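
  For reference, the cluster can also be created from the CLI. A minimal sketch, assuming the `gcloud` SDK is installed and authenticated; the cluster name `gdl2` and region `us-central1` are placeholders:

  ```sh
  # Create an Autopilot cluster with default settings
  gcloud container clusters create-auto gdl2 --region us-central1

  # Fetch credentials so kubectl targets the new cluster
  gcloud container clusters get-credentials gdl2 --region us-central1
  ```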
- Apply the following K8s manifest (`kubectl apply -f <yaml>`). Make sure to update `<IMAGE>` below. Also note the `cloud.google.com/gke-accelerator: "nvidia-tesla-t4"` node selector and the `autopilot.gke.io/host-port-assignment` annotation, which ensure that we get the right node type and enable host ports on Autopilot.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
  annotations:
    autopilot.gke.io/host-port-assignment: '{"min":6006,"max":8888}'
  labels:
    service: app
spec:
  nodeSelector:
    cloud.google.com/gke-accelerator: "nvidia-tesla-t4"
  containers:
    - command: ["/bin/sh", "-c"]
      args: ["jupyter lab --ip 0.0.0.0 --port=8888 --no-browser --allow-root"]
      image: <IMAGE>
      name: app
      ports:
        - containerPort: 8888
          hostPort: 8888
        - containerPort: 6006
          hostPort: 6006
      resources:
        limits:
          nvidia.com/gpu: 1
        requests:
          cpu: "18"
          memory: "18Gi"
      tty: true
  restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
  name: app
spec:
  type: LoadBalancer
  ports:
    - name: "8888"
      port: 8888
      targetPort: 8888
    - name: "6006"
      port: 6006
      targetPort: 6006
  selector:
    service: app
```
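
Once the pod is running, the `LoadBalancer` service exposes Jupyter Lab directly, so no port forwarding is needed. A sketch of how to find the address (the service name `app` matches the manifest above):

```sh
# Wait until EXTERNAL-IP changes from <pending> to an address
kubectl get service app --watch
```

Then open `http://<EXTERNAL-IP>:8888` for Jupyter Lab (and `:6006` for TensorBoard). When you're done, `kubectl delete -f <yaml>` tears everything down so you stop paying for the GPU node.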