xpk
xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool that helps Cloud developers orchestrate training jobs on accelerators such as TPUs and GPUs on GKE.
## Fixes / Features

- Fixes a problem with GKE node version selection when the current GKE cluster version is no longer a valid GKE rapid version.

## Testing / Documentation...
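The node version fix noted under Fixes / Features does not spell out how a replacement version is chosen. One plausible approach, shown here as a minimal sketch rather than xpk's actual logic, is to keep the cluster version when it is still valid and otherwise fall back to the newest valid version that does not exceed it. The function names `parse_gke_version` and `pick_node_version` and the sample version strings are hypothetical.

```python
from __future__ import annotations

import re


def parse_gke_version(version: str) -> tuple[int, ...]:
    """Turn a string like '1.29.4-gke.1043002' into (1, 29, 4, 1043002)."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def pick_node_version(cluster_version: str, valid_versions: list[str]) -> str | None:
    """Hypothetical fallback rule, not taken from the xpk source.

    Keep the cluster version if it is still offered; otherwise choose the
    newest offered version that is not newer than the cluster version.
    Returns None when no offered version qualifies.
    """
    if cluster_version in valid_versions:
        return cluster_version
    ceiling = parse_gke_version(cluster_version)
    candidates = [v for v in valid_versions if parse_gke_version(v) <= ceiling]
    return max(candidates, key=parse_gke_version, default=None)
```

The fallback never picks a version above the cluster version, on the assumption that node pools should not run ahead of the control plane.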
The CPU and memory limits are hard-coded: https://github.com/google/xpk/blob/main/src/xpk/core/nap.py#L170 We have a customer who needed to increase these limits and had to go to the GKE API level to do so.
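One way to address this request would be to turn the limits into parameters with overridable defaults. The sketch below is illustrative only: `build_resource_limits` is a hypothetical helper, the default values are placeholders and not the ones in `nap.py`, and the dictionary keys are assumed to follow the `resourceLimits` entries of a GKE node auto-provisioning config file.

```python
from __future__ import annotations

# Placeholder defaults for illustration only; NOT the values hard-coded in nap.py.
DEFAULT_CPU_MAX = 1000
DEFAULT_MEMORY_MAX_GB = 4000


def build_resource_limits(
    cpu_max: int = DEFAULT_CPU_MAX,
    memory_max_gb: int = DEFAULT_MEMORY_MAX_GB,
) -> list[dict]:
    """Return auto-provisioning resource limits, letting callers raise the maxima."""
    for name, value in (("cpu_max", cpu_max), ("memory_max_gb", memory_max_gb)):
        if value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value}")
    # Key names are an assumption about the config-file schema.
    return [
        {"resourceType": "cpu", "minimum": 1, "maximum": cpu_max},
        {"resourceType": "memory", "minimum": 1, "maximum": memory_max_gb},
    ]
```

A caller who needs more headroom could then pass, for example, `build_resource_limits(cpu_max=5000)` instead of editing the source or calling the GKE API directly.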
## Features

- Upload more detailed info to the GCS bucket on failures.
- Switch to `gcloud storage cp`, since it is around twice as fast as `gsutil -m cp`.

## Testing / Documentation...
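The switch from `gsutil -m cp` to `gcloud storage cp` can be illustrated with a small command builder. This is a sketch under assumptions, not the code from the change: `build_upload_command` and its parameters are hypothetical, and the recursive flags (`--recursive`, `-r`) are added here because failure info is assumed to live in a directory.

```python
from __future__ import annotations


def build_upload_command(
    local_dir: str, bucket_uri: str, use_gcloud_storage: bool = True
) -> list[str]:
    """Build the argv list for copying a local directory to a GCS bucket."""
    if not bucket_uri.startswith("gs://"):
        raise ValueError(f"expected a gs:// destination, got {bucket_uri!r}")
    if use_gcloud_storage:
        # Newer CLI, reported in the change description as roughly twice as fast.
        return ["gcloud", "storage", "cp", "--recursive", local_dir, bucket_uri]
    # Previous behavior: parallel gsutil copy.
    return ["gsutil", "-m", "cp", "-r", local_dir, bucket_uri]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Building an argv list rather than a shell string avoids quoting problems with paths that contain spaces.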