xpk

xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool that helps Cloud developers orchestrate training jobs on accelerators such as TPUs and GPUs on GKE.

Results 33 xpk issues

## Fixes / Features Create the `xpk` package. Created the `utils` module, which now contains the utility functions from `xpk.py` ## Testing / Documentation Tested manually, installing xpk and running it....

## Fixes / Features - - ## Testing / Documentation Testing details. - [ y/n ] Tests pass - [ y/n ] Appropriate changes to documentation are included in the...

I wonder if `xpk` supports a cluster creation from several reservations. If not, do you have any plans to add this feature?

…eation / deletion ## Fixes / Features - - ## Testing / Documentation Testing details. - [ y/n ] Tests pass - [ y/n ] Appropriate changes to documentation are...

We have a CPU-only cluster, n2-standard-32-1024, that has 1024 n2-standard-32 nodes. Technically, we should have roughly 1024 * 32 CPU resources there, but I'm seeing 1024 nominated...
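The arithmetic behind that expectation can be sketched as follows. The helper name `expected_vcpus` is hypothetical and not part of xpk; the only facts used are the node count from the report and the 32 vCPUs of an n2-standard-32 machine. The schedulable (allocatable) amount on a real cluster is usually somewhat lower than this raw figure, because Kubernetes reserves part of each node for system components.

```python
# Hypothetical helper, not part of xpk: raw CPU capacity of a homogeneous node pool.
def expected_vcpus(num_nodes: int, vcpus_per_node: int) -> int:
    """Return the total vCPU count before any per-node system reservations."""
    return num_nodes * vcpus_per_node


# 1024 nodes of n2-standard-32 (32 vCPUs each), as described in the report.
total = expected_vcpus(num_nodes=1024, vcpus_per_node=32)
print(total)  # 32768
```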

If there is an upgrade available for the cluster, additional text is printed to stderr ``` * - There is an upgrade available for your cluster(s). To upgrade nodes to...

## Fixes / Features Add the `xpk info` command. There are two subcommands, `xpk info localqueues` and `xpk info clusterqueues`; both support the `--cluster` flag. ## Testing /...

## Fixes / Features - Creates a Storage CRD that allows attaching an existing GCS bucket to workloads ## Testing / Documentation Tested manually, CI/CD tests in progress - [ y ]...

Kueue supports all-or-nothing scheduling: https://kueue.sigs.k8s.io/docs/tasks/manage/setup_wait_for_pods_ready/ Large multi-pod workloads that need every pod to be running to make progress (e.g. single-program-multi-data workloads) can deadlock capacity if the physical availability of resources...
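To illustrate why all-or-nothing admission avoids that kind of deadlock, here is a toy model of the idea. It is only a sketch under simplifying assumptions (a single pool of interchangeable pod slots, workloads considered in order), and it is not how Kueue implements the feature; the names `Workload` and `admit_all_or_nothing` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    pods: int  # number of pods that must all run for the workload to make progress


def admit_all_or_nothing(workloads, free_slots):
    """Admit a workload only if all of its pods fit; never place a partial set."""
    admitted = []
    for w in workloads:
        if w.pods <= free_slots:
            free_slots -= w.pods
            admitted.append(w.name)
        # Otherwise the workload stays queued and holds no capacity.
    return admitted, free_slots


# Two workloads of 4 pods each compete for 6 slots. Placing pods greedily could
# split the slots (for example 3 + 3) so that neither workload ever runs.
# All-or-nothing admission runs "a" in full and leaves "b" queued.
result = admit_all_or_nothing([Workload("a", 4), Workload("b", 4)], free_slots=6)
print(result)  # (['a'], 2)
```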

## Fixes / Features - Fixes workload rendering when using spot; without this change, `xpk workload create` fails with errors like: ``` [XPK] Waiting for `Creating Workload`, for 0 seconds error: error...