Daniel Vega-Myhre

49 issues by Daniel Vega-Myhre

Small bug fix where `true` should be `false`.

## Description

The purpose of these changes is to add the k8s manifests corresponding to our guide in the public docs on [Orchestrating TPU MultiSlice workloads using JobSet and Kueue](https://cloud.google.com/kubernetes-engine/docs/tutorials/tpu-multislice-kueue)....

I recently submitted a PR updating the image digest for a tag (https://github.com/kubernetes/k8s.io/pull/5220). However, I noticed the image digest for `registry.k8s.io/jobset/jobset:v0.1.1` never updated, and after discussion with @BenTheElder I learned...

help wanted
kind/bug
sig/release
lifecycle/frozen
sig/k8s-infra

We would like to publish a blog post introducing [JobSet](https://jobset.sigs.k8s.io/), a K8s-native API for distributed ML training and HPC workloads. cc @ahg-g @kannon92. I think we still need to...

cncf-cla: yes
size/L
do-not-merge/work-in-progress
sig/docs
do-not-merge/hold
language/en
area/blog

- One-line PR description: Configurable Job failure reason for PodFailurePolicy
- Issue link: #4443
- Other comments:

sig/apps
cncf-cla: yes
lgtm
size/XXL
kind/kep
wg/batch

### Enhancement Description

- One-line enhancement description (can be used as a release note): Add an optional `Reason` field to the Job `PodFailurePolicyRule`, which allows the user to specify the...

sig/apps
stage/alpha
tracked/yes
wg/batch
lead-opted-in

### Enhancement Description

- One-line enhancement description (can be used as a release note): Add Pod Index Label for StatefulSets and Indexed Jobs.
- Kubernetes Enhancement Proposal:
- Discussion Link:...

sig/apps
stage/stable
lead-opted-in

**What would you like to be added**: A comprehensive example showing how to run a training workload on GPUs using JobSet. We could have one example per major cloud provider....

good first issue

**What would you like to be added**: Integration tests for changes in #562

**Why is this needed**: Improving test coverage

help wanted

**What would you like to be added**: Add support for feature gates in JobSet, like those used in upstream k8s and Kueue [here](https://github.com/kubernetes-sigs/kueue/blob/main/pkg/features/kube_features.go).

**Why is this needed**:

- Allow customer...

kind/feature