Lingering third-party project dependency on prow.k8s.io
See also the umbrella issue at https://github.com/kubernetes/k8s.io/issues/7708 and previous discussion at https://github.com/kubernetes/test-infra/issues/12863
Right now this represents an ambiguous policy gap: we only provide CI for Kubernetes subprojects ... except for cadvisor and containerd. AFAICT all others are on their own non-Kubernetes-provided CI now.
While the cost is likely not high, it is difficult to reason about a consistent policy currently. With these two small exceptions resolved, we could pretty reasonably state that we provide CI for the Kubernetes project and its official subprojects, within reason (with some right reserved to deal with budget-busting usage and/or abuse, e.g. "crypto mining").
Otherwise we need to articulate coherent reasoning that does not open us up to hosting the entire landscape's CI. We already have a huge problem on our hands handling the hundreds of Kubernetes subprojects, scale testing, content distribution, etc.
/sig k8s-infra
/sig testing
cc @kubernetes/sig-k8s-infra-leads @kubernetes/sig-testing-leads
@dims confirmed that cadvisor's prow CI is actually still not functional since the prow.k8s.io control plane migration from the google.com to the kubernetes.io (community) GCP project, so that leaves only github.com/containerd/containerd and Kubernetes' own projects/subprojects.
Some discussion in https://kubernetes.slack.com/archives/CCK68P2Q2/p1737568484394929?thread_ts=1737566661.037439&cid=CCK68P2Q2
Do we want to include CRI-O, given that some of its tests use the K8s infrastructure? https://testgrid.k8s.io/sig-node-cri-o
> Do we want to include CRI-O, given that some of its tests use the K8s infrastructure? https://testgrid.k8s.io/sig-node-cri-o
My understanding is that these are jobs that are not directly connected to the CRI-O repo: they have kubelet use a stable CRI-O release so that kubelet tests don't run against only a single CRI implementation (other jobs mostly use stable containerd).
That's different from jobs aimed at testing the development of a third-party project (or worse, requiring additional over-permissioning of our CI accounts for presubmits, webhooks, etc.), unless I've missed something.
If there are jobs testing cri-o development, I think that would need to be discussed here.
Similarly, to create a cluster to test Kubernetes we ultimately use other projects, but we are not operating CI for those projects.
(So we would not remove any of those either; but we would remove, say, a job testing cilium @ HEAD. The difference is which repo's changes are under test.)
> Right now this represents an ambiguous policy gap where we only provide CI for Kubernetes subprojects ... except cadvisor and containerd
The containerd prow jobs are specifically testing compatibility with Kubernetes using node e2e tests. We have separate CI for core containerd and for CRI using critest that runs in GitHub Actions via the containerd org (and funded by CNCF). I think if we want to migrate containerd completely off prow jobs, we'd need guidance on how to best run the node e2e tests elsewhere.
> The containerd prow jobs are specifically testing compatibility with Kubernetes using node e2e tests.
Sure, but every other landscape project, especially CNI/CSI implementations, would argue the same, and yet we're not hosting CI for those repos (unless they are a subproject).
> We have separate CI for core containerd and for CRI using critest that runs in GitHub Actions via the containerd org (and funded by CNCF). I think if we want to migrate containerd completely off prow jobs, we'd need guidance on how to best run the node e2e tests elsewhere.
SIG Node should know best how to run node e2e tests, but I think you could run these against a Vagrant VM in Actions, as suggested by @upodroid in the Slack thread.
I've opened https://github.com/containerd/containerd/issues/11486 to track the work on the containerd side.
> but I think you could run these against a vagrant VM in actions as suggested by @upodroid in the slack thread.
Maybe make that limactl instead though. https://github.com/lima-vm/lima
edit: https://github.com/kubernetes-sigs/kind/blob/main/.github/workflows/vm.yaml is using this
> Lima
I also created https://github.com/lima-vm/lima-actions to simplify the setup:
```yaml
steps:
  - uses: actions/checkout@v4
  - uses: lima-vm/lima-actions/setup@v1
    id: lima-actions-setup
  - uses: actions/cache@v4
    with:
      path: ~/.cache/lima
      key: lima-${{ steps.lima-actions-setup.outputs.version }}
  - run: limactl start --plain --name=default --cpus=1 --memory=1 template://fedora
  - uses: lima-vm/lima-actions/ssh@v1
  - run: rsync -a -e ssh . lima-default:/tmp/repo
  - run: ssh lima-default ls -l /tmp/repo
```
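For the containerd node e2e presubmit case discussed above, a further step could run the suite inside the guest over SSH. This is a hedged sketch, not a tested configuration: `make test-e2e-node` is a real target in kubernetes/kubernetes, but the checkout path, the assumption that containerd is already installed and running in the guest, and the runtime endpoint flag are illustrative.

```yaml
# Hypothetical extra step: run node e2e inside the Lima guest.
# Assumes /tmp/repo holds a kubernetes/kubernetes checkout and that
# containerd is already running in the guest (illustrative only).
- run: >
    ssh lima-default
    "cd /tmp/repo && make test-e2e-node
     TEST_ARGS='--container-runtime-endpoint=unix:///run/containerd/containerd.sock'"
```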
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue as fresh with `/remove-lifecycle stale`
- Close this issue with `/close`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
/remove-lifecycle stale
I think this is primarily https://github.com/containerd/containerd/issues/11486
Testgrid will be tricky later, as that's currently only DNS from us (the service itself does not run on our infra).
#7710 was resolved, so this is the main outstanding issue with conflated infra after all of the migrations out of vendor-owned accounts into k8s-infra / CNCF-owned ones.
Opened https://github.com/containerd/containerd/pull/12028 to migrate the node e2e presubmit to GitHub Actions.
Is the focus of this effort to just remove the containerd presubmit jobs?
From what I can tell, the periodic and postsubmit jobs are for building cached versions of containerd so they don't require rebuild on every CI run.
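For context, such a job looks roughly like the following in prow's job config. This is a hedged sketch only: the job name, interval, image, and build command here are assumptions for illustration, not the actual configuration in kubernetes/test-infra.

```yaml
# Hypothetical prow periodic that builds containerd from a branch so that
# node e2e jobs can consume a cached build (all names illustrative).
periodics:
- name: ci-containerd-build-test
  interval: 6h
  decorate: true
  extra_refs:
  - org: containerd
    repo: containerd
    base_ref: main
  spec:
    containers:
    - image: gcr.io/k8s-staging-test-infra/kubekins-e2e:latest
      command:
      - make
      args:
      - build
```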
> Is the focus of this effort to just remove the containerd presubmit jobs?
IMHO the focus was to totally decouple the infra from these projects, so we no longer have exceptions to "SIG K8s Infra == infra for the Kubernetes orgs" ... to avoid awkward/difficult carve-outs.
I've also stepped down as a TL though, so this is just my take on the original context.
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Close this issue with `/close`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten
/remove-lifecycle rotten
https://github.com/kubernetes/test-infra/pull/36004 landed recently to decouple cadvisor.
I think what remains is containerd.