kubespray icon indicating copy to clipboard operation
kubespray copied to clipboard

CI: rework pipeline: short/extended based on labels

Open ant31 opened this issue 1 year ago • 6 comments

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

  • The CI VMs are requesting more resources than they are using. It's filling up the servers quickly
  • Too many VMs are created per PR
  • Vagrant and Molecule jobs are using deprecated/archived docker-machine project
  1. This PR reduces the VM resource requests to help pack more VMs per server.
  2. Reduce the number of jobs that are triggered automatically on each PR.
  3. To run additional tests, it's possible to label the pr with "ci-extended", which should be equivalent to the previous pipeline
  4. To run a shorter pipeline : add the label ci-short
  5. Vagrant and Molecule jobs are running inside Kubevirt VM
  6. pre-commit runs in a single job: reduce resources usage without increasing time

Which issue(s) this PR fixes:

Fixes #8786

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

ant31 avatar Jun 25 '24 12:06 ant31

/retest

ant31 avatar Jun 25 '24 15:06 ant31

Hey! This kind of approach seems quite sensible to me, but what the policy should be here? Any code change we put the extended test (meaning not a doc only change for instance)? Or is it depending on if the reviewers (as of the persons who is reviewing a PR not the role reviewer on the repo) of a PR think it needs the extended test or not?

Also the kubespray reviewers who are not approvers can't edit labels I believe or maybe that would be just me, not sure :man_shrugging:.

MrFreezeex avatar Jun 25 '24 18:06 MrFreezeex

@MrFreezeex it introduces the mechanism to do it. How it is used is up for discussion and the threasold can move easily. At least, documentation PR should not trigger massive pipeline.

Which label trigger what can be changed too. For example, it could trigger the extended when there's a lgtm, or size/L , or area/xyz

In this PR the default pipeline is close to the previous one (removed few redundant test imo) so it's not a big change.

  • Ci-short (manual)
  • Ci-long (default)
  • Ci-extended (manual) As mentioned can discuss what is default what is not.

About applying labels, yes it would be the role of the reviewer to determine it. PR owner could self assess too. For the permissions, we can try to use prow /label or an equivalent bot.

ant31 avatar Jun 25 '24 20:06 ant31

Beside the labels, important change is that there's no more stages, but job dependencies. It allows to streamline the job. It's not waiting the last job of stage 1 before starting the ones on stage2.

As a result, those new pipelines runs in ~25min.

https://gitlab.com/kargo-ci/kubernetes-sigs-kubespray/-/pipelines/1347530306

ant31 avatar Jun 26 '24 08:06 ant31

@MrFreezeex @VannTen @yankay

This is needed to use the new infrastructure, should make CI go through faster, and hopefully improve contributors' experiences. Both infra are up and running, so we'll have to tear down the previous one soon.

The main blocking point could be the job that runs by default. a) that can be changed easily; please comment on which jobs you want to add, or we can also do follow-up PR. b) we can't test every combination of every network+container-engine+os. The tradeoff was to test at least once every option, but not the combination of them.

I tried to remove from the default pipeline only a few: for example, Instead of :

  • ubuntu-calico
  • centos-calico,
  • ubuntu-flannel
  • centos-flannel

I go for:

  • ubuntu-calico
  • centos-flannel

We can still run full tests on schedule (daily) and for tags/releases, but not on every PR.

ant31 avatar Jun 29 '24 09:06 ant31

Thanks @ant31 /approve

yankay avatar Jul 01 '24 10:07 yankay

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ant31, MrFreezeex, yankay

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar Jul 01 '24 10:07 k8s-ci-robot

/lgtm

yankay avatar Jul 01 '24 10:07 yankay

/cherrypick release-2.25

tico88612 avatar Jul 07 '24 03:07 tico88612

@tico88612: #11324 failed to apply on top of branch "release-2.25":

Applying: CI: reduce VM resources requests to improve scheduling
Applying: CI: Reduce default jobs; add labels(ci-full/extended) to run more test
Using index info to reconstruct a base tree...
M	.gitlab-ci/packet.yml
Falling back to patching base and 3-way merge...
Auto-merging .gitlab-ci/packet.yml
Applying: CI: use jobs dependencies instead of stages
Using index info to reconstruct a base tree...
M	.gitlab-ci.yml
M	.gitlab-ci/lint.yml
M	.gitlab-ci/packet.yml
Falling back to patching base and 3-way merge...
Auto-merging .gitlab-ci/packet.yml
Auto-merging .gitlab-ci/lint.yml
CONFLICT (content): Merge conflict in .gitlab-ci/lint.yml
Auto-merging .gitlab-ci.yml
CONFLICT (content): Merge conflict in .gitlab-ci.yml
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0003 CI: use jobs dependencies instead of stages
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherrypick release-2.25

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

/cherrypick release-2.25

tico88612 avatar Jul 20 '24 02:07 tico88612

@tico88612: #11324 failed to apply on top of branch "release-2.25":

Applying: CI: reduce VM resources requests to improve scheduling
Applying: CI: Reduce default jobs; add labels(ci-full/extended) to run more test
Applying: CI: use jobs dependencies instead of stages
Using index info to reconstruct a base tree...
M	.gitlab-ci.yml
Falling back to patching base and 3-way merge...
Auto-merging .gitlab-ci.yml
CONFLICT (content): Merge conflict in .gitlab-ci.yml
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0003 CI: use jobs dependencies instead of stages
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherrypick release-2.25

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

/cherrypick release-2.25

tico88612 avatar Aug 07 '24 13:08 tico88612

@tico88612: new pull request created: #11424

In response to this:

/cherrypick release-2.25

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.