coreos-assembler
coreos-assembler copied to clipboard
mantle/kola: copy basic tests into formal kola workflow
cmd-kola
has non-conformed basic kola tests. These tests do not benefit from the supporting functionality around kola test's such as denylisting or other functionality. reduce cmd-kola
and conform the basic kola tests.
fixes https://github.com/coreos/coreos-assembler/issues/3418
This is a small rework on an already existing; but stale pr. Cherry-picked and made a few touches.
hmm, I might have missed it but I did not see what triggers the CI to run kola -p qemu --build latest run --rerun --allow-rerun-success=tags=needs-internet --on-warn-failure-exit-77 --arch=x86_64 --basic-qemu-scenarios
If someone could point me to it I can make a pr and remove it.
hmm, I might have missed it but I did not see what triggers the CI to run
kola -p qemu --build latest run --rerun --allow-rerun-success=tags=needs-internet --on-warn-failure-exit-77 --arch=x86_64 --basic-qemu-scenarios
If someone could point me to it I can make a pr and remove it.
We also need to touch coreos-ci-lib
hmm, I might have missed it but I did not see what triggers the CI to run
kola -p qemu --build latest run --rerun --allow-rerun-success=tags=needs-internet --on-warn-failure-exit-77 --arch=x86_64 --basic-qemu-scenarios
If someone could point me to it I can make a pr and remove it.We also need to touch coreos-ci-lib
awesome; made a pr. thank you
Trying to summarize what we'll run:
x86_64 | aarch64 | ppc64le | s390x | |
---|---|---|---|---|
boot type | bios/uefi/uefi-secure | bios | bios | bios |
nvme | yes | no | no | no |
Number of combination | 6 | 1 | 1 | 1 |
So the question is: Should we really run all 6 combination for the test matrix on x86_64 or just a subset of those for nvme? For now we were only running basic (nvme), uefi, uefi-secure as Michael said.
Shouldn't we run uefi & uefi-secure boot on aarch64 as well? Feels like that should be working.
EDIT: Re-reading the PR, this runs all 6 ones for x86_64 so should be OK.
Re-reading the PR, this runs all 6 ones for x86_64 so should be OK
I'm good with that. It makes sense that we would test bios, uefi, and uefi-secure with both nvme and non-nvme conditions on x86_64.
So we just need to make sure that the nvme and uefi/uefi-secure tests dont run on other arches.
Shouldn't we run uefi & uefi-secure boot on aarch64 as well? Feels like that should be working.
Unless we want these ones to run on aarch64.
So we just need to make sure that the nvme and uefi/uefi-secure tests dont run on other arches.
If thats the case It sounds like I need to update the field Architectures
to only run x86 on all but bios?
If thats the case It sounds like I need to update the field Architectures to only run x86 on all but bios?
Yes, I believe that should do it!
So overall I think there are at least two main ways I see we can approach this:
1. Have knobs as part of `register.Test` to say e.g. which firmware to use for the test. 2. Use `ClusterSize: 0` and manually bring up the VMs ourselves. This is done in a bunch of QEMU tests already where we need more control over the VM settings. (Do a `git grep 'ClusterSize: \+0'` to see them.)
Leaning towards the latter to avoid having to add a bunch more knobs in
register.Test
. You can have a single function that sets up the tests and takes arguments for the variable knobs and each test calls that one function with the different combinations we want (see theluks.*
tests for an example of this pattern).Re. the combinations, my vote is to match the current coverage and then we can extend it in the future. But not strongly against either!
Ok thank you for the pointer. I am looking into doing this.
@jlebon thank you for your review, and thats a fair question, I did not check it when I pushed. It was before; but I need to double check.
Thank you for your time and feedback, and I will try and get to this in the next week or so.
RHCOS CI will be failing for this PR because we remove the option it uses so this is expected.
Ok locally all tests are passing :)
This looks great!
Thank you for all of your help and back and forth on this review, its been a learning experience for me, and Its been interesting to learn more about kola.
Another safer option is to change coreos-ci-lib to check if
kola list
has e.g.basic.nvme
(or thatcosa kola --help
doesn't have--basic-qemu-scenarios
). This is a signal that it's working with a cosa that has this PR. If yes, then use the new logic proposed in coreos/coreos-ci-lib#154. If no, then use the old logic. Then we can merge this PR, verify it works well, etc... and work on backporting it to older branches at our own pace. Once all backports are done, we clean up the old logic entirely from coreos-ci-lib.
Alright lets def opt for this, I can see about making changes to the coreos-ci-lib pr here shortly.
Another PR to prepare is the update for the RHCOS CI script but it should be a fairly small change.
https://github.com/openshift/os/blob/master/ci/prow-entrypoint.sh#L126
Since https://github.com/coreos/coreos-ci-lib/pull/154 just landed I should be able to re-run these test and get it passing.
@travier Thank you pr here => https://github.com/openshift/os/pull/1458
/retest
and work on backporting it to older branches at our own pace. Once all backports are done, we clean up the old logic entirely from coreos-ci-lib.
I talked to Steven about this yesterday. If the backports apply relatively easily, let's do that. Otherwise, it's fine also to carry that coreos-ci-lib logic for now until we fully drop older streams that are not worth backporting to.
/retest
Pushed a trivial commit to fix the Prow CI issue.
/approve /lgtm
/retest
Looked like the other commit might not had made it to the pipeline before the retest had started as we got some funk. Retesting to see.
@jlebon thank you!
@jlebon why would prow have this error --basic-qemu-scenarios, it seems odd as our changes have landed in https://github.com/coreos/coreos-ci-lib/pull/154
/test ci/prow/images
@travier: The specified target(s) for /test
were not found.
The following commands are available to trigger required jobs:
-
/test images
-
/test rhcos
Use /test all
to run all jobs.
In response to this:
/test ci/prow/images
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/test images
Needs https://github.com/coreos/coreos-assembler/pull/3775
/retest
I think the bug that was breaking CI here should be fixed by https://github.com/openshift/os/pull/1478. We just need to get https://github.com/coreos/coreos-assembler/pull/3775 in, and then this should be unblocked.