kubernetes
[WIP] Booting/testing something in CI
Very incomplete right now, will work on it as time allows.
This is hitting issues with the linuxkit build getting randomly SIGKILL'd. I think it is getting squashed by the OOM killer, so I raised https://github.com/moby/tool/pull/191 to reduce the memory overheads. I'm also experimenting with the resource_class to see if that helps in the meantime (as a short-term band-aid, although it looks like it will become a paid-only feature before too long).
Early indications are that medium+ (3 CPU 6GB RAM) is insufficient while large (4 CPU 8GB RAM) is enough.
In https://github.com/moby/tool/pull/191 I observed that the initial RAM usage before my changes was 6.7GB, which seems consistent with getting SIGKILL'd on a 6GB limit. I also noted that tar used more like 2GB, which explains why those builds mostly work, since the default medium has 4GB RAM. They do still fail occasionally, though, so perhaps there can be spikes or differences around other factors like content trust.
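The sizing reasoning above can be sketched as a small helper that maps a measured peak memory figure to the smallest resource class it fits in. The limits are the ones quoted in this thread (medium 4GB, medium+ 6GB, large 8GB); the function name is hypothetical, and current CircleCI class names and sizes may differ.

```shell
# Map a peak memory figure (in MB) to the smallest resource class whose
# RAM limit is above it. Limits are the figures quoted in this thread,
# not an authoritative list of what CircleCI offers.
smallest_class_for_mb() {
  peak_mb=$1
  if   [ "$peak_mb" -lt 4096 ]; then echo medium
  elif [ "$peak_mb" -lt 6144 ]; then echo medium+
  elif [ "$peak_mb" -lt 8192 ]; then echo large
  else echo none
  fi
}
```

With the numbers above, 6.7GB (about 6861MB) lands in large and 2GB (2048MB) lands in medium, which matches what was observed. The helper leaves no headroom for spikes, so a real check would probably want a margin.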
Please sign your commits following these rules: https://github.com/moby/moby/blob/master/CONTRIBUTING.md#sign-your-work
The easiest way to do this is to amend the last commit:
$ git clone -b "ci-boot-something" git@github.com:ijc/linuxkit-kubernetes.git somewhere
$ cd somewhere
$ git rebase -i HEAD~842354248512
editor opens
change each 'pick' to 'edit'
save the file and quit
$ git commit --amend -s --no-edit
$ git rebase --continue # and repeat the amend for each commit
$ git push -f
Amending updates the existing PR. You DO NOT need to open a new one.
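Before pushing, it can help to check which commits are still missing a sign-off. A minimal sketch, assuming a git new enough to support --invert-grep; the function name is hypothetical, and the revision range is whatever base your branch was cut from.

```shell
# List commits in a revision range whose message has no Signed-off-by line.
# Usage: unsigned_commits <rev-range>
# e.g.   unsigned_commits origin/master..HEAD   (origin/master is an assumed base)
unsigned_commits() {
  git log --format='%h %s' --invert-grep --grep='^Signed-off-by: ' "$1"
}
```

Empty output means every commit in the range carries a sign-off. Recent git versions also have git rebase --signoff, which adds the trailer to every rebased commit in one go.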
Booting using qemu in CI (so no KVM) hits a hardcoded 30min timeout in kubeadm init. In any case 30min is far far far too long (that's on top of a minute or so to boot and another for sshd to actually start).
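Since boot, sshd startup and kubeadm init each take an unpredictable amount of time under emulation, the CI script probably wants its own explicit deadlines rather than relying on the hardcoded one. A minimal sketch of a retry-until-deadline helper; the function name and the example values in the comments are assumptions, not something the CI currently does.

```shell
# Retry a command until it succeeds or a deadline passes.
# Usage: wait_for <timeout-seconds> <interval-seconds> <command> [args...]
wait_for() {
  deadline=$(( $(date +%s) + $1 ))
  interval=$2
  shift 2
  until "$@"; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "timed out waiting for: $*" >&2
      return 1
    fi
    sleep "$interval"
  done
}

# Example (hypothetical host name and limits):
#   wait_for 300 5 ssh -o ConnectTimeout=5 root@kube-master true
```

The same idea applies to the init step itself: wrapping it in coreutils timeout with a much smaller limit would make the job fail fast instead of sitting there for half an hour.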
After #35 gets merged, we will have a way of running e2e tests. Making all tests pass is a separate problem, but it'd be nice to incorporate that into a nightly/weekly job (currently I'm seeing that it takes about 30min on my laptop, but we skip a fair chunk of tests).
> In any case 30min is far far far too long (that's on top of a minute or so to boot and another for sshd to actually start).
I'd expect CircleCI to also time out in a similar time frame of around 30min. Besides that, the timing you've quoted would be unbearable if we were to run even a subset of the e2e suite.
A large VM with nested virtualisation would mean that we don't need to upload anything and seems generally easier to start with, but if upload speed and image storage costs are negligible, then linuxkit push + linuxkit run is probably sufficient. Although the need for clustering would require extra configuration (e.g. VPC etc), so one beefy box seems easier again and potentially a little easier to port (as KVM seems like the lowest common denominator).
> ...so one beefy box seems easier again and potentially a little easier to port (as KVM seems like the lowest common denominator).
I am mostly thinking of someone building up LinuxKit CI for their own projects, not so much about us moving CI from one place to another.
We have no macOS workers in CircleCI (it seems to be a paid-only feature), so the jobs just hang forever waiting to be assigned.
It's been suggested we could try using the same gcp account as the linuxkit/linuxkit CI and use linuxkit run gcp. Need to take care not to leak VMs though, and to clean up thoroughly.
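One way to avoid leaking VMs is to tie the teardown to the test command itself, so it runs whether the tests pass or fail. A minimal sketch using an EXIT trap in a subshell; the function name is hypothetical and the actual cleanup command (deleting the instance and the uploaded image) is left to the caller.

```shell
# Run a command and always run a cleanup command afterwards, preserving
# the exit status of the main command.
# Usage: run_with_cleanup '<cleanup command string>' <command> [args...]
run_with_cleanup() (
  cleanup_cmd=$1
  shift
  # The body runs in a subshell, so the trap fires when this function
  # finishes, on success or failure, without touching the caller's traps.
  trap 'eval "$cleanup_cmd"' EXIT
  "$@"
)
```

This does not cover the CI worker itself being killed, so a periodic sweep for stale instances would still be worth having as a backstop.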
> It's been suggested we could try using the same gcp account as the linuxkit/linuxkit CI and use linuxkit run gcp. Need to take care not to leak VMs though, and to clean up thoroughly.
This makes sense. The only concern that I have is how long it would take to upload a build from Circle to GCP... We would also have to clean up the image, as I'm pretty sure they charge for storing them. The alternative would be to build and run on a VM that supports nested virt. IIRC there is a new instance type on GCP that supports nested virt.
@rn is experimenting with the nested virt stuff on the main linuxkit repo now, will wait and see how he gets on.
It might indeed be too much data to upload for each PR; it may turn out better to deploy a VM to build the images, whether we then test them nested or in a new VM altogether.
Also, if we use nested virt, it means someone can reproduce the tests locally very easily, and we could even provide a generalised script for folks to re-use in private LinuxKit projects. Just reiterating what I've said earlier...
https://github.com/linuxkit/linuxkit/pull/2871 has the right runes for enabling nested virt. The image needs to have a "special license" (some string in the Licenses field) and the instance needs to be Haswell or newer.
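Whether nested virt actually took effect on a given worker can be checked from inside the VM before booting anything. A minimal sketch, assuming a Linux worker where KVM shows up as /dev/kvm; the function name is hypothetical, and it falls back to software emulation (the slow path the CircleCI jobs are on today).

```shell
# Print the QEMU accelerator to use: "kvm" if the KVM device node exists
# and is readable and writable by the current user, otherwise "tcg".
# The device path is a parameter only so the fallback can be exercised.
qemu_accel() {
  kvm_dev=${1:-/dev/kvm}
  if [ -c "$kvm_dev" ] && [ -r "$kvm_dev" ] && [ -w "$kvm_dev" ]; then
    echo kvm
  else
    echo tcg
  fi
}

# Example: qemu-system-x86_64 -accel "$(qemu_accel)" <other options>
```

A script built around this would behave the same on a nested-virt GCP instance, a developer laptop, or a CI worker without KVM, only slower in the last case.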