image-builder
Flatcar may inadvertently update during image creation
In https://github.com/kubernetes-sigs/image-builder/pull/701/commits/c73b6e7a28b93dd03a747a8454600838e656ff5c we've disabled Flatcar updates. However, this change leaves a time gap which still allows Flatcar to download updates and reboot during image creation, resulting in either a broken build or an image with an unexpected version.
Following is a high-level description of the Flatcar image build process for ISO-based Packer builds (e.g. QEMU, OVA):
1. We boot Flatcar from an official ISO or base image (e.g. AMI) of a specific Flatcar release.
2. We execute `flatcar-install` while passing an Ignition file and reboot.
3. Flatcar boots from disk.
4. Ansible is executed.
5. An image is created from the provisioned machine and the machine is terminated.
Currently, Flatcar may inadvertently update between stages 3 and 5 (inclusive). To prevent this, we need to disable updates in the Ignition config we pass to `flatcar-install` at stage 2.
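A minimal sketch of what such an Ignition config could look like, generated with Python. It assumes that disabling updates means masking the `update-engine.service` and `locksmithd.service` units, and that the build uses an Ignition spec 3.x config; the `3.3.0` version string and the helper name `build_ignition_config` are illustrative assumptions, not taken from the actual fix.

```python
import json

# Units that drive automatic updates and reboots on Flatcar.
UPDATE_UNITS = ["update-engine.service", "locksmithd.service"]


def build_ignition_config(units=UPDATE_UNITS, spec_version="3.3.0"):
    """Return an Ignition config (as a dict) that masks the given systemd units.

    The spec version is an assumption; use whichever version the
    targeted Flatcar release supports.
    """
    return {
        "ignition": {"version": spec_version},
        "systemd": {
            # "mask": True asks Ignition to mask each unit on first boot.
            "units": [{"name": name, "mask": True} for name in units],
        },
    }


ignition_json = json.dumps(build_ignition_config(), indent=2)
print(ignition_json)
```

The resulting JSON would be written to a file and handed to `flatcar-install` at stage 2, so that the units are already masked when Flatcar first boots from disk at stage 3.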
NOTE: I've confirmed this bug exists for OVA builds; however, it could also apply to AWS and Azure builds. Although these builds aren't based on booting from an ISO, there is still a phase where the temporary VM is running while Ansible is executing. We should double-check that updates are disabled during that phase, too.
TODO
- [ ] Ensure updates are disabled throughout the entire build process as well as the final image for all supported providers.
  - [ ] AMI
  - [ ] Azure (SIG + VHD)
  - [x] OVA - handled by #895
  - [x] QEMU - handled by #895
  - [x] Raw - handled by #895
/kind bug
/assign
Fixed in https://github.com/flatcar-linux/flatcar-packer-qemu/pull/5. Once merged, we need to update the following:
https://github.com/kubernetes-sigs/image-builder/blob/2e85b15e9e75f5a194eb0c4c3f21c47918281a41/images/capi/packer/qemu/qemu-flatcar.json#L3
https://github.com/kubernetes-sigs/image-builder/blob/2e85b15e9e75f5a194eb0c4c3f21c47918281a41/images/capi/packer/raw/raw-flatcar.json#L3
Following https://github.com/kubernetes-sigs/image-builder/pull/873#discussion_r866007976, it looks like the solution to this issue depends on https://github.com/kubernetes-sigs/image-builder/issues/890.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with `/remove-lifecycle stale`
- Mark this issue or PR as rotten with `/lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
Confirmed that updates are enabled for AMIs:
    core@ip-172-31-5-218 ~ $ systemctl status update-engine
    ● update-engine.service - Update Engine
         Loaded: loaded (/usr/lib/systemd/system/update-engine.service; disabled; vendor preset: disabled)
         Active: active (running) since Fri 2023-03-17 15:10:15 UTC; 3min 12s ago
       Main PID: 1351 (update_engine)
          Tasks: 2 (limit: 15114)
         Memory: 10.4M
            CPU: 80ms
         CGroup: /system.slice/update-engine.service
                 └─1351 /usr/sbin/update_engine -foreground -logtostderr
    core@ip-172-31-5-218 ~ $ systemctl status locksmithd
    ● locksmithd.service - Cluster reboot manager
         Loaded: loaded (/usr/lib/systemd/system/locksmithd.service; disabled; vendor preset: disabled)
         Active: active (running) since Fri 2023-03-17 15:10:15 UTC; 4min 13s ago
       Main PID: 1571 (locksmithd)
          Tasks: 6 (limit: 15114)
         Memory: 16.1M (limit: 32.0M)
            CPU: 16ms
         CGroup: /system.slice/locksmithd.service
                 └─1571 /usr/lib/locksmith/locksmithd
We need to mask these units during image creation.
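As a rough illustration of what masking these units at build time might involve, the sketch below builds the `systemctl mask --now` invocations for both units and only executes them when explicitly asked to. The helper names `mask_commands` and `mask_units` are hypothetical, and this is not necessarily how the linked fix implements it; actually running the commands requires root on the build VM.

```python
import subprocess

# The two units shown as active in the status output above.
UPDATE_UNITS = ("update-engine.service", "locksmithd.service")


def mask_commands(units=UPDATE_UNITS):
    """Build the systemctl invocations that stop and mask each unit."""
    commands = []
    for unit in units:
        # `mask --now` masks the unit and also stops it if it is running.
        commands.append(["systemctl", "mask", "--now", unit])
    return commands


def mask_units(units=UPDATE_UNITS, dry_run=True):
    """Run the commands, or only return them when dry_run is True."""
    commands = mask_commands(units)
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError if systemctl fails.
            subprocess.run(cmd, check=True)
    return commands


# Dry run: print what would be executed without touching the system.
for cmd in mask_units():
    print(" ".join(cmd))
```

Masking rather than merely disabling matters here because both units are reported as `disabled` yet still running, so they are evidently being started by something other than their own enablement state.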
Fix in https://github.com/kubernetes-sigs/image-builder/pull/1150.