kairos
🐛 Take Over installation fails due to "undefined system source to install" on bare metal
Issue
On bare metal, the steps from https://kairos.io/docs/installation/takeover/ fail after pulling the $IMAGE. This was tested with openSUSE MicroOS as the host:
undefined system source to install
Steps
$ whoami
root
$ lsblk /dev/sdb
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sdb 8:16 0 111.8G 0 disk
export DEVICE=/dev/sdb
export IMAGE=quay.io/kairos/core-opensuse-leap:v2.1.0
cat <<'EOF' > config.yaml
#cloud-config
users:
- name: "kairos"
  passwd: "kairos"
  ssh_authorized_keys:
  - github:mudler
EOF
export CONFIG_FILE=config.yaml
$ docker run --privileged -v $PWD:/data -v /dev:/dev -ti $IMAGE kairos-agent manual-install --device $DEVICE /data/$CONFIG_FILE
INFO[2023-06-19T09:42:17Z] Starting elemental version v2.0.1
undefined system source to install
Notes
/dev/sdb is a clean GPT disklabel with no partitions
Hi @GrabbenD,
can you try with an older version of Kairos? Is this a regression? The last time I tried it was on the 1.x branch. Can you try with 1.x and let us know?
@mudler
Yup, I'm seeing the same issue with v1.6.0 and with the latest version, v2.3.0, which I just tried.
For reference, this issue can also be reproduced with kairos-agent interactive-install instead of kairos-agent manual-install:
$ podman run --privileged -v $PWD:/data -v /dev:/dev -ti quay.io/kairos/core-opensuse-leap:v1.6.0 kairos-agent interactive-install
<...>
┌───────────────────────────────────────────────────────────────────────────┐
| Interactive installation. Documentation is available at https://kairos.io.|
└──────────────────────────────────────────────────────────── Installation ─┘
INFO Available Disks:
INFO /dev/sda: unknown (300.00 GiB)
INFO /dev/sdb: unknown (1863.02 GiB)
What's the target install device? /dev/sdb
User to setup kairos
Password ●●●●●●
SSH access (rsakey, github/gitlab supported, comma-separated)
Are settings ok? Y
INFO Starting installation
INFO #cloud-config
install:
  device: /dev/sdb
name: Config generated by the installer
stages:
  network:
  - users:
      kairos:
        groups:
        - admin
        name: kairos
        passwd: kairos
        ssh_authorized_keys:
        - ""
INFO[2023-07-19T15:06:06Z] Starting elemental version 0.20230222.1+kairos
ERRO[2023-07-19T15:06:06Z] invalid install command setup undefined system source to install
Error: undefined system source to install
exit status 1
$ podman run --privileged -v $PWD:/data -v /dev:/dev -ti quay.io/kairos/core-opensuse-leap:v2.3.0 kairos-agent interactive-install
<...>
┌───────────────────────────────────────────────────────────────────────────┐
| Interactive installation. Documentation is available at https://kairos.io.|
└──────────────────────────────────────────────────────────── Installation ─┘
INFO Available Disks:
INFO /dev/sda: unknown (300.00 GiB)
INFO /dev/sdb: unknown (1863.02 GiB)
What's the target install device? /dev/sdb
User to setup kairos
Password ●●●●●
SSH access (rsakey, github/gitlab supported, comma-separated)
Are settings ok? Y
INFO Starting installation
INFO #cloud-config
install:
  device: /dev/sdb
name: Config generated by the installer
stages:
  initramfs:
  - users:
      kairos:
        groups:
        - admin
        name: kairos
        passwd: kairo
INFO[2023-07-19T15:22:44Z] kairos-agent version v2.1.10
ERROR undefined system source to install
undefined system source to install
$ podman run --privileged -v $PWD:/data -v /dev:/dev -ti quay.io/kairos/core-opensuse-leap:v1.6.0 kairos-agent manual-install --device $DEVICE /data/$CONFIG_FILE
INFO[2023-07-19T15:19:42Z] Starting elemental version 0.20230222.1+kairos
ERRO[2023-07-19T15:19:42Z] invalid install command setup undefined system source to install
Error: undefined system source to install
exit status 1
$ podman run --privileged -v $PWD:/data -v /dev:/dev -ti quay.io/kairos/core-opensuse-leap:v2.3.0 kairos-agent manual-install --device $DEVICE /data/$CONFIG_FILE
INFO[2023-07-19T15:19:15Z] kairos-agent version v2.1.10
undefined system source to install
This issue can be observed on bare metal as well as in a Hyper-V VM. I've tested with both Podman and Docker.
For clarification, the host system is Arch Linux (I tested v2.3.0 and v1.6.0 with Podman and Docker today). In the past I also tried with openSUSE MicroOS, as noted above.
I have this issue as well, trying with a few current (2.4.3) images. I also tried specifying an OCI source but was unsure if I was doing so correctly. That got through partitioning, but failed with a zero-length image.
OK, yeah, the other install media mount the install source at /run/rootfsbase.
With Docker this does not work because there is no "boot" process per se, so nothing mounts the media (ISO, USB, PXE) into that directory, and the agent thinks it's empty.
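One way to confirm this is to look at the state of /run/rootfsbase inside the container before running the installer. A minimal sketch: the helper name `describe_source_dir` is made up for illustration and is not part of kairos-agent; only the /run/rootfsbase path comes from the explanation above.

```shell
#!/bin/sh
# Hypothetical diagnostic helper (not part of kairos-agent).
# Prints "missing", "empty" or "populated" for the given directory.
describe_source_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing"
    # The trailing slash makes ls look inside the target if $dir is a symlink.
    elif [ -z "$(ls -A "$dir/" 2>/dev/null)" ]; then
        echo "empty"
    else
        echo "populated"
    fi
}
```

Without the helper, `docker run --rm $IMAGE ls -A /run/rootfsbase` gives roughly the same information in one line. On an unpatched image it would be expected to print nothing or an error, going by the explanation above.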
There are two ways of fixing this:
- Link /run/rootfsbase to / in the images. This should be fine once you boot, as it gets remounted via dracut.
- Make the install source configurable via the config file or a flag, try that first, then fall back to the default ISO tree, and only then fail.
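The lookup order in the second option could look roughly like the sketch below. This is an illustration in shell, not the agent's actual code: the function name `resolve_install_source` and its two arguments are assumptions, and only the /run/rootfsbase default and the error message are taken from this issue.

```shell
#!/bin/sh
# Sketch of the proposed lookup order; NOT the agent's real implementation.
# $1: source set by the user via config file or flag (may be empty)
# $2: default live-media directory (defaults to /run/rootfsbase)
resolve_install_source() {
    configured="$1"
    default_dir="${2:-/run/rootfsbase}"

    # 1. An explicitly configured source always wins.
    if [ -n "$configured" ]; then
        echo "$configured"
        return 0
    fi

    # 2. Fall back to the default tree, but only if something is mounted there.
    if [ -d "$default_dir" ] && [ -n "$(ls -A "$default_dir/" 2>/dev/null)" ]; then
        echo "$default_dir"
        return 0
    fi

    # 3. Nothing usable: fail with the message seen in this issue.
    echo "undefined system source to install" >&2
    return 1
}
```

With this order, a container run (where nothing populates the default directory) would still work as long as the user supplies a source, and ISO/USB/PXE boots would keep their current behaviour.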
As a workaround, the first option is the easiest and can be applied directly by building your own Dockerfile (example with quay.io/kairos/ubuntu:23.10-core-amd64-generic-v2.5.0):
FROM quay.io/kairos/ubuntu:23.10-core-amd64-generic-v2.5.0
RUN ln -s / /run/rootfsbase
Build that locally with docker build -t quay.io/kairos/ubuntu:23.10-core-amd64-generic-v2.5.0-custom . and use it as your $IMAGE in the takeover docs:
export DEVICE=/dev/sda
export IMAGE=quay.io/kairos/ubuntu:23.10-core-amd64-generic-v2.5.0-custom
cat <<'EOF' > config.yaml
#cloud-config
users:
- name: "kairos"
  passwd: "kairos"
  ssh_authorized_keys:
  - github:mudler
EOF
export CONFIG_FILE=config.yaml
docker run --privileged -v $PWD:/data -v /dev:/dev -ti $IMAGE kairos-agent manual-install --device $DEVICE /data/$CONFIG_FILE
I tested this myself and could do a takeover install.
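For anyone applying the same workaround to a different base image, the Dockerfile can be generated instead of written by hand. This is only a convenience sketch; the function name `write_takeover_dockerfile` is hypothetical, and the two Dockerfile lines are exactly the ones shown above.

```shell
#!/bin/sh
# Hypothetical helper: writes the two-line workaround Dockerfile for any base image.
# $1: base image reference, $2: output path (defaults to ./Dockerfile)
write_takeover_dockerfile() {
    base="$1"
    out="${2:-Dockerfile}"
    if [ -z "$base" ]; then
        echo "usage: write_takeover_dockerfile BASE_IMAGE [OUTPUT]" >&2
        return 1
    fi
    # Same content as the manual workaround: symlink / to /run/rootfsbase.
    printf 'FROM %s\nRUN ln -s / /run/rootfsbase\n' "$base" > "$out"
}
```

For example, `write_takeover_dockerfile "$IMAGE" Dockerfile && docker build -t "$IMAGE-custom" .` would produce an image tagged with the same `-custom` suffix used above.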
The other option needs fixes directly in the agent, so it will take a bit more time to trickle down.
It can also be fixed by setting the --source flag, so that may be easier xDDD
With the --source flag it fails to calculate the size...
We need to implement a test to make sure we don't break it again in the future.
This seems messy to implement. Maybe we should do it in the agent by installing from a docker image into a loop device and then checking that loop device instead?
Here it would mean running a VM from an ISO that has docker, running the agent from the container inside that livecd, and then rebooting to check the state.
Where would we get a livecd that already has docker running and accepts SSH with the same credentials as Kairos?
IMHO, this should move to the agent directly.
I think peg allows us to use a different user and password to SSH into the VM. The question is where we find the non-Kairos ISO. I had a similar problem while trying to implement this: https://github.com/kairos-io/kairos/issues/2182
I tried to find a very small livecd that I could even check in git but they were either too big or not working for other reasons.
If you think installing to a loop device is going to work, let's do that.
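As a starting point for the loop-device idea, the throwaway disk could be prepared along these lines. This is a sketch under assumptions: the function names are made up, the commands are plain `truncate` and `losetup` from coreutils/util-linux, and whether the agent accepts a loop device as an install target still needs to be verified.

```shell
#!/bin/sh
# Sketch only: prepares a throwaway disk for an install test.

# Creates a sparse image file of the given size (e.g. 20G) without allocating blocks.
make_sparse_disk() {
    path="$1"
    size="$2"
    truncate -s "$size" "$path"
}

# Attaches the image to the first free loop device and prints its name (needs root).
# --partscan asks the kernel to expose partitions created later on the device.
attach_loop() {
    losetup --find --show --partscan "$1"
}

# Detaches a loop device previously returned by attach_loop (needs root).
detach_loop() {
    losetup --detach "$1"
}
```

A test could then run `dev=$(attach_loop disk.img)`, pass `$dev` as the `--device` argument of `kairos-agent manual-install`, inspect the resulting partitions, and finish with `detach_loop "$dev"`.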