kairos-agent 2.4.3 can not build rpi4 image

Open Ognian opened this issue 1 year ago • 17 comments

When building for rpi4 with the latest kairos-agent (2.4.3, via osbuilder:latest, i.e. v0.10.2), the MBR can't be written. After reverting osbuilder to v0.10.1, and therefore to kairos-agent 2.4.1, at least no error is shown. However, the rpi4 still can't boot and reports "Firmware not found" at the pre-EFI screen (grub is not reached). Any ideas?

Ognian avatar Dec 21 '23 16:12 Ognian

It's the version we are using on master:

https://github.com/kairos-io/kairos/blob/master/Earthfile#L24-L25

How are you running it?

mauromorales avatar Dec 21 '23 17:12 mauromorales

via :latest

docker run -v $PWD:/HERE -v /var/run/docker.sock:/var/run/docker.sock --privileged -i --rm --entrypoint=/build-arm-image.sh quay.io/kairos/osbuilder-tools:latest \
 --model rpi4 \
 --state-partition-size 6200 \
 --recovery-partition-size 4200 \
 --size 15200 \
 --images-size 2000 \
 --local \
 --config /HERE/cloud-config.yaml \
 --efi-dir /HERE/boot \
 --docker-image kairos-rpi-ogi /HERE/build/out.img

This is currently the only way I can build an image on the Mac...

Ognian avatar Dec 21 '23 18:12 Ognian

Now I'm using v0.10.1 instead of :latest, but this is the result: [two screenshots, not preserved]

Ognian avatar Dec 21 '23 19:12 Ognian

Further, I noticed: [screenshot "Pasted Graphic", not preserved] and fixed it via: [screenshot "Pasted Graphic 1", not preserved], BUT it doesn't help. It looks like, besides the problem that the disk is not properly formatted, there is another problem with the MBR for rpi or so...

Ognian avatar Dec 22 '23 09:12 Ognian

the first partition (EFI) is empty...

Ognian avatar Dec 22 '23 09:12 Ognian

OK, I just tested with the prebuilt rpi4 image:

export IMAGE=quay.io/kairos/opensuse:leap-15.5-standard-arm64-rpi4-v2.4.3-k3sv1.28.2-k3s1-img
docker run -ti --rm -v $PWD:/image quay.io/luet/base util unpack "$IMAGE" /image

The partition table is also screwed up (check with fdisk -l) and has to be fixed on a Linux machine via parted /dev/sda unit s print, BUT the first partition is properly populated and therefore the rpi can boot.

So there are 2 problems:

  • there is a difference between how the image is built in CI and how the custom image is built via build-arm-image.sh
  • there is a common problem in how the partition table is written

Hope this helps to narrow it down.

Ognian avatar Dec 22 '23 12:12 Ognian
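
For anyone trying to narrow down the partition table problem before flashing, one rough check is whether the built image carries an MBR boot signature at all (the two bytes 0x55 0xAA at offset 510). This is only a sketch: the function name has_mbr_signature is made up for illustration, and it is not part of osbuilder or kairos-agent.

```shell
# Hypothetical helper: succeeds if the file or device given as $1 has the
# MBR boot signature (bytes 0x55 0xAA) at offset 510 of its first sector.
has_mbr_signature() {
  sig=$(dd if="$1" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
  [ "$sig" = "55aa" ]
}
```

Usage would be something like: has_mbr_signature build/out.img && echo "signature present". A missing signature would be consistent with the MBR not being written; a present one does not prove that the partition table itself is valid, so fdisk -l is still worth running.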

I'm trying the command @Ognian is using with some latest image:

docker run -v $PWD:/work -v /var/run/docker.sock:/var/run/docker.sock --privileged -i --rm --entrypoint=/build-arm-image.sh quay.io/kairos/osbuilder-tools:latest  --model rpi4  --state-partition-size 6200  --recovery-partition-size 4200  --size 15200  --images-size 2000    --config /work/cloud-config.yaml  --efi-dir /work/boot  --docker-image quay.io/kairos/ubuntu:23.10-arm64-rpi4-master  /work/build/out.img

and I'm getting this error:

...
+ cp -rfv /tmp/arm-builder.t7U3Muskbc/etc/kairos/branding/grubmenu.cfg /tmp/grubmeny.cfg.5gvXju
cp: cannot stat '/tmp/arm-builder.t7U3Muskbc/etc/kairos/branding/grubmenu.cfg': No such file or directory
...

originating here: https://github.com/kairos-io/osbuilder/blob/e4482ddc088e8141a705c9c24f86775abef0a6db/tools-image/build-arm-image.sh#L304

jimmykarily avatar Jan 02 '24 14:01 jimmykarily

I'm stupid, that's a "generic" image I pulled :facepalm: . Trying again...

jimmykarily avatar Jan 02 '24 14:01 jimmykarily

ok this one succeeds:

docker run -v $PWD:/work -v /var/run/docker.sock:/var/run/docker.sock --privileged -i --rm --entrypoint=/build-arm-image.sh quay.io/kairos/osbuilder-tools:latest  --model rpi4  --state-partition-size 6200  --recovery-partition-size 4200  --size 15200  --images-size 2000    --config /work/config.yaml    --docker-image quay.io/kairos/ubuntu:23.04-standard-arm64-rpi4-v2.4.3-k3sv1.27.6-k3s1  /work/build/out.img

I had to remove --efi-dir because I don't know what @Ognian has in --efi-dir /HERE/boot. I'll try to find what this directory is supposed to have.

jimmykarily avatar Jan 02 '24 14:01 jimmykarily

I used the contents of this zip file: https://github.com/pftf/RPi4/releases/tag/v1.35 as my --efi-dir and I built an image with this command:

docker run -v $PWD:/work -v /var/run/docker.sock:/var/run/docker.sock --privileged -i --rm --entrypoint=/build-arm-image.sh quay.io/kairos/osbuilder-tools:latest  --model rpi4  --state-partition-size 6200  --recovery-partition-size 4200  --size 15200  --images-size 2000    --config /work/config.yaml  --efi-dir /work/boot  --docker-image quay.io/kairos/ubuntu:23.04-standard-arm64-rpi4-v2.4.3-k3sv1.27.6-k3s1  /work/build/out.img

Then I flashed the image with:

cat build/out.img | sudo dd of=/dev/sda oflag=sync status=progress bs=10MB

and booted a raspberry pi 4 with the card. I got /sys/firmware/efi populated (means I'm booted with efi). I don't get any warnings when I do fdisk -l /dev/mmcblk0 . Everything seems to work fine (k3s also starts).

@Ognian are you not booting from an sdcard? Why does yours appear as /dev/sda ? Also, what are the contents of your --efi-dir ?

jimmykarily avatar Jan 02 '24 15:01 jimmykarily
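
The EFI check mentioned above (whether /sys/firmware/efi is populated) can be wrapped in a small function when comparing several boards or disks. A minimal sketch; the function name boot_mode and its optional sysfs-root argument are illustrative assumptions, not something from Kairos.

```shell
# Hypothetical helper: print "efi" if the running system exposes the EFI
# directory in sysfs, "non-efi" otherwise.
# The optional argument overrides the sysfs root (default /sys), for testing.
boot_mode() {
  sysfs="${1:-/sys}"
  if [ -d "$sysfs/firmware/efi" ]; then
    echo "efi"
  else
    echo "non-efi"
  fi
}
```

Running boot_mode on the booted rpi4 should print "efi" when the board came up through the UEFI firmware.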

@jimmykarily if memory serves, @Ognian uses an external SSD, hence the /dev/sda

mauromorales avatar Jan 02 '24 15:01 mauromorales

@jimmykarily yes, I'm booting from an external USB3 disk (SSD). The --efi-dir contains only the file extraconfig.txt, which extends the rpi hardware config.txt by setting additional HW parameters, overclocking etc... In previous releases the efi dir was populated with its proper content, so I only had to copy the "additional" things I needed; that's why I'm using only the extraconfig.txt there. I'm using the opensuse:leap standard image, but I also get the grub copy error.

Ognian avatar Jan 03 '24 09:01 Ognian
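
Since the two --efi-dir setups in this thread differ (only extraconfig.txt versus the full pftf/RPi4 zip contents), a quick way to compare them is to list which firmware files a directory lacks. This is a sketch under assumptions: the file list below is taken from the pftf/RPi4 release zip and may not match what build-arm-image.sh actually expects, and the function name missing_efi_files is made up.

```shell
# Hypothetical helper: print each expected firmware file that is missing
# from the directory given as $1 (prints nothing if all are present).
# ASSUMPTION: the file list mirrors the pftf/RPi4 release zip; adjust as needed.
missing_efi_files() {
  dir="$1"
  for f in RPI_EFI.fd start4.elf fixup4.dat config.txt; do
    [ -f "$dir/$f" ] || echo "$f"
  done
}
```

For example, missing_efi_files ./boot on a directory holding only extraconfig.txt would list all four names, which may be related to the empty first partition and the "Firmware not found" message if the builder no longer populates those files itself.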

I tried the same command (see above) with this image: quay.io/kairos/opensuse:leap-15.5-standard-arm64-rpi4-v2.4.3-k3sv1.27.6-k3s1 and it still works with no errors.

Our differences are:

  • the contents of the --efi-dir (to be honest, I'm not even sure the files I put there are what is expected)
  • the image itself (You are using a custom build? I'm using the released one)
  • the SSD (I'm using the sdcard and I don't have a disk to try)

Maybe this helps narrow it down.

jimmykarily avatar Jan 03 '24 13:01 jimmykarily

Looking again, I see you had the fdisk -l problem also with the released image. Then the problem must be on the external disk somehow, because the rest of the setup is similar to mine.

jimmykarily avatar Jan 04 '24 06:01 jimmykarily

I've ordered an nvme disk and a USB enclosure; it should be here next week. Unless someone else beats me to it, I will try again as soon as it's here.

jimmykarily avatar Jan 04 '24 08:01 jimmykarily

@jimmykarily if I follow your process, I can reproduce the issue when trying to boot from an external SSD

mauromorales avatar Jan 05 '24 09:01 mauromorales

I had no issues booting the resulting images built with earthly for the 2.5.0 via SSD. I wonder if the issue is not related to those efi artifacts?

@jimmykarily @Ognian

mauromorales avatar Jan 15 '24 09:01 mauromorales