
[WIP] Enable Rockchip arch in kernel

Open sambonbonne opened this issue 10 months ago • 51 comments

Set CONFIG_ARCH_ROCKCHIP

After some discussion on Matrix about Odroid M1S (based on RK3566), I made this PR to enable Rockchip arch in the kernel.

The goal is to test the generated aarch64 image on my hardware and to check that the initrd size does not increase too much before discussing the possible inclusion of this configuration in Flatcar.

How to use

Installing the image on the Odroid M1S requires U-Boot binaries and multiple steps. I will add testing commands if it works on my hardware and someone wants to try it on real hardware.

Testing done

No testing for now (the image does not boot on Odroid M1S).

  • [ ] Changelog entries added in the respective changelog/ directory (user-facing change, bug fix, security fix, update)
  • [ ] Inspected CI output for image differences: /boot and /usr size, packages, list files for any missing binaries, kernel modules, config files, kernel modules, etc.

sambonbonne avatar Dec 31 '24 15:12 sambonbonne

@sambonbonne the image was built, you can download it and try it from the github actions artifacts page. https://github.com/flatcar/scripts/actions/runs/12575139169?pr=2556

ader1990 avatar Jan 03 '25 10:01 ader1990

The vmlinuz image increase is pretty big, about 3MB, which is more than the space allowed for the updates to work; that's why some of the tests failed. But as a PoC, you can first try the resulting image and see if it works.

ader1990 avatar Jan 03 '25 10:01 ader1990

@ader1990 thank you! I had some problems when trying to edit the partition layout of my SD card with fdisk on Thursday (for U-Boot) and I couldn't try to fix it this weekend, but I may have time to work on it on Monday or Tuesday. I already know what I can try to fix my partitioning problem, so it should not take too long.

I understand 3MB is too big, unfortunately I don't know enough about "kernel things" to help on this so if I manage to have a working image, I hope we will be able to reduce the added size or find an alternative.

sambonbonne avatar Jan 05 '25 10:01 sambonbonne

@ader1990 I'm trying to use flatcar-install /path/to/flatcar_production_image.bin but I have some errors, IDK if I did something wrong. Here are the logs:

$ flatcar-install -d /dev/sdb -B arm64-usr -i /path/to/ignition.json -f /path/to/flatcar_production_image.bin -u
Using existing image: /path/to/flatcar_production_image.bin
Writing /path/to/flatcar_production_image.bin...
Running in chroot, ignoring request.
Running in chroot, ignoring request.
mount: /tmp/flatcar-install.77HskzhsAm/oemfs: WARNING: source write-protected, mounted read-only.
Installing Ignition config /path/to/ignition.json...
cp: cannot create regular file '/tmp/flatcar-install.77HskzhsAm/oemfs/config.ign': Read-only file system
Error: return code 1 from [[ -n "${IGNITION}" ]]
/dev/sdb: 8 bytes were erased at offset 0x00000200 (gpt): 45 46 49 20 50 41 52 54
/dev/sdb: 2 bytes were erased at offset 0x000001fe (PMBR): 55 aa
/dev/sdb: calling ioctl to re-read partition table: Success

If you have any idea to help me, I will be very happy!

sambonbonne avatar Jan 06 '25 17:01 sambonbonne

@ader1990 I managed to fix my problem and to use flatcar-install with the generated image.

But it seems the first partition, the EFI one, is not mountable when installed so I could not write the boot.scr file. I mounted the partition of the image file to copy the boot.scr before running flatcar-install so I guess it's a workaround but I find it suspicious that the partition is not mountable after dd.

Anyway, the image does not seem to boot so I added CONFIG_ARCH_MULTI_V7 as you recommended but I can't run the pipeline myself. I tried to build using the SDK container but I have some problems when running emerge and I don't have the knowledge to fix them.

Can you launch the pipeline so I can try a new image with the added kernel parameter? Otherwise, should I ask for help with the SDK container (if yes, where? The Matrix channel?)? Thanks in advance!

Just FYI, here is the boot.txt file I use to generate the boot.scr:

load ${devtype} ${devnum}:1 ${kernel_addr_r} /EFI/boot/bootaa64.efi
bootefi ${kernel_addr_r}

sambonbonne avatar Jan 12 '25 17:01 sambonbonne

@sambonbonne had to solve some conflicts, I did trigger a new build and you should be able to download the image artifact in a few hours, if all goes well.

ader1990 avatar Jan 13 '25 09:01 ader1990

@ader1990 thanks! I hope the new image will boot. I'll give it a try when I'm able to.

sambonbonne avatar Jan 13 '25 15:01 sambonbonne

@ader1990 it seems I cannot set CONFIG_ARCH_MULTI_V7, the pipeline for ARM64 failed with:

ERROR: sys-kernel/coreos-modules-6.12.0::coreos-overlay failed (configure phase):
  Requested options not enabled in build:
    CONFIG_ARCH_MULTI_V7

See https://github.com/flatcar/scripts/actions/runs/12743950372/job/35527535677#step:7:4656.

So I guess I can't enable the CONFIG_ARCH_MULTI_V7 option?

sambonbonne avatar Jan 13 '25 16:01 sambonbonne

> @ader1990 it seems I cannot set CONFIG_ARCH_MULTI_V7, the pipeline for ARM64 failed with:
>
> ERROR: sys-kernel/coreos-modules-6.12.0::coreos-overlay failed (configure phase):
>   Requested options not enabled in build:
>     CONFIG_ARCH_MULTI_V7
>
> See https://github.com/flatcar/scripts/actions/runs/12743950372/job/35527535677#step:7:4656.
>
> So I guess I can't enable the CONFIG_ARCH_MULTI_V7 option?

It seems that the newer 6.12 kernel does not need it anymore. You can push a new change and I can start the build.

For building the kernel properly with Flatcar, I have the following notes for https://www.flatcar.org/docs/latest/reference/developer-guides/sdk-modifying-flatcar/#getting-started:

./build_packages --board arm64-usr

# make sure the tmp is clean
sudo rm -rf /build/arm64-usr/var/tmp/portage/sys-kernel*

# if the kernel sources have been changed
emerge-arm64-usr sys-kernel/coreos-sources

# if the kernel config or patches have changed
emerge-arm64-usr sys-kernel/coreos-modules

# if the bootengine commit id has changed
emerge-arm64-usr sys-kernel/bootengine

# if the bootengine commit id has changed
sudo rm /build/arm64-usr/usr/share/bootengine/bootengine.cpio
emerge-arm64-usr sys-kernel/coreos-kernel

# do a build packages to make sure
./build_packages --board arm64-usr

# follow the official docs
# https://www.flatcar.org/docs/latest/reference/developer-guides/sdk-modifying-flatcar/#getting-started
# do build_image
# do image_to_vm

ader1990 avatar Jan 15 '25 19:01 ader1990

Hello @ader1990 and thanks for those details!

I pushed a commit to remove the CONFIG_ARCH_MULTI_V7. I also got the error when trying to build locally.

Speaking of which, I tried to build again after removing this config: I enter the SDK container with ./run_sdk_container -a arm64 -t and run ./build_packages --board arm64-usr, but I get another error and I don't understand how it's possible, as I use the SDK container:

sys-kernel/coreos-modules-6.12.0 is missing libraries:
	x86_64: libcrypto.so.3
WARNING build_packages: test_image_content: Failed dependency check
WARNING build_packages: This may be the result of having a long-lived SDK with binary
WARNING build_packages: packages that predate portage 2.2.18. If this is the case try:
    emerge-arm64-usr -agkuDN --rebuilt-binaries=y -j9  @world
    emerge-arm64-usr -a --depclean

I will try to run both emerge commands and rebuild, but if you think of something else, feel free to tell me. Edit: I ran both commands; both succeeded but did not seem to change anything, and the build fails with the same error.

I hope I don't ask for too much with all my questions, to be honest this is my first time building an entire distro, it's challenging and very instructive.

sambonbonne avatar Jan 18 '25 16:01 sambonbonne

What usually happens when running build_packages is that the actual error is a little further up in the logs, so you might need to tee those logs into a file to search for it more easily: ./build_packages --board arm64-usr 2>&1 | tee -a build_packages.log.
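As a rough sketch of that search step (not part of the Flatcar tooling), a small helper can list the first error-looking lines of the saved log with their line numbers. The function name first_build_errors and the two patterns are assumptions chosen to match the error messages quoted earlier in this thread; other failures may need different patterns.

```shell
#!/bin/sh
# Hypothetical helper: print the first lines of a saved build log that look
# like errors, prefixed with their line numbers.
# Usage: first_build_errors build_packages.log [max_matches]
first_build_errors() {
    log_file=$1
    max_matches=${2:-5}
    # -n prints line numbers, -m stops after that many matching lines,
    # -E enables extended regular expressions.
    grep -n -m "$max_matches" -E 'ERROR:|failed \(' "$log_file"
}
```

Once the first match is known, opening the log at that line number usually shows the failing package and phase.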

When I have errors with the build process, I usually start with a very clean environment from scratch, as there might be leftovers or errors introduced by multiple builds. Being a dockerized environment, it is usually easy to create a new env, just remove the cloned repository, do a docker rm of the dangling containers and images, do a docker system prune for safety (of course, make sure you are not using that env for other work), and start over. Always start with a new cloned repo of flatcar/scripts, otherwise you can do a git reset, git clean -fxd, git rebase on flatcar/scripts main branch, and start the process from step 1: ./run_sdk_container -t.

ader1990 avatar Jan 20 '25 07:01 ader1990

@sambonbonne image was built here: https://github.com/flatcar/scripts/actions/runs/12845716694

ader1990 avatar Jan 20 '25 15:01 ader1990

@ader1990 just wanted to tell you I'm still investigating the boot problem.

I tried multiple boot.txt files, even one that directly boots vmlinuz-a (with the bootz command), and even that doesn't work, so I think the problem is in U-Boot (the difficult part being that I don't have a UART cable, so I can't see the U-Boot logs).

Beside this, I managed to boot MicroOS with this simple boot.txt (using mkimage to make a boot.scr of course):

btrload ${devtype} ${devnum}:${bootpart} ${ramdisk_addr_r} /boot/grub2/arm64-efi/kernel.img
bootm ${ramdisk_addr_r}  

So I know U-Boot is capable of booting a working OS, I just don't know why it doesn't boot Flatcar.

Edit: I just decided to order a serial cable to see if U-Boot logs messages to the UART port. It may take some time to arrive but I still want to work on this.

sambonbonne avatar Jan 26 '25 13:01 sambonbonne

Latest push is due to me rebasing this branch from ader1990/linux_kernel_6_10 and removing the commits which added then removed CONFIG_ARCH_MULTI_V7. No need to rebuild for now.

sambonbonne avatar Jan 26 '25 15:01 sambonbonne

Good news: I got my cable and I already have some findings.

Bad news: right now, it's still complicated to find why Flatcar doesn't boot.

I can see the Grub menu through the serial port but after booting Flatcar, even after adding a debug parameter, all I get is:

Booting a command list

EFI stub: Booting Linux Kernel...
EFI stub: EFI_RNG_PROTOCOL unavailable
EFI stub: Using DTB from configuration table
EFI stub: Exiting boot services...

I also tried to boot the A partition directly (still from Grub menu) with the debug parameter, nothing more than the above logs.

I think I'm missing a kernel configuration option and I'm looking for it. Fortunately, Home Assistant provides a working image for the Odroid M1S, so I will try to find out what they use. I hope I'll be able to build locally this time; this would avoid running multiple pipelines just to find the missing parameters. I guess this will increase the initrd size, but let's find the config first, then we'll see.

sambonbonne avatar Feb 02 '25 14:02 sambonbonne

I see that the support has been there since 6.12 https://github.com/torvalds/linux/commit/10dc64fe0f980c47c7e747885ddf7a8c12780337, so it should work. For the debug logs, I would suggest adding more kernel command-line parameters, mainly console=ttyAMA0 console=ttyAMA1 console=ttyS0 console=ttyS1; maybe you can get some more information that way. Also, it would be helpful to share a kernel config file from an image that works, to cross-compare and see what might be missing.

Thanks.
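One possible way to do that cross-comparison, as a minimal sketch: list the option names that are set in a known-working config but not set in the Flatcar one. The function names and the example file names (working.config, flatcar.config) are hypothetical, and the sketch compares option names only, so it will not report options that are set in both files with different values (for example =y versus =m).

```shell
#!/bin/sh
# Hypothetical helper: print the names of all options that are set
# (=y, =m, or any value) in a kernel config file, sorted and de-duplicated.
config_names() {
    grep -E '^CONFIG_[A-Za-z0-9_]+=' "$1" | cut -d= -f1 | sort -u
}

# Hypothetical helper: print options set in the reference config ($1)
# that are not set in the target config ($2).
missing_options() {
    ref_names=$(mktemp)
    tgt_names=$(mktemp)
    config_names "$1" > "$ref_names"
    config_names "$2" > "$tgt_names"
    # comm -23 keeps lines that appear only in the first (sorted) file.
    comm -23 "$ref_names" "$tgt_names"
    rm -f "$ref_names" "$tgt_names"
}

# Example (hypothetical file names):
#   missing_options working.config flatcar.config
```

The output is only a list of candidates: many options in a board vendor's config are unrelated to booting, so it still has to be narrowed down by hand.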

ader1990 avatar Feb 12 '25 09:02 ader1990

My comment is a bit long so I tried to structure it in two parts.

Debug

I tried different parameters for debug logs: the four console=… you gave and console=tty0 (this last was tried because I found it in the boot partition of the HAOS image): I get nothing. I tried in the default entry and in the A partition entry.

Maybe I'm doing it wrong, but it's not my first time adding a temporary kernel parameter from Grub: I use cu as a serial console, I see some Odroid boot information, then I get the Grub menu. I edit the first entry, append the parameter at the end of the line starting with linux and ending with $linux_cmdline, and hit Ctrl-x to boot.

Kernel configs

As a theoretically working kernel config, I have two sources for Odroid M1S:

  • Gentoo wiki: https://wiki.gentoo.org/wiki/Hardkernel_ODROID-M1S#Kernel_building
  • Home Assistant: https://www.home-assistant.io/installation/odroid/#flashing-home-assistant-m1s

Gentoo wiki configs try

I tried to add configs from the Gentoo wiki but that's where I got some build errors when running build_packages from my machine. I pulled my branch since you rebased it on main (thanks for that) but now I have a new build error, from podman. Podman's build log (in /build/arm64-usr/var/log/portage/app-containers:podman-5.3.0:20250212-102732.log) just ends with this:

/build/arm64-usr/var/log/portage/app-containers:podman-5.3.0:20250212-102732.log
cd .
GOROOT='/usr/lib/go' /usr/lib/go/pkg/tool/linux_amd64/link -o $WORK/b001/exe/a.out -importcfg $WORK/b001/importcfg.link -installsuffix shared -X=runtime.godebugDefault=asynctimerchan=1,gotypesalias=0,httpservecontentkeepheaders=1,tls3des=1,tlskyber=0,x509keypairleaf=0,x509negativeserial=1 -buildmode=pie -buildid=Di7VbJOVEz0cCecPSPVh/lIE-3YL_RKIYFm-tCLNS/g5oi0ADpMmXJOtBy8UrJ/Di7VbJOVEz0cCecPSPVh -X github.com/containers/podman/v5/libpod/define.buildInfo=1739356126 -X github.com/containers/podman/v5/libpod/config._installPrefix=/usr -X github.com/containers/podman/v5/libpod/config._etcDir=/etc -X github.com/containers/podman/v5/pkg/systemd/quadlet._binDir=/usr/bin -X github.com/containers/common/pkg/config.additionalHelperBinariesDir= -extld=aarch64-cros-linux-gnu-gcc $WORK/b001/_pkg_.a
/usr/lib/go/pkg/tool/linux_amd64/buildid -w $WORK/b001/exe/a.out # internal
mkdir -p bin/
cp $WORK/b001/exe/a.out bin/podman-testing
/usr/lib/go/pkg/tool/linux_amd64/buildid -w $WORK/b001/exe/a.out # internal
mkdir -p bin/
cp $WORK/b001/exe/a.out bin/podman
test -z "" || chcon -t container_runtime_exec_t bin/podman

I built from a fresh environment (new clone, and all Flatcar containers and images removed) so I don't understand how I can still face this kind of error (I run ./build_packages --board arm64-usr from the SDK container, started with ./run_sdk_container -a arm64 -t).

Is it possible to build a smaller set just to try the boot, without Podman for example?

Home Assistant OS info

For HAOS (short for Home Assistant OS), it bootloops when installed on an SD card, so I may have to install it directly on eMMC; then maybe I can find a way to copy /proc/config.gz and check whether there are useful kernel parameters. But even if I find other kernel parameters to add, I may face build errors.

sambonbonne avatar Feb 12 '25 14:02 sambonbonne

I managed to build with more parameters but still no luck. I created a PR on my repo for that and started a self-hosted runner with the required labels.

Build working but image still not booting: https://github.com/sambonbonne/flatcar-scripts/pull/2/commits/1eb4d0b96552bfd1c0eb28fa9da5af206e20a7aa

Build failing: https://github.com/sambonbonne/flatcar-scripts/pull/2/commits/98c0da99ea7de05a773b806d9d5389eb1e127c6a (build log: https://github.com/sambonbonne/flatcar-scripts/actions/runs/13437246231/job/37542450165)

My next try, when I have the time, will be to run the HAOS image directly from eMMC (instead of SD) and see if the kernel config is available. :crossed_fingers:

sambonbonne avatar Feb 20 '25 15:02 sambonbonne

Just to let you know that we do want to support this eventually. I'm hard at work on addressing the lack of space in /boot, which we need to fix before we can enable Rockchip support.

chewi avatar Feb 20 '25 15:02 chewi

@chewi I'm happy to know you want to support this when it's possible, I would have been very frustrated if I did this die-and-retry process for nothing :grin:

But I understand the priority is on /boot space; Adrian explained the problem and I perfectly understand. This leaves me some time to try different things in this PR.

sambonbonne avatar Feb 20 '25 16:02 sambonbonne

Good news: I managed to boot HAOS, dump the /proc/config.gz and, of course, gunzip it. The text file is 10760 lines long (with comments and spacing but still, I guess this is the expected amount of lines for a Linux kernel).

Bad news: they are on kernel 6.6.73, so I guess they use some patches to make the Odroid M1S work.

I'll try to build my branch with their config file anyway in a few days.

Does anyone know if there is a way to check if a config has been renamed between two kernel versions (especially from 6.6 to 6.12)?

sambonbonne avatar Feb 20 '25 16:02 sambonbonne

Do you mean config options changing name? I have a git clone of the kernel, so I'd check the history, but make oldconfig is very useful too. It'll ask about options that aren't in your config.
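As a small sketch of the "check the kernel tree" approach, assuming a local clone of the Linux kernel checked out at the target version: an option that no longer has a definition in any Kconfig file was probably renamed or removed. The helper name kconfig_defines is hypothetical, and the search only matches plain "config NAME" and "menuconfig NAME" lines.

```shell
#!/bin/sh
# Hypothetical helper: list the Kconfig files in a kernel source tree ($1)
# that define the option named $2 (given without the CONFIG_ prefix).
# Prints nothing if the option is not defined anywhere in the tree.
kconfig_defines() {
    # -r recurses, -l prints only file names, --include limits the search
    # to files whose names start with "Kconfig".
    grep -rlE --include='Kconfig*' "^(menu)?config $2\$" "$1"
}

# Example (hypothetical path to a kernel clone):
#   kconfig_defines ~/src/linux ARCH_ROCKCHIP
```

For an option that turns out to be missing, searching the history of the Kconfig files in the same clone (for example with git log -S) can show the commit that renamed or dropped it.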

chewi avatar Feb 20 '25 16:02 chewi

No CI required, I just rebased my branch so this will still produce a non-working image.

I didn't work on this for three weeks but I'll try to find more time soon.

sambonbonne avatar Mar 16 '25 13:03 sambonbonne

So, today I discovered earlycon and keep_bootcon Linux command line parameters (yeah, I'm that noob) and it helped a lot.

First the kernel stops with a clk: Disabling unused clocks problem, I guess this may come from my DTB file but I'm not sure. I can't manage to boot without passing a DTB file so I guess I have to do something about this but as a workaround, I can pass a clk_ignore_unused parameter for now. Or find a way to boot without DTB.

Then I got FATAL: iscsiroot requested but kernel/initrd does not support iscsi, followed by Refusing to continue then shutdown.

I see CONFIG_ISCSI_IBFT=y and CONFIG_ISCSI_IBFT_FIND=y in amd64_defconfig-6.12 so right now I'm building Flatcar again after adding the same configs in arm64_defconfig-6.12 to see if it boots.

I hope I'll be able to fix the clocks problem but what a relief to get to that point!

sambonbonne avatar May 06 '25 16:05 sambonbonne

Ah, good idea, should have thought of that. Those messages might be red herrings though. iscsi support certainly isn't mandatory.

chewi avatar May 07 '25 08:05 chewi

Well, the added configs did not change anything, but by reading more carefully I found this in the logs:

systemd[1]: systemd-modules-load.service: Main process exited, code=exited, status=1/FAILURE
systemd[1]: systemd-modules-load.service: Failed with result 'exit-code'.
systemd[1]: Failed to start systemd-modules-load.service - Load Kernel Modules.

It happens after the clock problem (when ignored with clk_ignore_unused) but just before the ISCSI problem, which makes me think the ISCSI failure is just a consequence of not being able to load kernel modules.

@chewi should I wait for the new Dracut version before retrying, or won't it change anything? The ISCSI failure logs come from Dracut, but as systemd-modules-load.service fails in the first place, I think it's more a systemd problem than a Dracut problem (but as I don't know a lot about kernel/systemd/dracut, I would be happy if the answer is "it's Dracut").

sambonbonne avatar May 07 '25 12:05 sambonbonne

Ah, so you are getting further than before. I thought it was still freezing before it even got to the initrd. Are you not able to get an emergency shell at this point? Then you could find out exactly why systemd-modules-load failed. Hard to guess why. Could be certain modules missing, the whole module directory being wrong, or some module just failing to load.

I doubt the new Dracut will help here. If you're building from recent master, then you should have it already anyway. Otherwise, it will be in the next Alpha, which should be out very soon.

chewi avatar May 07 '25 12:05 chewi

> Ah, so you are getting further than before. I thought it was still freezing before it even got to the initrd.

I'm getting further when I boot the vmlinuz-a image directly, but with Grub I still have no luck, unfortunately. So right now I do my tests by booting the vmlinuz-a image directly.

> Are you not able to get an emergency shell at this point?

How can I try? It fails and stops, but maybe I could try some kernel parameter?

I'm not sure it would work because everything is printed thanks to the earlyprintk and keep_bootcon parameters, so I'm not sure I could get something interactive. If I remove keep_bootcon, I get no console.

> If you're building from recent master, …

Right now, I'm still based on the 6.12 kernel branch (as Odroid M1S mainline support was only added in Linux 6.12) and I haven't rebased in a long time. I wanted to wait for the 6.12 merge before rebasing on it on a more regular basis.

sambonbonne avatar May 07 '25 13:05 sambonbonne

> How can I try? It fails and stops, but maybe I could try some kernel parameter?
>
> I'm not sure it would work because everything is printed thanks to the earlyprintk and keep_bootcon parameters, so I'm not sure I could get something interactive. If I remove keep_bootcon, I get no console.

Normally, it should either finish booting and reach the bash prompt or drop you to an emergency shell. You might not be seeing the latter. Then again, you might not be seeing the former either. Maybe it's actually booting? You can force an emergency shell with the rd.break parameter. There are various points you can tell it to break at. See man dracut.cmdline.

> Right now, I'm still based on the 6.12 kernel branch (as Odroid M1S mainline support was only added in Linux 6.12) and I haven't rebased in a long time. I wanted to wait for the 6.12 merge before rebasing on it on a more regular basis.

Ah yes, 6.12 isn't quite merged yet. Possibly waiting on feedback from me, I'll get on that. The branch (currently f0479a54a608a19ee13b3bf93bd962916bc6afbd) does include the new Dracut though.
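To illustrate the rd.break suggestion above, here is a minimal sketch that appends the parameter to an existing kernel command line. The helper name with_rd_break is hypothetical, and the list of break points is written from memory of dracut.cmdline(7), so it is an assumption to check against the man page of the Dracut version in the image.

```shell
#!/bin/sh
# Hypothetical helper: append rd.break=<point> to a kernel command line.
# $1 is the existing command line, $2 the break point (default: pre-mount).
# The accepted names are assumed from dracut.cmdline(7).
with_rd_break() {
    cmdline=$1
    point=${2:-pre-mount}
    case $point in
        cmdline|pre-udev|pre-trigger|initqueue|pre-mount|mount|pre-pivot|cleanup)
            printf '%s rd.break=%s\n' "$cmdline" "$point"
            ;;
        *)
            echo "unknown break point: $point" >&2
            return 1
            ;;
    esac
}

# Example:
#   with_rd_break 'earlycon keep_bootcon clk_ignore_unused' pre-udev
```

The resulting string is what would be appended to the linux line in the Grub entry, or passed as boot arguments when booting vmlinuz-a directly.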

chewi avatar May 07 '25 13:05 chewi