The image is built with the kernel from previous builds.

Open ilyakurdyukov opened this issue 1 year ago • 9 comments

What happened?

I am working from this commit from October 11th and have not updated yet, but I did not see any fixes for this problem in later updates.

I rebuilt the Armbian image after changing the kernel config (the options for compile.sh were the same). The build script first rebuilds the kernel with the specified config (which I inferred from the object files left after the build), but the output image contains one of the kernels I built earlier with a different config.

How to reproduce?

I haven't tried to reproduce it on purpose yet (it may take hours), but it has happened at least twice already. Unfortunately, the old cache is gone: I deleted it to get a correct build. I need advice on what to look for (and report) if this happens again, and on where this bug could be. When building an image, does the build pick one of the old kernels from the cache instead of the new one?
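
One thing that could be recorded before deleting the cache next time is a checksum listing of the build outputs, so that two runs can be compared afterwards. This is only a sketch: the function name `snapshot_artifacts` is hypothetical and not part of the Armbian build scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of armbian/build): prints one
# "<sha256>  <path>" line per regular file under the given directory,
# sorted by path so that two listings can be compared with diff.
snapshot_artifacts() {
	local dir="$1"
	find "$dir" -type f -exec sha256sum {} + | sort -k2
}

# Example use, before and after a rebuild:
#   snapshot_artifacts output/debs > debs-before.txt
#   snapshot_artifacts output/debs > debs-after.txt
#   diff debs-before.txt debs-after.txt
```

A file whose name stays the same while its checksum changes (or the other way around) would be a useful detail for the report.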

Branch

main (main development branch)

On which host OS are you observing this problem?

Jammy

Relevant log URL

No response

ilyakurdyukov avatar Oct 30 '23 04:10 ilyakurdyukov

Jira ticket: AR-1903

github-actions[bot] avatar Oct 30 '23 04:10 github-actions[bot]

I rebuilt it with a changed kernel config and got an image with yesterday's config (which is different from today's). So now I have the caches from a run where the bug happened.

ilyakurdyukov avatar Oct 30 '23 05:10 ilyakurdyukov

  • output/config/linux-rk35xx-legacy.config: today's config.
  • output/packages-hashed/kernel-rk35xx-legacy_5.10.160-S0d05-D0efe-P0000-C999999Hfe66-HK01ba-Vc222-Bf115-R448a_arm64.tar: contains today's kernel.
  • output/packages-hashed/global/linux-image-legacy-rk35xx_5.10.160-S0d05-D0efe-P0000-C999999Hfe66-HK01ba-Vc222-Bf115-R448a_arm64.deb: also today's kernel.

  • output/images/Armbian_23.11.0-trunk_Hinlink-h28k_jammy_legacy_5.10.160_mate_desktop.img: the newly built image, but it contains yesterday's kernel and kernel config.
  • output/debs/linux-image-legacy-rk35xx_23.11.0-trunk_arm64__5.10.160-S0d05-D0efe-P0000-C999999Hfe66-HK01ba-Vc222-Bf115-R448a.deb: modified today, but contains yesterday's kernel and kernel config. The files inside are dated yesterday.

Is output/packages-hashed/kernel-<...>.tar supposed to be packaged into output/debs, with yesterday's files being used instead?
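
A quick way to confirm that the two packages really differ, sketched under the assumption that both .deb files are still on disk, is a byte-for-byte comparison. The helper name `same_content` is hypothetical and not part of the Armbian build scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of armbian/build): returns 0 when two files
# have byte-identical content, 1 when they differ, 2 when either is missing.
# File names and mtimes are ignored.
same_content() {
	local a="$1" b="$2"
	[[ -f "$a" && -f "$b" ]] || return 2
	cmp -s "$a" "$b"
}

# Example call (paths elided as in the comment above):
#   same_content output/packages-hashed/global/linux-image-<...>.deb \
#                output/debs/linux-image-<...>.deb \
#     && echo "identical" || echo "different or missing"
```

If the files differ, `dpkg-deb -c` on each package lists the packaged files with their timestamps, which shows which build the contents come from.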

ilyakurdyukov avatar Oct 30 '23 05:10 ilyakurdyukov

cache/rootfs/rootfs-arm64-jammy-mate-desktop_202310-0189c6ab0bb4-H01ba47-B3ce5fd.tar.zst - is dated yesterday, but perhaps this is normal if only the kernel has changed?

I'm guessing that to reproduce you need:

  1. Build an image (I don't think the options matter):

$ ./compile.sh build BOARD=hinlink-h28k BRANCH=legacy BUILD_DESKTOP=yes BUILD_MINIMAL=no DESKTOP_ENVIRONMENT=mate DESKTOP_ENVIRONMENT_CONFIG_NAME=config_base EXPERT=yes KERNEL_CONFIGURE=yes KERNEL_GIT=shallow RELEASE=jammy

(don't select additional packages)

  2. Change something in the config of the kernel you just built, and remember what you changed. The config for the kernel I use is config/kernel/linux-rk35xx-legacy.config.

  3. Move the result of the first build somewhere (output/Armbian_*).

  4. Rebuild the image:

$ ./compile.sh build BOARD=hinlink-h28k BRANCH=legacy BUILD_DESKTOP=yes BUILD_MINIMAL=no DESKTOP_ENVIRONMENT=mate DESKTOP_ENVIRONMENT_CONFIG_NAME=config_base EXPERT=yes KERNEL_CONFIGURE=yes KERNEL_GIT=shallow RELEASE=jammy

(don't select additional packages)

ilyakurdyukov avatar Oct 30 '23 05:10 ilyakurdyukov

--> (679) COMMAND: mkdir -p /mnt/build/build/output/debs
--> (679) COMMAND: touch /mnt/build/build/output/debs/linux-image-legacy-rk35xx_23.11.0-trunk_arm64__5.10.160-S0d05-D0efe-P0000-C999999Hfe66-HK01ba-Vc222-Bf115-R448a.deb

Did this code in artifacts-reversion.sh change the mtime of output/debs/linux-image-legacy<...>.deb because the script thinks the kernel is the same? And was this .deb (containing the old kernel) then used to build the image?

		# since the full versioned path includes the original hash, if the file already exists, we can trust
		# it's the correct one, and skip reversioning.
		# do touch the file, so mtime reflects it is wanted, and later delete old files to keep junk under control
		if [[ -f "${deb_versioned_full_path}" ]]; then
			display_alert "Skipping reversioning" "deb: ${deb_versioned_full_path} already exists" "debug"
			run_host_command_logged touch "${deb_versioned_full_path}"
			continue
		fi

ilyakurdyukov avatar Oct 30 '23 09:10 ilyakurdyukov

Maybe this problem appeared because config/kernel/linux-rk35xx-legacy.config.defconfig exists? I didn't change it, but is it used to calculate the hash?

ilyakurdyukov avatar Oct 30 '23 10:10 ilyakurdyukov

After removing config/kernel/linux-rk35xx-legacy.config.defconfig, this erroneous behavior seems to have stopped: after changing the config and rebuilding, I got the correct kernel in the image, not the old one.

Something is wrong with the scripts. Why is the kernel even built using .config if artifacts-reversion.sh then decides it's the same thing and the compilation results are not used?

ilyakurdyukov avatar Oct 30 '23 10:10 ilyakurdyukov

@ilyakurdyukov: read the warnings produced by the build, they help (sometimes). In this case: https://github.com/armbian/build/blob/main/lib/functions/main/default-build.sh#L14-L17

You should probably drop KERNEL_CONFIGURE=yes, which is kept only for old-timers' convenience/insistence. (I really intend to remove it completely.) Changing the kernel configuration during the image build causes inconsistencies (a part of the kernel hash will be fixed as 999999). Use the (separate) cli command kernel-config to change your config, commit your changes, then build.
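
A minimal sketch of why a fixed hash component can lead to a stale package being reused. This is an illustration only: `config_component` and `cache_hit` are hypothetical names, and the real Armbian hashing code is more involved.

```shell
#!/usr/bin/env bash
# Illustration only, not the real Armbian hashing code.
# Returns the config part of a version string: either a short hash of the
# config file, or the fixed placeholder when the config is edited during
# the build itself.
config_component() {
	local config_file="$1" configured_during_build="$2"
	if [[ "$configured_during_build" == "yes" ]]; then
		echo "C999999" # fixed value: edits to the config are not reflected
	else
		echo "C$(sha256sum "$config_file" | cut -c1-16)"
	fi
}

# Mirrors the "already exists, so skip" idea: the lookup depends only on
# the file name, never on the file's content.
cache_hit() {
	local cache_dir="$1" name="$2"
	[[ -f "${cache_dir}/${name}.deb" ]]
}
```

With the placeholder in effect, two different configs map to the same name, so a package left over from an earlier build satisfies `cache_hit` for the new one. With a hash derived from the committed config, the names differ and the old package is not picked up.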

The .defconfig is only ever produced when you run the kernel config (either through the cli command kernel-config or KERNEL_CONFIGURE=yes) -- it is never read or handled in any way by the build system.

Sorry you had to endure this; it comes partly from trying to be everything to everyone.

rpardini avatar Oct 31 '23 12:10 rpardini

Yes, I noticed this warning earlier ("It still works, but please prefer the new way."), but it says that it still works, so I didn’t change anything. Perhaps this warning should be made more threatening.

ilyakurdyukov avatar Oct 31 '23 12:10 ilyakurdyukov

@EvilOlaf @igorpecovnik for awareness. Can I remove KERNEL_CONFIGURE=yes before we cement it into eternity with new documentation? It's been a year+

rpardini avatar Mar 02 '24 17:03 rpardini

From a user's perspective I'd say no. From a developer's perspective, yes. Difficult call.

EvilOlaf avatar Mar 03 '24 05:03 EvilOlaf