DTS build is not reproducible
The problem you're addressing (if any)
The DTS build is not reproducible, i.e. each new build on the same machine, system and commit creates an image with a different hash.
Currently, even re-running only the last build step results in a different hash:
kas-container build meta-dts/kas.yml
sha256sum build/tmp/deploy/images/genericx86_64/dts-base-image-genericx86-64.wic.gz > 1.sha256
kas-container shell meta-dts/kas.yml -c "bitbake -c clean dts-base-image"
kas-container build meta-dts/kas.yml
sha256sum build/tmp/deploy/images/genericx86_64/dts-base-image-genericx86-64.wic.gz > 2.sha256
diff 1.sha256 2.sha256
1c1
< c8da775720687d2928bb79d05ab6982ff42e1cb3024c1f6884880955414294e6 dts-base-image-genericx86-64.wic.gz
---
> 490c28fe71c02a89c2c320a0436911f7ed18606924dc083010726f836b3adb96 dts-base-image-genericx86-64.wic.gz
Describe the solution you'd like
The built image should be identical every time it is built from the same commit.
Where is the value to a user, and who might that user be?
No response
Describe alternatives you've considered
No response
Additional context
https://docs.yoctoproject.org/test-manual/reproducible-builds.html https://wiki.yoctoproject.org/wiki/Reproducible_Builds https://reproducible-builds.org/
To check for differences you could compare the sha256 sums of all rpms (build/tmp/deploy/rpm).
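A minimal sketch of that rpm comparison, assuming two deploy directories kept from separate builds. The function names and the directory arguments are made up for this example and are not part of meta-dts:

```shell
#!/usr/bin/env bash
# Sketch: list RPMs whose SHA-256 differs between two deploy directories.
# hash_rpms and compare_rpm_hashes are hypothetical helper names.

# Print "<sha256>  <relative path>" for every .rpm under $1, sorted by path.
hash_rpms() {
    (cd "$1" && find . -type f -name '*.rpm' -exec sha256sum {} + | sort -k2)
}

# Print the differing hash lines; return non-zero when the two trees differ.
compare_rpm_hashes() {
    tmp_a=$(mktemp) || return 2
    tmp_b=$(mktemp) || return 2
    hash_rpms "$1" > "$tmp_a"
    hash_rpms "$2" > "$tmp_b"
    diff "$tmp_a" "$tmp_b"
    status=$?
    rm -f "$tmp_a" "$tmp_b"
    return "$status"
}
```

It could be called as `compare_rpm_hashes build/tmp/deploy/rpm build.old/tmp/deploy/rpm`. Packages that exist in only one tree show up as added or removed lines in the diff output.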
Another way is to compare spdx files, e.g. by using sbomdiff with this commit: https://github.com/iwanicki92/sbomdiff/commit/64c002d93fb89746e5b52d81863b55d60139c9d5
For example, a script to check all spdx files:
#!/usr/bin/env bash
# Compare every package SPDX file between two deploy/spdx trees with sbomdiff.

shopt -s nullglob

# Print the integer percentage of $1 out of $2 (100 when there is nothing to do).
get_percent_progress() {
    local i=$1
    local total=$2
    if [ "$total" = "0" ]; then
        echo "100"
        return 0
    fi
    echo $(((i*100)/total))
}

# Redraw the progress line in place.
print_progress() {
    local progress=$1
    local arch=$2
    local clear_line="\033[0K"
    echo -ne "\rProgress($arch): ${progress}%${clear_line}"
}

if [[ $# -ne 2 ]]; then
    echo "./compare_spdx.sh <path to spdxA> <path to spdxB>"
    exit 1
fi

spdxA=$1
spdxB=$2
architectures=( all allarch core2-64 genericx86_64 x86_64 )

for arch in "${architectures[@]}"; do
    arch_total="$(ls -1 "$spdxA/$arch/packages" 2>/dev/null | wc -l)"
    progress=0
    i=0  # reset the counter for each architecture, otherwise progress exceeds 100%
    print_progress "$(get_percent_progress 0 "$arch_total")" "$arch"
    for package in "$spdxA/$arch/packages/"*; do
        package=$(basename "$package")
        i=$((i+1))
        new_progress=$(get_percent_progress "$i" "$arch_total")
        if [ "$new_progress" -ne "$progress" ]; then
            progress=$new_progress
            print_progress "$progress" "$arch"
        fi
        # This relies on sbomdiff exiting non-zero when the two files differ.
        if ! differences=$(trap '' INT; sbomdiff --exclude-license --sbom spdx {"$spdxA","$spdxB"}/"$arch/packages/$package"); then
            echo ""
            echo "Difference in: $arch/packages/$package"
            echo "$differences"
        fi
    done
done
./sbom.sh build/tmp/deploy/spdx build.old/tmp/deploy/spdx
Progress(allarch): 54%
Difference in: allarch/packages/os-release.spdx.json
[CHANGED] usr/lib/os-release
Summary
-------
Version changes: 0
Removed packages: 0
New packages: 0
Changed files: 1
Removed files: 0
New files: 0
Progress(core2-64): 1%
Difference in: core2-64/packages/3mdeb-secpack.spdx.json
[CHANGED] root/.dasharo-gnupg/pubring.kbx
[CHANGED] root/.dasharo-gnupg/trustdb.gpg
Summary
-------
Version changes: 0
Removed packages: 0
New packages: 0
Changed files: 2
Removed files: 0
New files: 0
Progress(core2-64): 3%^C
I don't think spdx will show metadata changes, though (such as attributes in the resulting image/rpm).
I'm also unsure whether those methods will show any changes at all. It's possible that the differences are only in the resulting .wic image, e.g. in the filesystem itself, so we would need to either compare binary differences (e.g. vimdiff <(xxd dts-1.wic) <(xxd dts-2.wic)) or create some human-readable dump of all filesystem information (FAT tables, block groups, etc.) and diff that.
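Before opening a full hex diff, a quick byte-level summary can show whether the images differ in a handful of bytes or all over the place. A rough sketch using `cmp -l`; the function names are made up for this example:

```shell
#!/usr/bin/env bash
# Sketch: summarise byte-level differences between two files with cmp.
# count_diff_bytes and first_diff_offsets are hypothetical helper names.

# Number of byte positions at which the two files differ.
count_diff_bytes() {
    cmp -l "$1" "$2" 2>/dev/null | wc -l | tr -d ' '
}

# First N differing offsets (1-based, decimal, as printed by `cmp -l`).
# $3 = how many offsets to print (default 10).
first_diff_offsets() {
    cmp -l "$1" "$2" 2>/dev/null | awk '{ print $1 }' | head -n "${3:-10}"
}
```

For example `count_diff_bytes dts-1.wic dts-2.wic` followed by `first_diff_offsets dts-1.wic dts-2.wic 20`. If the files have different lengths, cmp only compares up to the end of the shorter one, so the count is a lower bound in that case.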
I did a little research:
- SOURCE_DATE_EPOCH does not change across builds
- bitbake dts-base-image -S printdiff prints nothing
I will try the suggestions from the comment above.
sbomdiff shows no differences between spdx files across builds:
./sbomdiff-all.sh spdx-a spdx-b
Progress(x86_64): 100%
@PLangowski did you build the second one without cache (and after removing the old build directory)?
I'm pretty sure you'll have to analyze the wic image and its filesystems. Maybe even compare the gzip metadata in wic.gz to check that it doesn't embed any non-reproducible information (like timestamps).
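For the gzip metadata part: the gzip header stores a modification time (MTIME) as a little-endian 32-bit value in bytes 4 to 7, so it can be read without decompressing anything. A small sketch; `gzip_mtime` is a made-up helper name:

```shell
#!/usr/bin/env bash
# Sketch: print the MTIME field of a gzip header as a Unix timestamp.
# 0 means that no timestamp was stored (e.g. the file was created with gzip -n).
gzip_mtime() {
    # od prints header bytes 4..7 as four unsigned decimal numbers.
    set -- $(od -An -tu1 -j4 -N4 "$1")
    # Assemble the little-endian 32-bit value.
    echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}
```

Running `gzip_mtime` on the wic.gz from two builds and getting different non-zero values would mean that the compression step itself embeds a timestamp.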
You might also want to try a binary diff, e.g.:
diff -u1 <(sudo xxd -l 200 /dev/loop1p1) <(sudo xxd -l 200 /dev/loop2p1)
--- /dev/fd/63 2025-02-03 15:00:11.020622502 +0100
+++ /dev/fd/62 2025-02-03 15:00:11.020622502 +0100
@@ -2,3 +2,3 @@
00000010: 0200 0220 f1f8 4000 2000 0400 0000 0000 ... ..@. .......
-00000020: 0000 0000 8000 29d9 16a6 bf64 7473 2d62 ......)....dts-b
+00000020: 0000 0000 8000 2947 c98d a164 7473 2d62 ......)G...dts-b
00000030: 6f6f 7420 2020 4641 5431 3620 2020 0e1f oot FAT16 ..
Or interactive:
nvim -d <(sudo xxd -l 200 /dev/loop1p1) <(sudo xxd -l 200 /dev/loop2p1)
This diff is between release 2.2.0 and an image built from the main branch. It was not a clean build though, so that might be a reason for the difference.
I used diffoscope to compare binaries from two clean, cacheless builds. There are some differences in timestamps and UUIDs.
diffoscope dts-1.img dts-2.img --html diff.html
To deal with the UUIDs we would probably need to generate them with a seed based on e.g. the commit, or just hardcode one value. From what I can remember, at least PARTUUID changes. It's possible that the UUID is also randomly generated.
https://github.com/Dasharo/meta-dts/blob/16d4fa0b05b042870286777d5b0f4f62c56f5c1b/meta-dts-distro/wic/usb-stick-dts.wks.in#L3
https://docs.yoctoproject.org/3.3.5/ref-manual/kickstart.html#command-part-or-partition:
--use-uuid: This option is a Wic-specific option that causes Wic to generate a random GUID for the partition. The generated identifier is used in the bootloader configuration to specify the root partition.
--uuid: This option is a Wic-specific option that specifies the partition UUID.
--fsuuid: This option is a Wic-specific option that specifies the filesystem UUID. You can generate or modify WKS_FILE with this option if a preconfigured filesystem UUID is added to the kernel command line in the bootloader configuration before you run Wic.
Serial number in diffoscope output might be PARTUUID.
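A rough sketch of the "seed based on the commit" idea: hash the seed and cut the digest into the usual 8-4-4-4-12 layout. This is only an illustration. `seeded_uuid` is a made-up name, the result is not a standards-conformant UUIDv5, and whether Wic accepts the value as-is for `--uuid` or `--fsuuid` on a given partition type is an assumption that would need checking:

```shell
#!/usr/bin/env bash
# Sketch: derive a stable UUID-shaped string from a seed such as a commit hash.
# Same seed in, same string out; it only borrows the textual UUID layout.
seeded_uuid() {
    printf '%s' "$1" \
        | sha256sum \
        | cut -c1-32 \
        | sed -E 's/^(.{8})(.{4})(.{4})(.{4})(.{12})$/\1-\2-\3-\4-\5/'
}
```

For example `seeded_uuid "$(git rev-parse HEAD)"` gives the same value for every build of a given commit, while a hardcoded value would stay the same across commits as well.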
You should probably make a checklist here of what needs to be fixed for a reproducible build and then fix the items one by one.
@PLangowski could you post spdx for core2-64/packages/3mdeb-secpack.spdx.json?
Based on the diffoscope output, ./root/.dasharo-gnupg/trustdb.gpg differs between builds, so ./sbomdiff-all.sh spdx-a spdx-b should've detected different hashes.
Here is the diff:
diffoscope 1-3mdeb-secpack.spdx.json 2-3mdeb-secpack.spdx.json
--- 1-3mdeb-secpack.spdx.json
+++ 2-3mdeb-secpack.spdx.json
├── Pretty-printed
│ @@ -1,42 +1,42 @@
│ {
│ "SPDXID": "SPDXRef-DOCUMENT",
│ "creationInfo": {
│ "comment": "This document was created by analyzing packages created during the build.",
│ - "created": "2025-02-13T08:16:51Z",
│ + "created": "2025-02-13T09:17:06Z",
│ "creators": [
│ "Tool: OpenEmbedded Core create-spdx.bbclass",
│ "Organization: OpenEmbedded ()",
│ "Person: N/A ()"
│ ],
│ "licenseListVersion": "3.14"
│ },
│ "dataLicense": "CC0-1.0",
│ "documentNamespace": "http://spdx.org/spdxdoc/3mdeb-secpack-3039534a-88e0-5917-a508-6111bad28668",
│ "externalDocumentRefs": [
│ {
│ "checksum": {
│ "algorithm": "SHA1",
│ - "checksumValue": "8b56e4aa2ce061d81966b37a93c7911c6742d909"
│ + "checksumValue": "5eb1ef630de68cdc6cee114aed4ba6fac5fb2150"
│ },
│ "externalDocumentId": "DocumentRef-recipe-3mdeb-secpack",
│ "spdxDocument": "http://spdx.org/spdxdoc/recipe-3mdeb-secpack-6d87dc6d-1020-5ab7-a332-20efb1e896a9"
│ }
│ ],
│ "files": [
│ {
│ "SPDXID": "SPDXRef-PackagedFile-3mdeb-secpack-1",
│ "checksums": [
│ {
│ "algorithm": "SHA1",
│ - "checksumValue": "1c1413cd26b66e5c56375f9dfc526ea8f874d5fe"
│ + "checksumValue": "22298b0cb9bdf5930f3cd2972cdd5bb3634bc154"
│ },
│ {
│ "algorithm": "SHA256",
│ - "checksumValue": "38ac26594155c2dbeed284369d47fe99b41f696069c35aec2aac4cba542c0830"
│ + "checksumValue": "3fc28075520e396042de41ff0c0aa58179bf0c96275ff6e4c3050222381afbca"
│ }
│ ],
│ "copyrightText": "NOASSERTION",
│ "fileName": "root/.dasharo-gnupg/pubring.kbx",
│ "fileTypes": [
│ "BINARY"
│ ],
│ @@ -46,19 +46,19 @@
│ ]
│ },
│ {
│ "SPDXID": "SPDXRef-PackagedFile-3mdeb-secpack-2",
│ "checksums": [
│ {
│ "algorithm": "SHA1",
│ - "checksumValue": "4df07ae72843524269ab21960f3ae62850f628c9"
│ + "checksumValue": "e0bd9428f8af9f39368fc93e73b3329d545ff10e"
│ },
│ {
│ "algorithm": "SHA256",
│ - "checksumValue": "e7b31ee9fbc9c91a8de73cb3ff099fbdfbf96cef2d546a97b1813a4e8df440aa"
│ + "checksumValue": "4b6aa35e1c743e14a5eb2df9bfb39e04c2de4394c7047f1a63ff09f2d1832f32"
│ }
│ ],
│ "copyrightText": "NOASSERTION",
│ "fileName": "root/.dasharo-gnupg/trustdb.gpg",
│ "fileTypes": [
│ "BINARY"
│ ],
│ @@ -104,15 +104,15 @@
│ "licenseConcluded": "NOASSERTION",
│ "licenseDeclared": "MIT",
│ "licenseInfoFromFiles": [
│ "NOASSERTION"
│ ],
│ "name": "3mdeb-secpack",
│ "packageVerificationCode": {
│ - "packageVerificationCodeValue": "7425be22a595f9ea5780634d55ff23f2850eb1c8"
│ + "packageVerificationCodeValue": "052c2f845115f44570bcc593c53dc029b632ccfc"
│ },
│ "supplier": "Organization: OpenEmbedded ()",
│ "versionInfo": "1.0+git"
│ }
│ ],
│ "relationships": [
│ {
@PLangowski
sbomdiff shows no differences between spdx files across builds
How did you test it? Did you make sure to use the commit with my changes? It works for me (at least for this one file).
Yes, I used your commit
- [ ] Timestamp differences
- [ ] UUIDs/build ID/other IDs
- [ ] Missing files across builds
- [ ] Other diffs in binary files
I reinstalled sbomdiff and this time differences do appear:
./sbomdiff-all.sh spdx-1 spdx-2
Progress(core2-64): 1%
Difference in: core2-64/packages/3mdeb-secpack.spdx.json
[CHANGED] root/.dasharo-gnupg/pubring.kbx
[CHANGED] root/.dasharo-gnupg/trustdb.gpg
Summary
-------
Version changes: 0
Removed packages: 0
New packages: 0
Changed files: 2
Removed files: 0
New files: 0
Progress(core2-64): 7%
Difference in: core2-64/packages/cpuid-doc.spdx.json
[CHANGED] usr/share/man/man1/cpuinfo2cpuid.1.gz
Summary
-------
Version changes: 0
Removed packages: 0
New packages: 0
Changed files: 1
Removed files: 0
New files: 0
Progress(core2-64): 13%
Difference in: core2-64/packages/futil-dbg.spdx.json
[CHANGED] usr/bin/.debug/futility
Summary
-------
Version changes: 0
Removed packages: 0
New packages: 0
Changed files: 1
Removed files: 0
New files: 0
Difference in: core2-64/packages/futil.spdx.json
[CHANGED] usr/bin/futility
Summary
-------
Version changes: 0
Removed packages: 0
New packages: 0
Changed files: 1
Removed files: 0
New files: 0
Difference in: core2-64/packages/futil-src.spdx.json
[CHANGED] usr/src/debug/futil/1.0+git/build/gen/futility_cmds.c
Summary
-------
Version changes: 0
Removed packages: 0
New packages: 0
Changed files: 1
Removed files: 0
New files: 0
Progress(core2-64): 70%
Difference in: core2-64/packages/lshw.spdx.json
[CHANGED] usr/share/lshw/usb.ids.gz
[CHANGED] usr/share/lshw/pnp.ids.gz
[CHANGED] usr/share/lshw/oui.txt.gz
[CHANGED] usr/share/lshw/pci.ids.gz
[CHANGED] usr/share/lshw/pnpid.txt.gz
[CHANGED] usr/share/lshw/manuf.txt.gz
Summary
-------
Version changes: 0
Removed packages: 0
New packages: 0
Changed files: 6
Removed files: 0
New files: 0
Progress(core2-64): 71%
Difference in: core2-64/packages/minio-cli-dbg.spdx.json
[CHANGED] usr/bin/.debug/mc
Summary
-------
Version changes: 0
Removed packages: 0
New packages: 0
Changed files: 1
Removed files: 0
New files: 0
Difference in: core2-64/packages/minio-cli.spdx.json
[CHANGED] usr/bin/mc
Summary
-------
Version changes: 0
Removed packages: 0
New packages: 0
Changed files: 1
Removed files: 0
New files: 0
Progress(x86_64): 100%
Also, I tried fixing the timestamp differences in the lshw recipe:
do_configure:prepend() {
export SOURCE_DATE_EPOCH=${SOURCE_DATE_EPOCH}
}
I thought that maybe this variable needed to be exported, but it did not help.
@PLangowski I'm pretty sure SOURCE_DATE_EPOCH is already exported in all recipes (unless it's overwritten).
Changes in .debug files could be fixed by not including the debug packages (I think those are generated automatically: ${PN}-dbg) if we don't need them (and we probably don't).
Of course, it's possible we don't include those packages in the first place. It's likely that spdx is created for all packages, even those that are not installed in the default image.
As for lshw, all differences are in gz files, so you need to check how to make gz creation reproducible and possibly modify the recipe or even the source code.
I will try adding EXTRA_OEMAKE += "GZIP='gzip -9 -n'" to lshw so that gzip does not add timestamps
That would likely work. Another way would be to make sure that the files that are compressed always have the same last modification date (as that's where this timestamp is taken from).
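The effect of `-n` can be checked outside of the build with two copies of the same file that only differ in modification time. A minimal sketch; the file names and the `gz_hash` helper are made up for this example:

```shell
#!/usr/bin/env bash
# Sketch: compare gzip output for identical content with different mtimes.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/a" "$demo_dir/b"
printf 'same content\n' > "$demo_dir/a/pci.ids"
printf 'same content\n' > "$demo_dir/b/pci.ids"

# Give the two inputs different modification times.
touch -t 200001010000 "$demo_dir/a/pci.ids"
touch -t 201001010000 "$demo_dir/b/pci.ids"

# Print the SHA-256 of the compressed stream.
# $1 = input file, remaining arguments = extra gzip flags.
gz_hash() {
    file=$1
    shift
    gzip "$@" -c "$file" | sha256sum | cut -d' ' -f1
}

echo "default a: $(gz_hash "$demo_dir/a/pci.ids")"
echo "default b: $(gz_hash "$demo_dir/b/pci.ids")"
echo "with -n a: $(gz_hash "$demo_dir/a/pci.ids" -n)"
echo "with -n b: $(gz_hash "$demo_dir/b/pci.ids" -n)"
```

With GNU gzip the two default hashes are expected to differ because the header carries the input's mtime, while the two `-n` hashes should match. Setting both inputs to the same mtime before compressing is the other option mentioned above.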
- Is it something to be fixed in Yocto upstream layers?
- Does it matter that much to have reproducible compression if the .wic image inside is reproducible?
I admit I have not gone through the whole discussion, just chimed in for a moment.
@macpijan it's not wic.gz that's not reproducible, it's gz files inside rootfs created by lshw (and I think at least one other) recipe.
I introduced the following patch:
src/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/Makefile b/src/Makefile
index ac726d0..cb9386e 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -67,7 +67,7 @@ all: $(PACKAGENAME) $(PACKAGENAME).1 $(DATAFILES)
$(CXX) $(CXXFLAGS) -c $< -o $@
%.gz: %
- $(GZIP) -c $< > $@
+ $(GZIP) -n -c $< > $@
.PHONY: core
core:
--
2.47.0
And the files from lshw no longer differ