ZTS: Use QEMU for tests on Linux and FreeBSD
Motivation and Context
We need more tests on systems other than Ubuntu.
Description
This commit adds functional tests for these systems:
- AlmaLinux 8, AlmaLinux 9
- ArchLinux
- CentOS Stream 8, CentOS Stream 9
- Fedora 38, Fedora 39
- Debian 11, Debian 12
- FreeBSD 13, FreeBSD 14, FreeBSD 15
- Ubuntu 22.04, Ubuntu 24.04
Workflow for each operating system:
- install QEMU on the GitHub runner
- download the current cloud image
- start and initialize that image via cloud-init
- install dependencies and power off the system
- start the system, build OpenZFS, and power off again
- clone the system and start 3 QEMU machines for tests
- use trimmable virtual disks (3x 2GB)
- run the functional tests in < 3h
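To illustrate the "trimmable virtual disks (3x 2GB)" step, here is a minimal sketch of one way such disks could be prepared. It is not taken from the scripts in this PR: the helper name `make_test_disks`, the file names, and the choice of sparse raw images are assumptions for illustration. `discard=unmap` is a standard QEMU `-drive` option.

```shell
#!/bin/sh
# Hypothetical helper (not from this PR): create COUNT sparse raw images
# of SIZE in DIR and print one QEMU -drive option per image.
make_test_disks() {
	dir=$1
	count=$2
	size=$3
	i=1
	while [ "$i" -le "$count" ]; do
		img="$dir/disk$i.img"
		# truncate creates a sparse file: no space is used until written
		truncate -s "$size" "$img" || return 1
		# discard=unmap asks QEMU to pass guest discard requests to the image
		printf '%s\n' "-drive file=$img,format=raw,if=virtio,discard=unmap"
		i=$((i + 1))
	done
}

# Example: the three 2 GB disks mentioned in the workflow
workdir=$(mktemp -d)
make_test_disks "$workdir" 3 2G
```

The printed options could be appended to a `qemu-system-x86_64` command line. Whether the guest actually sees the disks as trimmable also depends on the device model and QEMU version in use.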
How Has This Been Tested?
This has been tested on my own repo, but more testing is needed....
Types of changes
- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Performance enhancement (non-breaking change which improves efficiency)
- [x] Code cleanup (non-breaking change which makes code smaller or more readable)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
- [ ] Documentation (a change to man pages or other documentation)
Checklist:
- [x] My code follows the OpenZFS code style requirements.
- [x] I have updated the documentation accordingly.
- [x] I have read the contributing document.
- [x] I have added tests to cover my changes.
- [x] I have run the ZFS Test Suite with this change applied.
- [x] All commit messages are properly formatted and contain Signed-off-by.
I think most of the failing FreeBSD tests will be fixed by starting nfsd and samba.
@mcmilk I see you currently have this marked as "Draft". When you think it's ready to be reviewed, please let us know and we can take a look.
Seems ready. I included the FreeBSD src.txz within the FreeBSD cloud image.
But these tests will take some time ;-)
Note: I'm actively testing this PR in #16195. Right now I'm running down a bunch of test failures.
I am back from holiday and will also help. I'll investigate the serial console thing first.
It's not final.
The summary isn't ready and some debug things need to be removed.
Can I leave the Ubuntu tests out? Reason: we have 20 actions runners, this PR needs 15:
- 1x for checkstyle
- 1x for CodeQL
- 13x for the different systems
I would like to add some SUSE distribution as well.
Just to make things easier (and not use so many runners), you can exclude the debian* centos-stream* and archlinux runners, since we currently don't support them in buildbot. And when I say exclude, I mean just don't include them in zfs-linux.yml, but keep the rest of the support code you've written (like debian() and archlinux()).
I think it's done now. We can remove the "Status: Work in Progress" badge....
@tonyhutter - What do you think?
@mcmilk that's great news! I'll take a look once all the runners report back.
I force pushed again and removed centos-stream-9 and some debugging things within the scripts.
I have seen that you would like to split the tests into fractions like this: 1/3 2/3 ... do you want to add this later or is this just an idea?
I have added FreeBSD 13.3 RELEASE and FreeBSD 14.1 RELEASE to the tests. It would be nice if we could also add Debian 11 + 12 to the tests by default.
I have seen that you would like to split the tests into fractions like this: 1/3 2/3 ... do you want to add this later or is this just an idea?
Correct, right now it's just an idea. I think it might help with some timing-related failures like:
almalinux8: auto_replace_002_pos
Fedora 40: zpool_status_008_pos
I also vaguely remember buildbot giving me issues if I ran with instances that had less than 8GB RAM. That's why I'm curious whether running 2 VMs with 8GB RAM might make many of these failures go away. I'm starting to get my variable-number-of-VMs code working with 2 VMs in my testing PR (https://github.com/tonyhutter/zfs/pull/1), but I haven't gotten a full run working yet. Once I can get a full run with 2 VMs tested, I want to compare its failures to the remaining failures in this PR. That will help us understand if the failures are timing/underpowered-VM related, or if we need to do some manual fixes to the tests.
Oh no, I forgot the changed zfs-tests.sh script for this pull request :(
Almalinux 8+9, Debian and the FreeBSD 13+14 systems should go green now.
It would be easier, and faster, if the GitHub runners had 16 GB more RAM. I think the PR is ready now.
@mcmilk I think we might be missing some stderr output on the QEMU builders. For example, here's the same ZTS bug (https://github.com/openzfs/zfs/issues/16439) on both builders:
QEMU:
config.status: executing depfiles commands
config.status: executing libtool commands
config.status: executing po-directories commands
make[2]: Entering directory '/tmp/zfs-build-zfs-yeSNFC5X/BUILD/zfs-kmod-2.2.99/_kmod_build_6.10.3-100.fc39.x86_64'
GEN gitrev
make all-recursive
make[3]: Entering directory '/tmp/zfs-build-zfs-yeSNFC5X/BUILD/zfs-kmod-2.2.99/_kmod_build_6.10.3-100.fc39.x86_64'
Making all in include
make[4]: Entering directory '/tmp/zfs-build-zfs-yeSNFC5X/BUILD/zfs-kmod-2.2.99/_kmod_build_6.10.3-100.fc39.x86_64/include'
make[4]: Nothing to be done for 'all'.
make[4]: Leaving directory '/tmp/zfs-build-zfs-yeSNFC5X/BUILD/zfs-kmod-2.2.99/_kmod_build_6.10.3-100.fc39.x86_64/include'
Making all in module
make[4]: Entering directory '/tmp/zfs-build-zfs-yeSNFC5X/BUILD/zfs-kmod-2.2.99/_kmod_build_6.10.3-100.fc39.x86_64/module'
mkdir -p os/linux/spl/
mkdir -p avl/ icp/ icp/algs/aes/ icp/algs/blake3/ icp/algs/edonr/ icp/algs/modes/ icp/algs/sha2/ icp/algs/skein/ icp/api/ icp/asm-aarch64/blake3/ icp/asm-aarch64/sha2/ icp/asm-arm/sha2/ icp/asm-ppc64/blake3/ icp/asm-ppc64/sha2/ icp/asm-x86_64/aes/ icp/asm-x86_64/blake3/ icp/asm-x86_64/modes/ icp/asm-x86_64/sha2/ icp/core/ icp/io/ icp/spi/ lua/ lua/setjmp/ nvpair/ os/linux/zfs/ unicode/ zcommon/ zfs/ zstd/ zstd/lib/common/ zstd/lib/compress/ zstd/lib/decompress/
make -C /usr/src/kernels/6.10.3-100.fc39.x86_64 \
\
M="$PWD" CONFIG_DEBUG_INFO=y CONFIG_ZFS=m modules
make[5]: Entering directory '/usr/src/kernels/6.10.3-100.fc39.x86_64'
make[5]: Leaving directory '/usr/src/kernels/6.10.3-100.fc39.x86_64'
make[4]: Leaving directory '/tmp/zfs-build-zfs-yeSNFC5X/BUILD/zfs-kmod-2.2.99/_kmod_build_6.10.3-100.fc39.x86_64/module'
make[3]: Leaving directory '/tmp/zfs-build-zfs-yeSNFC5X/BUILD/zfs-kmod-2.2.99/_kmod_build_6.10.3-100.fc39.x86_64'
make[2]: Leaving directory '/tmp/zfs-build-zfs-yeSNFC5X/BUILD/zfs-kmod-2.2.99/_kmod_build_6.10.3-100.fc39.x86_64'
RPM build warnings:
RPM build errors:
make[1]: Leaving directory '/home/zfs/zfs'
https://github.com/openzfs/zfs/actions/runs/10388084683/job/28762944809
BUILDBOT:
config.status: executing depfiles commands
config.status: executing libtool commands
config.status: executing po-directories commands
+ make -j2
make[2]: Entering directory '/tmp/zfs-build-buildbot-2Wo4V8Y2/BUILD/zfs-kmod-2.2.99/_kmod_build_6.10.3-100.fc39.x86_64'
GEN gitrev
make all-recursive
make[3]: Entering directory '/tmp/zfs-build-buildbot-2Wo4V8Y2/BUILD/zfs-kmod-2.2.99/_kmod_build_6.10.3-100.fc39.x86_64'
Making all in include
make[4]: Entering directory '/tmp/zfs-build-buildbot-2Wo4V8Y2/BUILD/zfs-kmod-2.2.99/_kmod_build_6.10.3-100.fc39.x86_64/include'
make[4]: Nothing to be done for 'all'.
make[4]: Leaving directory '/tmp/zfs-build-buildbot-2Wo4V8Y2/BUILD/zfs-kmod-2.2.99/_kmod_build_6.10.3-100.fc39.x86_64/include'
Making all in module
make[4]: Entering directory '/tmp/zfs-build-buildbot-2Wo4V8Y2/BUILD/zfs-kmod-2.2.99/_kmod_build_6.10.3-100.fc39.x86_64/module'
mkdir -p os/linux/spl/
mkdir -p avl/ icp/ icp/algs/aes/ icp/algs/blake3/ icp/algs/edonr/ icp/algs/modes/ icp/algs/sha2/ icp/algs/skein/ icp/api/ icp/asm-aarch64/blake3/ icp/asm-aarch64/sha2/ icp/asm-arm/sha2/ icp/asm-ppc64/blake3/ icp/asm-ppc64/sha2/ icp/asm-x86_64/aes/ icp/asm-x86_64/blake3/ icp/asm-x86_64/modes/ icp/asm-x86_64/sha2/ icp/core/ icp/io/ icp/spi/ lua/ lua/setjmp/ nvpair/ os/linux/zfs/ unicode/ zcommon/ zfs/ zstd/ zstd/lib/common/ zstd/lib/compress/ zstd/lib/decompress/
make -C /usr/src/kernels/6.10.3-100.fc39.x86_64 \
\
M="$PWD" CONFIG_DEBUG_INFO=y CONFIG_ZFS=m modules
make[5]: Entering directory '/usr/src/kernels/6.10.3-100.fc39.x86_64'
make[7]: *** No rule to make target '/tmp/zfs-build-buildbot-2Wo4V8Y2/BUILD/zfs-kmod-2.2.99/_kmod_build_6.10.3-100.fc39.x86_64/module/os/linux/spl/spl-atomic.o', needed by '/tmp/zfs-build-buildbot-2Wo4V8Y2/BUILD/zfs-kmod-2.2.99/_kmod_build_6.10.3-100.fc39.x86_64/module/spl.o'. Stop.
make[7]: *** Waiting for unfinished jobs....
make[6]: *** [/usr/src/kernels/6.10.3-100.fc39.x86_64/Makefile:1946: /tmp/zfs-build-buildbot-2Wo4V8Y2/BUILD/zfs-kmod-2.2.99/_kmod_build_6.10.3-100.fc39.x86_64/module] Error 2
make[5]: *** [Makefile:252: __sub-make] Error 2
make[5]: Leaving directory '/usr/src/kernels/6.10.3-100.fc39.x86_64'
make[4]: *** [Makefile:56: modules-Linux] Error 2
make[4]: Leaving directory '/tmp/zfs-build-buildbot-2Wo4V8Y2/BUILD/zfs-kmod-2.2.99/_kmod_build_6.10.3-100.fc39.x86_64/module'
make[3]: *** [Makefile:12324: all-recursive] Error 1
make[3]: Leaving directory '/tmp/zfs-build-buildbot-2Wo4V8Y2/BUILD/zfs-kmod-2.2.99/_kmod_build_6.10.3-100.fc39.x86_64'
make[2]: *** [Makefile:4652: all] Error 2
make[2]: Leaving directory '/tmp/zfs-build-buildbot-2Wo4V8Y2/BUILD/zfs-kmod-2.2.99/_kmod_build_6.10.3-100.fc39.x86_64'
error: Bad exit status from /tmp/zfs-build-buildbot-2Wo4V8Y2/TMP/rpm-tmp.egsMTM (%build)
RPM build warnings:
source_date_epoch_from_changelog set but %changelog is missing
RPM build errors:
Bad exit status from /tmp/zfs-build-buildbot-2Wo4V8Y2/TMP/rpm-tmp.egsMTM (%build)
make[1]: *** [Makefile:14511: rpm-common] Error 1
make[1]: Leaving directory '/var/lib/buildbot/slaves/zfs/Fedora_39_x86_64__TEST_/build/zfs'
make: *** [Makefile:14445: rpm-kmod] Error 2
https://build.openzfs.org/builders/Fedora%2039%20x86_64%20%28TEST%29/builds/2491/steps/shell_1/logs/make
I fixed these things:
- the stderr messages are sent to the github runner again now
- I rewrote the run() function completely; the return value of a failed run command is printed and used later
- I also defined a DEBUG_MAX variable in qemu-7-summary.sh, so we don't output some really big debug file directly to the browser
- rebased to master
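For readers who have not looked at the scripts, the two ideas above (a run() wrapper that keeps the exit status, and a size cap on debug output) can be sketched roughly as follows. This is not the code from the PR: the variable `RV`, the helper `show_capped`, and the default cap value are assumptions made up for this illustration. Only the names run() and DEBUG_MAX come from the comment above.

```shell
#!/bin/sh
# Hypothetical run() wrapper (not the PR's implementation): execute a
# command, report a failure on stderr, and keep the exit status in RV
# so later steps can use it.
RV=0
run() {
	"$@"
	rc=$?
	if [ "$rc" -ne 0 ]; then
		echo "run: '$*' failed with exit status $rc" >&2
		RV=$rc
	fi
	return "$rc"
}

# Assumed cap in bytes; the real value in qemu-7-summary.sh may differ.
DEBUG_MAX=${DEBUG_MAX:-100000}

# Print at most DEBUG_MAX bytes of a file and note when it was cut.
# head -c is not strictly POSIX but is available on Linux and FreeBSD.
show_capped() {
	file=$1
	size=$(wc -c < "$file" | tr -d ' ')
	head -c "$DEBUG_MAX" "$file"
	if [ "$size" -gt "$DEBUG_MAX" ]; then
		echo
		echo "[truncated: showing $DEBUG_MAX of $size bytes]"
	fi
}

# Example: a failing command does not abort the script, but is recorded.
run sh -c 'exit 3' || echo "continuing, RV=$RV"
```

Keeping the status in a variable lets a summary step decide later whether the job should be marked as failed, instead of stopping at the first error.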
An older testrun with failing Fedora 39+40 is here: https://github.com/mcmilk/zfs/actions/runs/10414909636
TODO:
- detect kernel hangs and show them explicitly
- maybe restart such VMs and download the logfiles
- increase DEBUG_MAX to around 400KB
@mcmilk this will take care of the checkstyle issues:
diff --git a/scripts/zfs-tests.sh b/scripts/zfs-tests.sh
index 957e674be..fde2e4acb 100755
--- a/scripts/zfs-tests.sh
+++ b/scripts/zfs-tests.sh
@@ -1,4 +1,4 @@
-#!/usr/bin/env bash
+#!/bin/sh
# shellcheck disable=SC2154
#
# CDDL HEADER START
@@ -215,8 +215,8 @@ find_runfile() {
#
split_tags() {
# Get numerator and denominator
- NUM=$(echo $TAGS | cut -d/ -f1)
- DEN=$(echo $TAGS | cut -d/ -f2)
+ NUM=$(echo "$TAGS" | cut -d/ -f1)
+ DEN=$(echo "$TAGS" | cut -d/ -f2)
# At the point this is called, RUNFILES will contain a comma separated
# list of full paths to the runfiles, like:
#
@@ -242,9 +242,12 @@ split_tags() {
#
# "append,atime,bootfs,cachefile,checksum,cp_files,deadman,dos_attributes, ..."
- cat ${RUNFILES/,/ } | tr -d [],\' | awk '/tags = /{print $NF}' | sort | \
+ # Change the comma to a space for easy processing
+ _RUNFILES="$(echo """$RUNFILES""" | sed 's/,/ /g')"
+ # shellcheck disable=SC2002,SC2086
+ cat $_RUNFILES | tr -d "[],\'" | awk '/tags = /{print $NF}' | sort | \
uniq | grep -v functional | \
- awk -v num=$NUM -v den=$DEN '{ if(NR % den == (num - 1)) {printf "%s,",$0}}' | \
+ awk -v num="$NUM" -v den="$DEN" '{ if(NR % den == (num - 1)) {printf "%s,",$0}}' | \
sed -E 's/,$//'
}
@@ -568,7 +571,7 @@ RUNFILES=${R#,}
#
# "append,atime,bootfs,cachefile,checksum,cp_files,deadman,dos_attributes, ..."
#
-if echo $TAGS | grep -Eq '^[0-9]+/[0-9]+$' ; then
+if echo "$TAGS" | grep -Eq '^[0-9]+/[0-9]+$' ; then
TAGS=$(split_tags)
fi
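To make the fraction logic in split_tags easier to follow, here is a small standalone sketch that applies the same awk filter (`NR % den == (num - 1)`) to a short, made-up tag list. The function name `pick_fraction` and the six sample tags are assumptions for this example. The real script derives the tags from the runfiles.

```shell
#!/bin/sh
# Select slice NUM of DEN from a newline-separated, sorted tag list on
# stdin, using the same awk filter as split_tags in the diff above.
pick_fraction() {
	num=$1
	den=$2
	awk -v num="$num" -v den="$den" \
	    '{ if (NR % den == (num - 1)) { printf "%s,", $0 } }' |
	    sed -E 's/,$//'
}

# Sample tags, assumed for illustration only
tags='append
atime
bootfs
cachefile
checksum
cp_files'

for n in 1 2 3; do
	printf '%s/3: %s\n' "$n" "$(printf '%s\n' "$tags" | pick_fraction "$n" 3)"
done
```

With these six tags, 1/3 selects rows 3 and 6 (bootfs,cp_files), 2/3 selects rows 1 and 4 (append,cachefile), and 3/3 selects rows 2 and 5 (atime,checksum), so every tag lands in exactly one slice.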
I am testing zram disks again; it looks like they will speed up the whole thing a lot.
The checkstyle fixups will get included, thank you.