WIP: ZTS KASAN NixOS integration test
Continuing from @behlendorf's comment, this PR adds a NixOS-based integration test that automates running ZTS on a KASAN-enabled build of the NixOS kernel.
Description
This is an attempt to automate kernel building and VM creation for #12226.
The package derivation is based on the nixpkgs ZFS package definition. Interestingly, the default NixOS kernel configuration already ships with DEBUG enabled.
Current status:
- Without KASAN enabled, ZTS runs, but with failures.
- With KASAN enabled, the module fails immediately after loading with:
vm-test-run-zts> machine # [ 52.343124] zfs: module license 'CDDL' taints kernel.
vm-test-run-zts> machine # [ 52.344863] Disabling lock debugging due to kernel taint
vm-test-run-zts> machine # [ 52.346943] zfs: module license taints kernel.
vm-test-run-zts> machine # [ 53.971981] systemd[1]: Condition check resulted in /dev/hvc0 being skipped.
vm-test-run-zts> machine # [ 54.264868] systemd[1]: Condition check resulted in /dev/ttyS0 being skipped.
vm-test-run-zts> machine # [ 54.951168] __kmem_cache_create_args(zio_buf_512) failed with error -22
vm-test-run-zts> machine # [ 54.953143] CPU: 3 UID: 0 PID: 365 Comm: systemd-modules Tainted: P O 6.12.57 #1-NixOS
vm-test-run-zts> machine # [ 54.953153] Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE
vm-test-run-zts> machine # [ 54.953156] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
vm-test-run-zts> machine # [ 54.953160] Call Trace:
vm-test-run-zts> machine # [ 54.953163] <TASK>
vm-test-run-zts> machine # [ 54.953167] dump_stack_lvl+0x5d/0x80
vm-test-run-zts> machine # [ 54.953178] __kmem_cache_create_args.cold+0x21/0x47
vm-test-run-zts> machine # [ 54.953188] kmem_cache_create_usercopy.constprop.0+0x45/0x70 [spl]
vm-test-run-zts> machine # [ 54.953217] spl_kmem_cache_create+0x2a2/0x450 [spl]
vm-test-run-zts> machine # [ 54.953243] zio_init+0x326/0x3b0 [zfs]
vm-test-run-zts> machine # [ 54.954364] spa_init+0x137/0x190 [zfs]
vm-test-run-zts> machine # [ 54.954812] zfs_kmod_init+0x30/0xe0 [zfs]
vm-test-run-zts> machine # [ 54.955245] openzfs_init_os+0xf/0xa0 [zfs]
vm-test-run-zts> machine # [ 54.955897] openzfs_init+0x34/0xd00 [zfs]
vm-test-run-zts> machine # [ 54.956322] ? __pfx_openzfs_init+0x10/0x10 [zfs]
vm-test-run-zts> machine # [ 54.956745] do_one_initcall+0xa7/0x380
vm-test-run-zts> machine # [ 54.956755] ? __pfx_do_one_initcall+0x10/0x10
vm-test-run-zts> machine # [ 54.956763] ? srso_alias_return_thunk+0x5/0xfbef5
vm-test-run-zts> machine # [ 54.956774] do_init_module+0x2ec/0x860
vm-test-run-zts> machine # [ 54.956783] load_module+0x4e8c/0x6a30
vm-test-run-zts> machine # [ 54.956796] ? __pfx_load_module+0x10/0x10
vm-test-run-zts> machine # [ 54.956800] ? do_anonymous_page+0x3d2/0x1870
vm-test-run-zts> machine # [ 54.956808] ? __do_sys_init_module+0x178/0x280
vm-test-run-zts> machine # [ 54.956815] ? srso_alias_return_thunk+0x5/0xfbef5
vm-test-run-zts> machine # [ 54.956820] ? common_interrupt+0x13/0xa0
vm-test-run-zts> machine # [ 54.956827] ? __pfx__raw_spin_lock+0x10/0x10
vm-test-run-zts> machine # [ 54.956835] ? srso_alias_return_thunk+0x5/0xfbef5
vm-test-run-zts> machine # [ 54.956840] ? srso_alias_return_thunk+0x5/0xfbef5
vm-test-run-zts> machine # [ 54.956845] ? find_vmap_area+0x141/0x180
vm-test-run-zts> machine # [ 54.956853] ? __do_sys_init_module+0x238/0x280
vm-test-run-zts> machine # [ 54.956857] ? srso_alias_return_thunk+0x5/0xfbef5
vm-test-run-zts> machine # [ 54.956862] __do_sys_init_module+0x238/0x280
vm-test-run-zts> machine # [ 54.956867] ? __pfx___do_sys_init_module+0x10/0x10
vm-test-run-zts> machine # [ 54.956871] ? __count_memcg_events+0xdd/0x340
vm-test-run-zts> machine # [ 54.956880] ? srso_alias_return_thunk+0x5/0xfbef5
vm-test-run-zts> machine # [ 54.956885] ? srso_alias_return_thunk+0x5/0xfbef5
vm-test-run-zts> machine # [ 54.956890] ? fpregs_assert_state_consistent+0x20/0xa0
vm-test-run-zts> machine # [ 54.956898] do_syscall_64+0xb7/0x200
vm-test-run-zts> machine # [ 54.956905] entry_SYSCALL_64_after_hwframe+0x77/0x7f
vm-test-run-zts> machine # [ 54.956911] RIP: 0033:0x7f026728f29e
vm-test-run-zts> machine # [ 54.956918] Code: 48 8b 0d 75 5b 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 42 5b 0d 00 f7 d8 64 89 01 48
vm-test-run-zts> machine # [ 54.956922] RSP: 002b:00007fffb18f2d38 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
vm-test-run-zts> machine # [ 54.956929] RAX: ffffffffffffffda RBX: 0000559feab6ede0 RCX: 00007f026728f29e
vm-test-run-zts> machine # [ 54.956932] RDX: 00007f026684c304 RSI: 000000000651ba78 RDI: 00007f025f2d1010
vm-test-run-zts> machine # [ 54.956936] RBP: 00007fffb18f2d80 R08: 0000000000000000 R09: 0000000000000000
vm-test-run-zts> machine # [ 54.956939] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f025f2d1010
vm-test-run-zts> machine # [ 54.956942] R13: 00007f026684c304 R14: 0000000000020000 R15: 0000559feab6e540
vm-test-run-zts> machine # [ 54.956950] </TASK>
To build the package and run the tests, fetch this branch and run the following on a KVM-enabled machine with Nix installed:
nix build -L
nix flake check -L
I haven't tried this on a non-NixOS machine, but it should work. To disable or re-enable KASAN, comment out or restore the boot.kernelPatches and boot.kernelParams settings in test.nix.
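For readers unfamiliar with these NixOS options, here is a minimal sketch of what such a KASAN toggle could look like. It is an illustration only and not the contents of test.nix in this PR: the patch name, the choice of `KASAN = yes` as the only config symbol, and the `kasan_multi_shot` boot parameter are all assumptions.

```nix
# Hypothetical fragment of a NixOS test machine configuration.
# The real test.nix in this PR may be structured differently.
{ lib, ... }:
{
  # Comment out this block to build the stock kernel without KASAN.
  boot.kernelPatches = [
    {
      name = "enable-kasan"; # label only (assumed name); no source patch is applied
      patch = null;
      # Assumption: turning on CONFIG_KASAN alone is enough and the
      # kernel's default KASAN mode is acceptable.
      extraStructuredConfig = with lib.kernel; {
        KASAN = yes;
      };
    }
  ];

  # Assumed parameter: report every KASAN error, not just the first one.
  # Comment this out together with the block above.
  boot.kernelParams = [ "kasan_multi_shot" ];
}
```

Changing the kernel configuration this way forces a full kernel rebuild, so the first run after toggling will take considerably longer than later ones.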
If someone could help unblock me on the module loading error, I can look into integrating this with a GitHub workflow, and possibly a Nix binary cache such as https://www.cachix.org/, as next steps.
How Has This Been Tested?
Types of changes
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Performance enhancement (non-breaking change which improves efficiency)
- [ ] Code cleanup (non-breaking change which makes code smaller or more readable)
- [X] Quality assurance (non-breaking change which makes the code more robust against bugs)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
- [ ] Documentation (a change to man pages or other documentation)
Checklist:
- [ ] My code follows the OpenZFS code style requirements.
- [ ] I have updated the documentation accordingly.
- [ ] I have read the contributing document.
- [ ] I have added tests to cover my changes.
- [ ] I have run the ZFS Test Suite with this change applied.
- [X] I have attempted to run the ZFS Test Suite.
- [ ] All commit messages are properly formatted and contain Signed-off-by.