btrfs-progs
btrfs-progs: check that device byte values in superblock match those in chunk root
The superblock of each device contains a copy of the corresponding struct btrfs_dev_item that lives in the chunk root.
Add a check that the total_bytes and bytes_used values of these two copies match.
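As a rough illustration of the comparison described above, here is a minimal sketch. It is not the code from the patch: `struct dev_item_bytes` and `compare_dev_item_bytes()` are hypothetical names, and the struct is a simplified stand-in that carries only the two fields being compared.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical, simplified stand-in for the two fields of interest in
 * struct btrfs_dev_item. The real on-disk item has many more fields.
 */
struct dev_item_bytes {
	uint64_t total_bytes;
	uint64_t bytes_used;
};

/*
 * Compare the copy embedded in the superblock with the copy read from
 * the chunk root. Returns the number of mismatching fields (0 if the
 * two copies agree) and reports each mismatch on stderr.
 */
int compare_dev_item_bytes(const struct dev_item_bytes *sb,
			   const struct dev_item_bytes *chunk)
{
	int mismatches = 0;

	if (sb->total_bytes != chunk->total_bytes) {
		fprintf(stderr,
			"total_bytes mismatch: superblock %" PRIu64
			", chunk root %" PRIu64 "\n",
			sb->total_bytes, chunk->total_bytes);
		mismatches++;
	}
	if (sb->bytes_used != chunk->bytes_used) {
		fprintf(stderr,
			"bytes_used mismatch: superblock %" PRIu64
			", chunk root %" PRIu64 "\n",
			sb->bytes_used, chunk->bytes_used);
		mismatches++;
	}
	return mismatches;
}
```

A caller would fill one struct from the superblock's embedded device item and the other from the device item found in the chunk tree, then treat a non-zero return as an inconsistency.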
The CI failure is caused by an old image that was created by an older kernel.
I have updated the image with this PR: https://github.com/kdave/btrfs-progs/pull/992
That should solve the test failure for fsck/020.
But there are still some more problems that need to be solved:
- fsck/047: This requires proper repair support. Your enhanced check doesn't re-read the updated superblock after repair, so the test case fails. This needs some updates to your patch.
- fsck/057: The check does not handle seed devices correctly. The sprout fs can modify the old chunks on the seed device, e.g. remove the old empty SYSTEM chunk. But since the seed device is completely read-only, its device item cannot be updated, which causes the check error. The patch should not check a seed device against the sprouted fs.
Cheers Qu, I'll have a look
> But there are still some more problems that need to be solved:
> - fsck/047: This requires proper repair support. Your enhanced check doesn't re-read the updated superblock after repair, so the test case fails. This needs some updates to your patch.
This wasn't quite right. The problem was that the dev_rec->byte_used value wasn't getting updated in check_device_used(). I've force-pushed the patch with this change squashed in.
> - fsck/057: The check does not handle seed devices correctly. The sprout fs can modify the old chunks on the seed device, e.g. remove the old empty SYSTEM chunk. But since the seed device is completely read-only, its device item cannot be updated, which causes the check error. The patch should not check a seed device against the sprouted fs.
Unfortunately this wasn't right either. btrfs-check only works on one device at a time, so if the superblock is read-only on one device, the metadata on it will be too. The problem was that this test does a mount, and it was the kernel causing the corruption. Cherry-picking 9516bae0d79045004f0b64b1f852d177cacee094 makes the problem go away.
Thanks a lot for updating the analysis; my quick guess was as bad as usual.
We can merge the series once the upstream kernel has the fix, although the CI may be problematic for a while until the CI kernel gets the fix backported.
No worries, thanks Qu
Considering it's still causing problems in the CI systems (the kernel does not have the backport), I'll change the check to emit a warning instead of an error.
Qu, if you fix it so it works in the CI then please add it to devel. Thanks.
The updated version is here:
https://lore.kernel.org/linux-btrfs/[email protected]/
I'll push that version to devel.