btrfs-progs

Best practices in case of badblocks

Open Seb35 opened this issue 5 years ago • 3 comments

I have btrfs on my root filesystem on top of LUKS, and I have daily snapshots transferred to another btrfs filesystem. Recently I expanded my btrfs filesystem (online, I was impressed) and copied a fair amount of data there. It turned out there is one bad block in this new space (or at least a sector in poor health), which blocked my send/receive (see #163). I deleted the faulty file and snapshots work again, but I still have a pending bad sector. (I’m not sure why the file could be written but not read back a few days later, but it’s not important.)

The smartmontools extended self-test reports one bad sector (but good health overall) and btrfs scrub reports one uncorrectable error.

What are the best practices to deal with such a case? Run badblocks -sv? Run btrfs check --repair from a live distribution? Do nothing and let btrfs manage the case itself?

I see on the wiki there is a project idea "Bad block tracking" added on 20 September 2015: was it implemented entirely or partially or did it become non-applicable because of other mechanisms?

Thanks! (And thanks for this good filesystem as a whole!)


Versions:

  • Debian 10
  • Linux 4.19
  • btrfs-progs 4.20.1-2

Seb35, Sep 01 '20 19:09

I don’t have RAID1 on the data (only on the metadata); I know it could have been a partial workaround for this case.

Seb35, Sep 01 '20 19:09

Normally, drive firmware handles UNC sectors by returning an I/O error on read, and by remapping the sector to an alternate reserved sector the next time it is written. So all you have to do is replace the missing data from the bad sector, and the drive should deal with the other details. If the drive has other problems (e.g. out of remappable sectors, firmware bugs, or other hardware failure) then the remapping on write will not be possible and the drive should be replaced.
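One way to watch whether the drive actually remapped the sector is to compare SMART counters before and after the rewrite. The sketch below is only an illustration: it assumes an ATA drive for which `smartctl -A` prints the usual ten-column attribute table with names such as `Current_Pending_Sector`; NVMe devices and some vendors report different names, and the helper names are made up for this example.

```python
import subprocess

# Attribute names as commonly printed by smartctl for ATA drives.
# Assumption: your drive uses these names; check your own `smartctl -A` output.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} for the watched attributes."""
    found = {}
    for line in text.splitlines():
        parts = line.split()
        # Attribute rows have ten columns:
        # ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(parts) >= 10 and parts[0].isdigit() and parts[1] in WATCHED:
            raw = parts[9]
            found[parts[1]] = int(raw) if raw.isdigit() else raw
    return found


def read_smart_attributes(device):
    """Run `smartctl -A <device>` (usually needs root) and parse its output."""
    result = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True, text=True, check=False,
    )
    return parse_smart_attributes(result.stdout)
```

If the pending count drops back to zero after the bad sector has been rewritten, the firmware most likely handled it; a steadily growing reallocated count is a hint that the drive is on its way out.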

Scrub will report an error as long as the unreadable block is part of a file or metadata in the filesystem. If you remove the file containing the bad block (including all shared references in reflinks and snapshots) then scrub will no longer complain. If btrfs writes to the bad sector again, then the bad sector remapping in the drive firmware described above should take care of it.
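The check-delete-recheck loop described above can be sketched as a small wrapper. This is a hedged illustration, not an official tool: the function names are invented, the mount point is whatever your filesystem uses, and the logical address passed to `logical-resolve` is assumed to come from the scrub error message in the kernel log.

```python
import subprocess


def scrub_commands(mountpoint):
    """Commands for a foreground scrub followed by a status report."""
    return [
        ["btrfs", "scrub", "start", "-B", mountpoint],  # -B: wait until the scrub finishes
        ["btrfs", "scrub", "status", mountpoint],
    ]


def resolve_command(logical, mountpoint):
    """Command that lists the file paths referencing a logical address."""
    return ["btrfs", "inspect-internal", "logical-resolve", str(logical), mountpoint]


def run_scrub(mountpoint):
    """Run the scrub commands in order (usually needs root)."""
    for cmd in scrub_commands(mountpoint):
        # check=False: scrub exits non-zero when it finds errors, which is expected here
        subprocess.run(cmd, check=False)
```

After deleting the affected file and every snapshot that still references it, running the scrub again should come back clean.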

A bad sector in a data block can be recovered by deleting the file(s) which contain the block. Overwriting the individual bad data blocks may not remove them from the filesystem because of the way btrfs extent reference counting works. All of the blocks in an extent (including all shared references) must be overwritten before the extent is removed. The easiest way to achieve that is to delete or replace the entire file. If you don't want to delete a file (e.g. because it is very large) then you may need to overwrite up to 128MB on either side of a bad block to ensure the extent which contains it is removed. It is possible to use dump-tree or metadata searches to identify the boundaries of the damaged extent, but "+/- 128MB" is good enough in most cases.
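As a rough sketch of the "+/- 128MB" approach, the helper below computes the byte range around a bad offset and rewrites that range from a known-good copy of the same file. The function names and the idea of restoring from a backup copy are assumptions for illustration; the bad offset has to be found separately, and snapshots or reflinks that still reference the old extent will keep it alive regardless.

```python
import os

MAX_EXTENT = 128 * 1024 * 1024  # the "+/- 128MB" margin, taken as 128 MiB


def overwrite_window(bad_offset, file_size, margin=MAX_EXTENT):
    """Return (start, end) byte offsets to rewrite, clamped to the file."""
    start = max(0, bad_offset - margin)
    end = min(file_size, bad_offset + margin)
    return start, end


def restore_range(backup_path, damaged_path, start, end, chunk=1024 * 1024):
    """Copy bytes [start, end) from a good backup into the damaged file in place."""
    with open(backup_path, "rb") as src, open(damaged_path, "r+b") as dst:
        src.seek(start)
        dst.seek(start)
        remaining = end - start
        while remaining > 0:
            buf = src.read(min(chunk, remaining))
            if not buf:  # backup is shorter than expected; stop early
                break
            dst.write(buf)
            remaining -= len(buf)
        dst.flush()
        os.fsync(dst.fileno())  # make sure the new data actually reaches the disk
```

For a bad block 200 MiB into a 1000 MiB file, `overwrite_window` gives the range from 72 MiB to 328 MiB. Only the backup is read, so the unreadable sector is never touched on the read side.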

A bad sector in a metadata block is fatal, and btrfs can only survive such an event with dup or raid1* metadata to replace the lost metadata in the bad sector. btrfs will rewrite the metadata block from the surviving mirror copy, which will trigger write remapping in the drive. This is automatic and you should not need to do anything.
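To check whether metadata has a second copy to recover from, one can look at the profile reported by `btrfs filesystem df <mountpoint>`. The parser below is a minimal sketch that assumes the usual `Type, PROFILE: total=..., used=...` line layout; the function names are hypothetical, and the redundant set covers only the "dup or raid1*" profiles named above.

```python
import re

# Profiles named in the text as able to survive a bad metadata sector.
REDUNDANT = {"DUP", "RAID1", "RAID1C3", "RAID1C4"}

# Matches lines such as "Metadata, RAID1: total=1.00GiB, used=500.00MiB"
_LINE = re.compile(r"^\s*(\w+),\s*(\w+):")


def parse_profiles(df_output):
    """Map block group type (Data, Metadata, System, ...) to its profile."""
    profiles = {}
    for line in df_output.splitlines():
        match = _LINE.match(line)
        if match:
            # If a type appears twice (e.g. mid-conversion), the last line wins.
            profiles[match.group(1)] = match.group(2)
    return profiles


def metadata_is_redundant(df_output):
    """True if the Metadata profile is dup or one of the raid1 variants."""
    return parse_profiles(df_output).get("Metadata", "").upper() in REDUNDANT
```

A setup like the one in this thread (single data, RAID1 metadata) would report redundant metadata but no protection for file contents.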

Zygo, Sep 02 '20 00:09

Many thanks for your answer @Zygo! I read it a long time ago; I should have thanked you earlier.

BTW I have had no issues since the file was deleted; I guess the drive firmware handled it as you explained.

Seb35, Jun 14 '21 20:06