
Badblocks fails on really large disks, can take e2fsck down as well

Open hamishmb opened this issue 2 years ago • 3 comments

Hi there,

I'm running Linux Mint 20.3 with e2fsprogs 1.45.5-2ubuntu1. I realise this isn't the latest version, so hopefully this is a bug you've already found and fixed. I couldn't see an easy way to check, and couldn't find anything that looked relevant in a quick search of the commit history since early 2020 (Mint 20.3 is based on Ubuntu 20.04 LTS).

Since getting an 8TB USB backup hard drive, I've found that when I run badblocks on it periodically, I have to specify a bigger block size (-b 8192) for it to work.

Otherwise I receive the error:

badblocks: Value too large for defined data type invalid end block (7814023168): must be 32-bit value
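The numbers seem to line up with a 32-bit block counter and badblocks' default 1 KiB block size. As a rough illustration (my own back-of-the-envelope arithmetic, not anything taken from the badblocks source):

```c
/* Rough check of why -b 8192 avoids the 32-bit overflow on an 8TB drive.
 * The sizes are assumptions for illustration, not read from the device. */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    /* ~8 TB, derived from the end block in the error message above */
    unsigned long long disk_bytes = 7814023168ULL * 1024ULL;
    unsigned int block_sizes[] = { 1024, 4096, 8192 };  /* 1024 is badblocks' default -b */

    for (int i = 0; i < 3; i++) {
        unsigned long long blocks = disk_bytes / block_sizes[i];
        printf("-b %-5u -> %llu blocks (%s 32-bit)\n",
               block_sizes[i], blocks,
               blocks > UINT32_MAX ? "exceeds" : "fits in");
    }
    return 0;
}
```

At the default 1024-byte block size the end block is 7814023168, which doesn't fit in 32 bits; at 4096 or 8192 bytes per block it does, which matches what I'm seeing (I've only actually tested -b 8192).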

In and of itself, this isn't a huge issue. But when I run e2fsck on the drive to check the ext4 file system, and use the bad-sector check option so that it launches badblocks, e2fsck just hangs forever at that point, with the drive making a continuous brrrrrrrrrr sound, as if it is trying to launch badblocks over and over again without checking the exit value.

So I suggest that e2fsck should check and handle this return value (I can provide the specific value by running it again if you need me to), and badblocks should use 64-bit values for the number of blocks in a disk.
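I don't know how e2fsck actually invokes badblocks internally, so the following is only a hypothetical sketch of the kind of check I mean (the helper name and the way the command is spawned are made up for illustration, not taken from the e2fsck source): give up when the spawned badblocks run reports failure, instead of retrying it.

```c
/* Hypothetical sketch: abandon the bad-block scan if badblocks fails,
 * rather than relaunching it. Not based on the actual e2fsck code. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

static int run_badblocks_once(const char *device)
{
    char cmd[256];
    snprintf(cmd, sizeof(cmd), "badblocks -b 8192 %s", device);

    int status = system(cmd);  /* real code would use its own spawning logic */
    if (status == -1 || !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        fprintf(stderr, "badblocks failed on %s (status %d); skipping bad-block scan\n",
                device, status);
        return -1;             /* caller should give up, not retry */
    }
    return 0;
}

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <device>\n", argv[0]);
        return 1;
    }
    return run_badblocks_once(argv[1]) ? 1 : 0;
}
```

The point is only that a non-zero exit status (or a failure to spawn at all) should stop the bad-block scan rather than loop on it.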

hamishmb avatar Apr 19 '22 14:04 hamishmb

Perhaps also worth mentioning: it probably shouldn't be assumed that disks have 512-byte sectors any more, since a lot of them use 4K sectors now.

hamishmb avatar May 05 '22 16:05 hamishmb

What I will probably do in the next major release is to deprecate the e2fsck -c and mke2fs -c options. The reason for this is something I explained a year or so ago:

I will say that for modern disks, the usefulness of badblocks has decreased significantly over time. That's because for modern-sized disks, it can often take more than 24 hours to do a full read on the entire disk surface --- and the factory testing done by HDD manufacturers is far more comprehensive.

In addition, SMART (see the smartctl package) is a much more reliable and efficient way of judging disk health.

The badblocks program was written over two decades ago, before the days of SATA and even IDE disks, when disk controllers and HDDs were far more primitive. These days, modern HDDs and SSDs do their own bad block redirection from a built-in bad block sparing pool, and the usefulness of badblocks has decreased significantly.

https://www.spinics.net/lists/linux-ext4/msg76847.html

If someone wants to send patches to make badblocks work better on large disks, including automatically reading the physical block size and using it to optimize how it works, that's great. But it's not high priority for me.
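For anyone who wants to pick this up: the physical and logical sector sizes and the device size can be read with the BLKPBSZGET, BLKSSZGET and BLKGETSIZE64 ioctls. A rough, untested sketch of what that query could look like (not code from the e2fsprogs tree):

```c
/* Rough sketch of querying a block device's sector sizes and size in
 * bytes, so a badblocks-like tool could pick a sensible default block
 * size. Untested illustration, not code from the e2fsprogs tree. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <block-device>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    unsigned int phys = 0, logical = 0;
    unsigned long long bytes = 0;

    if (ioctl(fd, BLKPBSZGET, &phys) < 0)      /* physical sector size */
        perror("BLKPBSZGET");
    if (ioctl(fd, BLKSSZGET, &logical) < 0)    /* logical sector size */
        perror("BLKSSZGET");
    if (ioctl(fd, BLKGETSIZE64, &bytes) < 0)   /* device size in bytes */
        perror("BLKGETSIZE64");

    printf("physical sector: %u bytes, logical sector: %u bytes, size: %llu bytes\n",
           phys, logical, bytes);
    printf("blocks at physical sector size: %llu\n",
           phys ? bytes / phys : 0);

    close(fd);
    return 0;
}
```

For an 8TB drive with 4096-byte physical sectors that works out to roughly 1.95 billion blocks, which happens to fit in 32 bits, so even just defaulting to the physical sector size would sidestep the error reported above; genuinely huge devices would still need 64-bit block counts.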

tytso avatar May 08 '22 23:05 tytso

Okay, that seems fair enough.

At any rate, it'd be good to keep badblocks around in its current form even if it doesn't change further. I have a use case for it: I run the read-write test once a year on any HDDs holding important data, after some of them apparently demagnetised last year.

hamishmb avatar May 15 '22 08:05 hamishmb