
`dd`: `status=progress` can be slow if `ibs`, `obs`, or `bs` are large

Open ndd7xv opened this issue 3 years ago • 1 comments

When running coreutils' dd status=progress with ibs, obs, or bs set to a large value (>=1G), updating the progress takes (potentially much) longer than a second. GNU's dd status=progress consistently updates every second for me (Fedora 34 x86_64).

The time it takes coreutils' dd to update the progress correlates with how large ibs/obs/bs is. For me, running

./target/release/dd status=progress bs=3G < /dev/zero > /dev/null

takes around 3-3.5 seconds for the progress bar to update, but running

./target/release/dd status=progress bs=20G < /dev/zero > /dev/null

takes 10-15 seconds before a progress bar even shows up, at which point the progress bar continues updating at that interval.

ndd7xv avatar Feb 07 '22 22:02 ndd7xv

I looked into this, and there are two things impacting it. One thing to keep in mind is that both Coreutils dd and GNU dd only report progress after a block has been written to the output, so the difference comes down to the fact that Coreutils dd takes longer to write blocks.

First, Coreutils dd is slower than GNU dd by a significant margin, especially with large blocks. I've started to take steps to correct that in #3600, but other improvements will require a bit more effort.

Second, GNU dd actually uses a maximum block size of 2G, though that may depend on the system. It silently falls back to that block size even if you tell it to use 20G, whereas Coreutils dd today does not. In my testing, /dev/zero would only give me slightly less than 2G per read anyway. Since you are not using iflag=fullblock, what actually happens with the large bs=20G copy is that each block reads and copies only about 2G of data. Coreutils dd nevertheless allocates a full 20G and then discards it each time, whereas GNU dd just needs to allocate 2G.
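
To make the short-read point concrete, here is a minimal Rust sketch. It is not taken from the Coreutils dd source: `CappedReader` and `read_one_block` are hypothetical names, and the sizes in the usage note are scaled down so it runs instantly. It only illustrates that a single read without a fullblock-style retry loop can fill just part of a buffer that was allocated at the full requested size.

```rust
use std::io::{self, Read};

/// Hypothetical reader that returns at most `cap` bytes per `read` call,
/// standing in for a source that gives short reads.
struct CappedReader<R> {
    inner: R,
    cap: usize,
}

impl<R: Read> Read for CappedReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Only offer the inner reader the first `cap` bytes of the buffer.
        let n = buf.len().min(self.cap);
        self.inner.read(&mut buf[..n])
    }
}

/// Allocates a full `bs`-byte buffer, then issues a single `read`
/// (no retry loop, i.e. no fullblock behaviour).
/// Returns (bytes allocated, bytes actually read).
fn read_one_block<R: Read>(src: &mut R, bs: usize) -> io::Result<(usize, usize)> {
    let mut buf = vec![0u8; bs];
    let got = src.read(&mut buf)?;
    Ok((buf.len(), got))
}
```

Wrapping `std::io::repeat(0)` in a `CappedReader` with `cap: 2048` and calling `read_one_block(&mut src, 20 * 1024)` allocates 20480 bytes but reads only 2048, which is the same ratio as the 20G allocation versus the roughly 2G read described above.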

The concrete work item here is to implement a maximum block size for dd.
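
A minimal sketch of what such a cap could look like follows. The names `MAX_BLOCK_SIZE` and `effective_block_size` are hypothetical, and the value of the constant is an assumption: it is the most a single read(2) call transfers on Linux, which matches the "slightly less than 2G" observation, but the limit GNU dd actually applies has not been confirmed here.

```rust
/// Assumed cap: Linux transfers at most 0x7fff_f000 bytes per read(2) call.
/// Whether GNU dd uses this exact value is not confirmed.
const MAX_BLOCK_SIZE: u64 = 0x7fff_f000;

/// Hypothetical helper: the block size to allocate for a requested `bs`.
/// Requests above the cap are silently reduced; smaller ones pass through.
fn effective_block_size(requested: u64) -> u64 {
    requested.min(MAX_BLOCK_SIZE)
}
```

With this in place, a request for bs=20G would allocate `effective_block_size(20 << 30)` bytes, i.e. just under 2G, while small block sizes such as 512 are left unchanged.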

I wasn't able to find a reason for that max block size in the GNU dd docs, and I haven't looked at the code because I haven't checked if we need to be concerned about licensing.

arcuru avatar Jun 07 '22 18:06 arcuru