
Slow compressed writes and high CPU usage

Open brauliobo opened this issue 1 year ago • 3 comments

While rsyncing a backup to slow WD 5TB external drives, I see a single-threaded flush operation pinned at 100% CPU almost the whole time.

The compression used is zstd:10 on one drive and zstd:15 on the other.

On btrfs with compression enabled, I instead see an occasional flush with multi-threaded CPU usage across the compressed-write kthreads. It usually lasts just a few seconds rather than occupying one core constantly.

Also, it seems bcachefs doesn't test for compressibility and blindly compresses everything, even when it yields no gain. For instance, media files (audio, video, and images) aren't compressible, nor, obviously, are already-compressed files.

So a lot of CPU time could be saved by compressing only data that actually compresses.
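For illustration, a cheap compressibility test could look like the sketch below (hypothetical userspace code, not bcachefs internals; the looks_compressible() name and the 7.5 bits/byte threshold are assumptions): estimate the Shannon entropy of a small sample and skip compression for near-random data such as media or archives.

```c
/* Hypothetical sketch, not bcachefs code: estimate compressibility of a
 * 4 KiB sample via its Shannon entropy; near-random data (media files,
 * archives) sits close to 8 bits/byte and isn't worth compressing. */
#include <math.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static bool looks_compressible(const uint8_t *buf, size_t len)
{
	unsigned counts[256] = { 0 };
	double entropy = 0.0;

	for (size_t i = 0; i < len; i++)
		counts[buf[i]]++;

	for (int b = 0; b < 256; b++) {
		if (!counts[b])
			continue;
		double p = (double)counts[b] / len;
		entropy -= p * log2(p);
	}

	/* ~7.5+ bits/byte usually means incompressible; threshold is
	 * illustrative only. */
	return entropy < 7.5;
}
```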

Using Linux 6.11.3 on Arch Linux

brauliobo avatar Oct 21 '24 14:10 brauliobo

Yeah, multithreaded compression is on the wishlist; filetype detection would be good too
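For reference, userspace libzstd already ships a built-in worker pool, so a rough sketch of what multithreaded compression looks like there is below. The in-kernel zstd interface bcachefs uses does not expose this, so this is illustrative only, and compress_mt() is a made-up helper name.

```c
/* Userspace-only sketch of multithreaded zstd via libzstd's worker pool;
 * purely illustrative of the wished-for behaviour. */
#include <zstd.h>

static size_t compress_mt(void *dst, size_t dst_cap,
			  const void *src, size_t src_len,
			  int level, int nthreads)
{
	ZSTD_CCtx *cctx = ZSTD_createCCtx();
	size_t written;

	ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, level);
	/* Only takes effect if libzstd was built with multithread support. */
	ZSTD_CCtx_setParameter(cctx, ZSTD_c_nbWorkers, nthreads);

	written = ZSTD_compress2(cctx, dst, dst_cap, src, src_len);
	ZSTD_freeCCtx(cctx);
	return written; /* caller should check with ZSTD_isError() */
}
```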

koverstreet avatar Oct 29 '24 02:10 koverstreet

Just to add my support for the idea of making compression threaded, or just increasing its throughput: I'm using not-so-strong CPUs that get bogged down by any compression level above 1. The cores look stuck in 'I/O wait' rather than busy with compression, but from some testing (although I haven't looked at the source code) it seems htop may be accounting some bcachefs kthread work as 'I/O wait' instead of kernel time, because it all happens inside the driver while waiting for the compression algorithm to finish. So I feel there's A LOT of performance to gain here: with compression you can get better throughput than with bare I/O if the software side can compress/decompress faster than the underlying hardware writes (HDDs especially, of course). For example, at a 2:1 compression ratio, a 150 MB/s HDD can effectively absorb 300 MB/s of data as long as the CPU keeps up.

elmystico avatar Mar 12 '25 19:03 elmystico

The simple compression-detection logic used by btrfs:

Files with already compressed data or with data that won’t compress well with the CPU and memory constraints of the kernel implementations are using a simple decision logic. If the first portion of data being compressed is not smaller than the original, the compression of the whole file is disabled. Unless the filesystem is mounted with compress-force in which case btrfs will try compressing every block, falling back to storing the uncompressed version for each block that ends up larger after compression. This is not optimal and subject to optimizations and further development.

Incompressible data - Compression — BTRFS documentation
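A minimal sketch of that quoted policy, assuming zstd and a hypothetical should_compress_file() helper (not actual btrfs code): trial-compress the first block, and if it doesn't shrink, disable compression for the whole file.

```c
/* Illustrative sketch of the quoted btrfs policy, not actual btrfs code:
 * trial-compress the first block; if it doesn't shrink, mark the whole
 * file as not worth compressing. */
#include <stdbool.h>
#include <zstd.h>

#define FIRST_BLOCK 4096

static bool should_compress_file(const void *first_block, size_t len,
				 int level)
{
	char out[ZSTD_COMPRESSBOUND(FIRST_BLOCK)];
	size_t n = len < FIRST_BLOCK ? len : FIRST_BLOCK;
	size_t csize = ZSTD_compress(out, sizeof(out), first_block, n, level);

	/* Not smaller (or error): skip compression for this file, unless
	 * the equivalent of compress-force is in effect. */
	return !ZSTD_isError(csize) && csize < n;
}
```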

ttimasdf avatar Aug 08 '25 07:08 ttimasdf