
Performance degradation in recent versions.

Open 0xfk0 opened this issue 3 months ago • 6 comments

When I first installed bcachefs 1.5 years ago, I got performance of about 1 GB/s for file reading and writing (with 8 MB blocks and the O_DIRECT flag).

What I have now:

sysop@saturn:~/llm/llama.cpp/build$ dd if=../../granite-4.0-h-small-GGUF/granite-4.0-h-small-UD-Q8_K_XL+.gguf of=/dev/null iflag=direct bs=8M  status=progress
37992005632 bytes (38 GB, 35 GiB) copied, 102 s, 372 MB/s

sysop@saturn:~/llm/llama.cpp/build$ dd if=../../granite-4.0-h-small-GGUF/granite-4.0-h-small-UD-Q8_K_XL.gguf of=/dev/null  status=progress
9383866880 bytes (9.4 GB, 8.7 GiB) copied, 194 s, 48.4 MB/s

Hardware is the same.

saturn ~ # lsblk 
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda           8:0    0   2.7T  0 disk 
└─sda1        8:1    0     2T  0 part 
sdb           8:16   0   2.7T  0 disk 
└─sdb1        8:17   0     2T  0 part 
sdc           8:32   0   2.7T  0 disk 
└─sdc1        8:33   0     2T  0 part 
sdd           8:48   0   2.7T  0 disk 
└─sdd1        8:49   0     2T  0 part 
...
nvme0n1     259:0    0 931.5G  0 disk 
├─nvme0n1p1 259:1    0   512G  0 part /.bcachefs
nvme1n1     259:3    0 931.5G  0 disk 
├─nvme1n1p1 259:4    0   512G  0 part 

NVMe disks: Samsung SSD 970 EVO Plus 1TB
Hard disks: ST3000DM007-1WY10G

I checked SMART and see neither errors nor overheating.

Before the tests I dropped the file from the page cache with the command:

$ dd if=../../granite-4.0-h-small-GGUF/granite-4.0-h-small-UD-Q8_K_XL.gguf iflag=nocache count=0
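Note that `nocache` is only an advisory `POSIX_FADV_DONTNEED` hint; for a guaranteed cold cache you can also drop all caches globally (root required, affects the whole system). A sketch, with a placeholder file name:

```shell
# per-file advisory drop, as above (path is a placeholder):
dd if=./model.gguf iflag=nocache count=0 2>/dev/null

# global, guaranteed drop of clean page cache, dentries and inodes (root):
sync
echo 3 > /proc/sys/vm/drop_caches
```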

The files reside on the NVMe disks (I can see with the "dstat" program that during the tests data is read only from the NVMe disks).

File system information:

saturn ~ # uname -a

Linux saturn 6.12.9-x86_64 #1 SMP PREEMPT_DYNAMIC Wed Jan 15 12:20:37 -00 2025 x86_64 Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz GenuineIntel GNU/Linux

saturn ~ # bcachefs show-super /dev/sda1

External UUID:                              432bf02b-93bb-47fb-b0a1-779d35ee44f9
Internal UUID:                              56d67147-d722-484c-832e-7c454f1caeee
Magic number:                               c68573f6-66ce-90a9-d96a-60cf803df7ef
Device index:                               2
Label:                                      
Version:                                    1.13: (unknown version)
Version upgrade complete:                   1.13: (unknown version)
Oldest version on disk:                     1.3: rebalance_work
Created:                                    Thu Feb 22 17:24:06 2024
Sequence number:                            277
Time of last write:                         Thu Oct  9 01:39:18 2025
Superblock size:                            6152
Clean:                                      0
Devices:                                    6
Sections:                                   members_v1,crypt,replicas_v0,disk_groups,clean,journal_seq_blacklist,journal_v2,counters,members_v2,errors,ext,downgrade
Features:                                   journal_seq_blacklist_v3,reflink,new_siphash,inline_data,new_extent_overwrite,btree_ptr_v2,extents_above_btree_updates,btree_updates_journalled,reflink_inline_data,new_varint,journal_no_flush,alloc_v2,extents_across_btree_nodes
Compat features:                            alloc_info,alloc_metadata,extents_above_btree_updates_done,bformat_overflow_done

Options:
  block_size:                               4.00 KiB
  btree_node_size:                          256 KiB
  errors:                                   continue [ro] panic 
  metadata_replicas:                        2
  data_replicas:                            2
  metadata_replicas_required:               1
  data_replicas_required:                   1
  encoded_extent_max:                       64.0 KiB
  metadata_checksum:                        none [crc32c] crc64 xxhash 
  data_checksum:                            none crc32c crc64 [xxhash] 
  compression:                              none
  background_compression:                   none
  str_hash:                                 crc32c crc64 [siphash] 
  metadata_target:                          none
  foreground_target:                        sd
  background_target:                        hd
  promote_target:                           sd
  erasure_code:                             0
  inodes_32bit:                             1
  shard_inode_numbers:                      1
  inodes_use_key_cache:                     1
  gc_reserve_percent:                       8
  gc_reserve_bytes:                         0 B
  root_reserve_percent:                     0
  wide_macs:                                0
  acl:                                      1
  usrquota:                                 0
  grpquota:                                 0
  prjquota:                                 0
  journal_flush_delay:                      1000
  journal_flush_disabled:                   0
  journal_reclaim_delay:                    100
  journal_transaction_names:                1
  version_upgrade:                          [compatible] incompatible none 
  nocow:                                    0

members_v2 (size 880):
Device:                                     0
  Label:                                    0 (1)
  UUID:                                     15938775-b454-432a-8c23-b214a99d5353
  Size:                                     512 GiB
  read errors:                              0
  write errors:                             0
  checksum errors:                          0
  seqread iops:                             0
  seqwrite iops:                            0
  randread iops:                            0
  randwrite iops:                           0
  Bucket size:                              512 KiB
  First bucket:                             0
  Buckets:                                  1048576
  Last mount:                               Thu Oct  9 01:38:43 2025
  Last superblock write:                    277
  State:                                    rw
  Data allowed:                             journal,btree,user
  Has data:                                 journal,btree,user,cached
  Durability:                               1
  Discard:                                  0
  Freespace initialized:                    1
Device:                                     1
  Label:                                    1 (2)
  UUID:                                     c045a562-e947-4327-8789-d62014170e3f
  Size:                                     512 GiB
  read errors:                              0
  write errors:                             0
  checksum errors:                          0
  seqread iops:                             0
  seqwrite iops:                            0
  randread iops:                            0
  randwrite iops:                           0
  Bucket size:                              512 KiB
  First bucket:                             0
  Buckets:                                  1048576
  Last mount:                               Thu Oct  9 01:38:43 2025
  Last superblock write:                    277
  State:                                    rw
  Data allowed:                             journal,btree,user
  Has data:                                 journal,btree,user,cached
  Durability:                               1
  Discard:                                  0
  Freespace initialized:                    1
Device:                                     2
  Label:                                    a (4)
  UUID:                                     2d013780-3660-4d95-8ac5-e0d57955a44b
  Size:                                     2.00 TiB
  read errors:                              0
  write errors:                             0
  checksum errors:                          0
  seqread iops:                             0
  seqwrite iops:                            0
  randread iops:                            0
  randwrite iops:                           0
  Bucket size:                              512 KiB
  First bucket:                             0
  Buckets:                                  4194304
  Last mount:                               Thu Oct  9 01:38:43 2025
  Last superblock write:                    277
  State:                                    rw
  Data allowed:                             journal,btree,user
  Has data:                                 journal,user,cached
  Durability:                               1
  Discard:                                  0
  Freespace initialized:                    1
Device:                                     3
  Label:                                    b (5)
  UUID:                                     37a5923c-a3b5-4661-9735-133c4e0a6b5b
  Size:                                     2.00 TiB
  read errors:                              7
  write errors:                             2423311
  checksum errors:                          0
  seqread iops:                             0
  seqwrite iops:                            0
  randread iops:                            0
  randwrite iops:                           0
  Bucket size:                              512 KiB
  First bucket:                             0
  Buckets:                                  4194304
  Last mount:                               Thu Oct  9 01:38:43 2025
  Last superblock write:                    277
  State:                                    rw
  Data allowed:                             journal,btree,user
  Has data:                                 journal,user
  Durability:                               1
  Discard:                                  0
  Freespace initialized:                    1
Device:                                     4
  Label:                                    c (6)
  UUID:                                     56e6c056-443b-4839-81c2-7685d9d89d37
  Size:                                     2.00 TiB
  read errors:                              0
  write errors:                             0
  checksum errors:                          0
  seqread iops:                             0
  seqwrite iops:                            0
  randread iops:                            0
  randwrite iops:                           0
  Bucket size:                              512 KiB
  First bucket:                             0
  Buckets:                                  4194304
  Last mount:                               Thu Oct  9 01:38:43 2025
  Last superblock write:                    277
  State:                                    rw
  Data allowed:                             journal,btree,user
  Has data:                                 journal,user,cached
  Durability:                               1
  Discard:                                  0
  Freespace initialized:                    1
Device:                                     5
  Label:                                    d (7)
  UUID:                                     7d4af9fd-0738-4295-962f-937a42f98ecc
  Size:                                     2.00 TiB
  read errors:                              0
  write errors:                             0
  checksum errors:                          0
  seqread iops:                             0
  seqwrite iops:                            0
  randread iops:                            0
  randwrite iops:                           0
  Bucket size:                              512 KiB
  First bucket:                             0
  Buckets:                                  4194304
  Last mount:                               Thu Oct  9 01:38:43 2025
  Last superblock write:                    277
  State:                                    rw
  Data allowed:                             journal,btree,user
  Has data:                                 journal,user,cached
  Durability:                               1
  Discard:                                  0
  Freespace initialized:                    1

errors (size 40):
ptr_to_missing_backpointer                  46861282        Wed Jan 15 15:39:16 2025
inode_unreachable                           2               Wed Jan 15 14:37:58 2025

saturn ~ # free

               total        used        free      shared  buff/cache   available
Mem:       131749572    10652672    37680424      100452    83416476   119743896
Swap:      268435452         256   268435196

saturn ~ # vmstat

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 3  0    256 37530324   5352 83561264    0    0    51   100    6    5  1  0 99  0  0

NVMe disk performance:

saturn ~ # dd if=/dev/nvme0n1 of=/dev/null bs=16k iflag=direct status=progress
6775668736 bytes (6.8 GB, 6.3 GiB) copied, 6 s, 1.1 GB/s^C

saturn ~ # dd if=/dev/nvme0n1 of=/dev/null bs=128k iflag=direct status=progress
14116192256 bytes (14 GB, 13 GiB) copied, 6 s, 2.4 GB/s^C

saturn ~ # dd if=/dev/nvme0n1 of=/dev/null bs=1M iflag=direct status=progress
16934502400 bytes (17 GB, 16 GiB) copied, 7 s, 2.4 GB/s^C

saturn ~ # dd if=/dev/nvme0n1 of=/dev/null bs=8M iflag=direct status=progress
16181624832 bytes (16 GB, 15 GiB) copied, 7 s, 2.3 GB/s^C

If I put a file on an ext4 filesystem on a single NVMe drive, I get the following results:

saturn ~ # dd if=/mnt/granite-4.0-h-small-UD-Q8_K_XL.gguf iflag=nocache count=0
0+0 records in
0+0 records out
0 bytes copied, 2.686e-05 s, 0.0 kB/s
saturn ~ # sync
saturn ~ # dd if=/mnt/granite-4.0-h-small-UD-Q8_K_XL.gguf iflag=direct of=/dev/null bs=8M status=progress
37312528384 bytes (37 GB, 35 GiB) copied, 12 s, 3.1 GB/s
4538+1 records in
4538+1 records out
38071217056 bytes (38 GB, 35 GiB) copied, 12.2438 s, 3.1 GB/s

Feel the difference, 300MB/s vs 3GB/s!

What I previously found:

  1. Performance on unencrypted disks is much better (with bcachefs), but unencrypted disks are not an option in the modern world... I see no reason for this (CPU load << 100%) other than poor communication between the block device level, the encryption layer, and the filesystem layer. dm-crypt works much better, but doesn't provide integrity checking.

  2. dm-integrity is unusable and extremely slow (this is the reason why dm-raid is completely useless);

  3. btrfs has no encryption, but can be used in conjunction with dm-raid and dm-crypt (btrfs providing the integrity): this makes it possible to build only RAID01 arrays, not RAID10, and such a configuration has an issue with unequal disk load (hence performance degradation);

  4. dm-cache does not cache recently accessed files, which is what I want; bcachefs is perfect at that.

  5. ZFS eats all of the memory and doesn't provide a write cache.

  6. Hardware encryption in NVMe SSDs is bullshit (you can just reset the computer, boot it from other media, and find that the disks are unencrypted!)

I can't just replace bcachefs with another system, but bcachefs is almost unusable in its current state too. :-( Generally, I want to use cheap low-performance consumer-grade HDDs for storage and NVMe for caching recent files, plus encryption, integrity (disks might corrupt data) and RAID. Switching to bcache + dm-raid + dm-crypt + btrfs is not an option either (RAID01, unequal disk load).

As I said before, I believe the low performance is caused not by the hardware or by high load, but by poor communication between the distinct layers (filesystem, RAID, encryption, block device). The same situation exists with the dm-integrity module.

I think that for working with AI (LLMs), bcachefs needs improvements at least in the area of reading files in large blocks. Currently I have to copy files not with cp, but with dd iflag=direct, because cp is six times slower! There is no such problem with ext4, btrfs, etc. There is a problem in the communication between bcachefs and the page cache.
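A direct-IO copy of the kind described can be sketched with dd alone (filenames are placeholders; O_DIRECT fails on filesystems that don't support it):

```shell
# Copy bypassing the page cache on both the read and the write side;
# bs must stay a multiple of the block size for O_DIRECT to work.
dd if=src.gguf of=dst.gguf bs=8M iflag=direct oflag=direct conv=fsync status=progress
```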

And even with the O_DIRECT flag, the performance I see today is three times lower than when I initially switched to bcachefs 1.5 years ago (that was the first mainline kernel version with bcachefs support). I would never have started using bcachefs if I had seen such poor performance initially (I have two NVMe drives in parallel and expect 5 GB/s, but see performance comparable to a consumer-grade hard disk!)

I want to ask:

  1. Maybe bcachefs has some knobs for fine tuning, and performance can be improved?
  2. How can I profile the performance issues in my case? Linux perf, or something else?
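On the profiling question: since CPU load is well below 100%, time is probably spent waiting on IO rather than computing, so both an on-CPU profile and block-layer tracing are worth a look. A sketch, assuming perf is available and using a placeholder file name:

```shell
# On-CPU profile of the reader (shows checksum/decryption cost, if any):
perf record -g -- dd if=model.gguf of=/dev/null bs=8M iflag=direct
perf report --stdio | head -40

# Block-layer request latency: pair issue and completion events system-wide:
perf record -e block:block_rq_issue -e block:block_rq_complete -a -- sleep 10
perf script | head -20
```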

I know my version is too old by now, but I'm afraid of upgrading: there will be no way to downgrade (or I would need to back up disk images), and on another computer with a 6.15 kernel I see even worse performance and significant delays (a few seconds!) on some disk operations. I see no such problems with 6.12. :-(

0xfk0 avatar Oct 11 '25 09:10 0xfk0

btrfs has no encryption, but can be used in conjunction with dm-raid and dm-crypt (btrfs is for integrity): this makes possible only building RAID01 arrays, but not RAID10, and such configuration has an issue with unequal disk load (so performance degradation)

It can also be used with just dm-crypt, leaving integrity and RAID to btrfs, which supports devices of unequal sizes. After all, putting btrfs on top of RAID will tell you if there is corruption, but since btrfs does not see the individual devices, it can't correct anything, so it's pretty much useless.

neVERberleRfellerER avatar Oct 11 '25 17:10 neVERberleRfellerER

I found an interesting thing. When I read a file from disk (2x NVMe and 4x HDD), the dstat program shows activity on all six disks. But the file is definitely fully cached on the NVMe drives.

I can guess that the low performance is caused by the fact that bcachefs internally needs to synchronize IO across all disks, and the HDDs cause large delays. I think this should not happen: either the HDDs must not be used in this particular case, or the synchronization should not cause waiting.

$ dstat --bw -D sda,sdb,sdc,sdd,nvme0n1,nvme1n1
--total-cpu-usage-- --dsk/sda-----dsk/sdb-----dsk/sdc-----dsk/sdd---dsk/nvme0n1-dsk/nvme1n1 -net/total->
usr sys idl wai stl| read  writ: read  writ: read  writ: read  writ: read  writ: read  writ| recv  send>
  4  17  79   0   0|4644k    0 :3492k    0 :5200k    0 :4408k    0 : 221M 2428k: 142M 2428k|7630B  378B>
  2  17  81   0   0|4244k    0 :4436k    0 :4892k    0 :4432k    0 : 216M 3952k: 149M 3952k|   0     0 >
  3  17  81   0   0|5204k    0 :3648k    0 :5436k    0 :4100k    0 : 214M 5096k: 154M 5096k|6802B  324B>
  3  17  81   0   0|4140k    0 :4440k    0 :4612k    0 :5232k    0 : 197M 2712k: 170M 2712k| 237B  172B>
  6  17  77   0   0|4660k    0 :4608k    0 :4408k    0 :4604k    0 : 203M 6940k: 166M 6940k|7946B  324B>
  4  19  77   0   0|4564k    0 :4420k    0 :4880k    0 :4508k    0 : 231M   11M: 150M   11M|   0     0 >
  3  17  81   0   0|4512k    0 :4584k    0 :5380k    0 :4228k    0 : 201M 2328k: 175M 2328k|6949B  324B>
  3  16  81   0   0|5416k    0 :3972k    0 :4180k    0 :3612k    0 : 194M 2168k: 171M 2168k|   0     0 >
  3  17  80   0   0|4276k    0 :3396k    0 :3772k    0 :5012k    0 : 211M 2276k: 160M 2276k|2556B  108B>
  4  16  79   0   0|4436k    0 :5004k    0 :4812k    0 :4144k    0 : 156M 2364k: 203M 2364k|6126B  312B>
  5  17  78   0   0|4700k    0 :4952k    0 :5320k    0 :5160k    0 : 183M 2212k: 186M 2212k|  10k  432B>
  2  18  80   0   0|4920k    0 :4292k    0 :4668k    0 :5320k    0 : 212M 2056k: 181M 2056k|   0     0 >
  3  17  80   0   0|4256k    0 :4364k    0 :5368k    0 :4736k    0 : 194M 2828k: 185M 2828k|  13k 7887B>
  4  17  79   0   0|3864k    0 :3956k    0 :5080k    0 :5224k    0 : 191M 3472k: 173M 3472k| 810k   16k>
  3  17  80   0   0|4044k    0 :4320k    0 :4876k    0 :3680k    0 : 209M 3160k: 163M 3160k|8958B  432B>
  6  18  75   0   0|4668k    0 :4160k    0 :4192k    0 :3548k    0 : 225M 2324k: 152M 2324k|   0     0 >
  5  18  77   0   0|4116k    0 :5364k    0 :4728k    0 :4020k    0 : 206M 2320k: 167M 2320k|  10k  486B>
  7  17  76   0   0|3740k    0 :3900k    0 :5096k    0 :5352k    0 : 181M 2260k: 191M 2260k|  40B   54B>
  5  17  78   0   0|4924k    0 :4072k    0 :4336k    0 :4312k    0 : 212M 2116k: 142M 2116k|2422B  292B>
  4  17  79   0   0|4716k    0 :4828k    0 :4776k    0 :4640k    0 : 241M 3056k: 129M 3056k|6221B  324B>
  4  17  80   0   0|5264k    0 :4808k    0 :5996k    0 :4872k    0 : 210M 2148k: 155M 2148k|8195B  378B>
  3  16  81   0   0|4688k    0 :4160k    0 :4620k    0 :4636k    0 : 197M 2172k: 156M 2172k|   0     0 >
  4  18  79   0   0|3680k    0 :3484k    0 :3928k    0 :5156k    0 : 179M 2668k: 203M 2668k|8382B  378B>
  5  18  78   0   0|5444k    0 :4544k    0 :4288k    0 :5688k    0 : 196M 2240k: 180M 2240k|   0     0 >

$ dd if=~/llm/Llama-4-Scout-17B-16E-Instruct-UD-Q5_K_XL-00001-of-00002.gguf of=/dev/null bs=8M iflag=direct status=progress
20023607296 bytes (20 GB, 19 GiB) copied, 53 s, 378 MB/s

0xfk0 avatar Oct 12 '25 10:10 0xfk0

Do you have labels set properly? From your superblock it does not seem to be the case: you have foreground_target: sd, but no device with label sd as far as I can see.

neVERberleRfellerER avatar Oct 12 '25 12:10 neVERberleRfellerER

I found interesting things! See below. First of all, the bcachefs utility prints labels incorrectly; I can see via the /sys/fs interface that the labels are set correctly.

Second, I see "cached" data not only on the SSDs, but on three of the four HDDs too! I think this is the source of the problem! And interestingly, "cached" appears in the "has_data" record (output from the bcachefs utility), but is absent from the "data_allowed" record.

I need to somehow turn off caching on the sda, sdd and sdb devices, but I have not understood how I can do this.

# for x in 0 1 2 3 4 5; do cat /sys/fs/bcachefs/432bf02b-93bb-47fb-b0a1-779d35ee44f9/dev-$x/label; done
sd.0
sd.1
hd.a
hd.b
hd.c
hd.d


# for x in 0 1 2 3 4 5; do cat /sys/fs/bcachefs/432bf02b-93bb-47fb-b0a1-779d35ee44f9/dev-$x/has_data; done
journal,btree,user,cached
journal,btree,user,cached
journal,user,cached
journal,user
journal,user,cached
journal,user,cached

# cat /sys/fs/bcachefs/432bf02b-93bb-47fb-b0a1-779d35ee44f9/options/promote_target 
sd
# cat /sys/fs/bcachefs/432bf02b-93bb-47fb-b0a1-779d35ee44f9/options/foreground_target 
sd
# cat /sys/fs/bcachefs/432bf02b-93bb-47fb-b0a1-779d35ee44f9/options/background_target 
hd
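(These option files are also read-write, so, assuming this behaves the same on my version, a target can be changed on the mounted filesystem at runtime as root, e.g.:

```shell
# as root; writes to the same sysfs files that were read above
echo sd > /sys/fs/bcachefs/432bf02b-93bb-47fb-b0a1-779d35ee44f9/options/promote_target
```

)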

 # bcachefs fs usage
Filesystem: 432bf02b-93bb-47fb-b0a1-779d35ee44f9
Size:                  9103956278272
Used:                  4834737380352
Online reserved:           131825664

Data type       Required/total  Durability    Devices
reserved:       1/2               [] 104837120
btree:          1/2             2             [nvme0n1p1 nvme1n1p1] 71060422656
user:           1/1             1             [sda1]           32028094464
user:           1/1             1             [sdb1]          116915191808
user:           1/2             2             [sdc1 sdd1]     732631154688
user:           1/1             1             [sdd1]           87099154432
user:           1/2             2             [sda1 sdc1]     852578394112
user:           1/2             2             [sdc1 sdb1]     626048761856
user:           1/2             2             [nvme0n1p1 nvme1n1p1] 2209923072
user:           1/2             2             [sda1 sdd1]     736734560256
user:           1/2             2             [sda1 sdb1]     783675981824
user:           1/2             2             [sdd1 sdb1]     793408905216
cached:         1/1             1             [nvme0n1p1]     217723211776
cached:         1/1             1             [nvme1n1p1]     280743837696
cached:         1/1             1             [sda1]                524288
cached:         1/1             1             [sdd1]                450560
cached:         1/1             1             [sdb1]               1695744

hd.a (device 2):                sda1              rw
                                data         buckets    fragmented
  free:                 719183675392         1371734
  sb:                        3149824               7        520192
  journal:                4294967296            8192
  btree:                           0               0
  user:                1218522562560         2814370  257017856000
  cached:                     438272               1         86016
  parity:                          0               0
  stripe:                          0               0
  need_gc_gens:                    0               0
  need_discard:                    0               0
  capacity:            2199023255552         4194304

hd.b (device 3):                sdc1              rw
                                data         buckets    fragmented
  free:                 801368440832         1528489
  sb:                        3149824               7        520192
  journal:                4294967296            8192
  btree:                           0               0
  user:                1105629155328         2657616  287727022080
  cached:                          0               0
  parity:                          0               0
  stripe:                          0               0
  need_gc_gens:                    0               0
  need_discard:                    0               0
  capacity:            2199023255552         4194304

hd.c (device 4):                sdd1              rw
                                data         buckets    fragmented
  free:                 719219851264         1371803
  sb:                        3149824               7        520192
  journal:                4294967296            8192
  btree:                           0               0
  user:                1218486464512         2814302  257018302464
  cached:                          0               0

hd.d (device 5):                sdb1              rw
                                data         buckets    fragmented
  sb:                        3149824               7        520192
  journal:                4294967296            8192
  btree:                           0               0
  user:                1218482016256         2814299  257021177856
  cached:                     229376               1        294912
  parity:                          0               0
  stripe:                          0               0
  need_gc_gens:                    0               0
  need_discard:                    0               0
  capacity:            2199023255552         4194304

sd.0 (device 0):           nvme0n1p1              rw
                                data         buckets    fragmented
  free:                  27855945728           53131
  sb:                        3149824               7        520192
  journal:                4294967296            8192
  btree:                 35530211328          122519   28705030144
  user:                   1104961536            2210      53714944
  cached:               217719443456          862517  234487869440
  parity:                          0               0
  stripe:                          0               0
  need_gc_gens:                    0               0
  need_discard:                    0               0
  capacity:             549755813888         1048576

sd.1 (device 1):           nvme1n1p1              rw
                                data         buckets    fragmented
  free:                  27959754752           53329
  sb:                        3149824               7        520192
  journal:                4294967296            8192
  btree:                 35530211328          122519   28705030144
  user:                   1104961536            2216      56860672
  cached:               280738631680          862312  171361202176
  parity:                          0               0
  stripe:                          0               0
  need_gc_gens:                    0               0
  need_discard:               524288               1
  capacity:             549755813888         1048576

0xfk0 avatar Oct 12 '25 14:10 0xfk0

First of all, you are indeed running an ancient version by bcachefs standards (and your bcachefs-tools version seems even older). The bcachefs project does not really support non-latest versions or backport bugfixes currently, instead focusing on encouraging users to upgrade by making upgrades and downgrades easier and more reliable. So the claim that "there will be no way to downgrade" is most definitely false: unless you enable incompatible upgrades manually, and your FS actually uses an incompatible feature from some later version, downgrades are always supported, both online and offline.

The current bcachefs version (as I'm writing this) is DKMS 1.31.7, supported with kernels 6.16-6.17. A direct upgrade to the latest DKMS version is strongly recommended. If the performance issue still reproduces on the current version, you can join the IRC channel to get help with perf tracing to understand what is going on in detail.

Your last hypothesis about a very small amount of cached data on HDDs is irrelevant, I'm afraid.

himikof avatar Oct 12 '25 15:10 himikof

# sudo zypper info bcachefs-kmp-default 
Version            : 1.32.1_k6.17.7_1-1.1
# bcachefs fs usage -h /storage/
Filesystem: b788b328-fccb-4dc0-a5a1-742cb0aeb7cf
Size:                       56.8 TiB
Used:                       50.0 TiB
Online reserved:            5.01 GiB

Data by durability desired and amount degraded:
          undegraded
2x:         49.9 TiB
cached:      834 GiB
reserved:   51.2 GiB

Device label                   Device      State          Size      Used  Use%
hdd.hdd1 (device 0):           sde         rw         3.64 TiB  2.36 TiB   64%
hdd.hdd2 (device 1):           sdf         rw         9.10 TiB  8.44 TiB   92%
hdd.hdd3 (device 2):           sda         rw         14.6 TiB  13.3 TiB   91%
hdd.hdd4 (device 3):           sdd         rw         7.28 TiB  6.38 TiB   87%
hdd.hdd5 (device 4):           sdb         rw         9.10 TiB  6.85 TiB   75%
hdd.hdd6 (device 5):           sdc         rw         16.4 TiB  12.3 TiB   75%
ssd.ssd1 (device 6):           sdg         rw          894 GiB   536 GiB   59%
ssd.ssd2 (device 7):           sdh         rw          894 GiB   567 GiB   63%

I also notice bad performance on my setup with a recent version. The SSDs are assigned as read and write cache. Most noticeable is the degradation after heavy IO, for example after moving around 200 GiB with mkvmerge. Simple commands like ls or touch get stuck in a syscall and become unkillable for some time after mkvmerge has finished.
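One way to see where such a stuck task is waiting is its kernel stack; a sketch (run as root, with mkvmerge as the example process):

```shell
# kernel stack of the most recently started matching task:
cat /proc/"$(pgrep -n mkvmerge)"/stack

# or dump all uninterruptible (D-state) tasks to the kernel log:
echo w > /proc/sysrq-trigger
dmesg | tail -50
```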

Not related to the IO issue, but reflinks are very slow as well, taking up to over a minute.

FuchtelJockel avatar Dec 01 '25 22:12 FuchtelJockel