Deadlock on `evacuate` [1e194b3a5924]
1e194b3a592499e9ec00325091386aa8f995aea3
dmesg w/ faddr2line https://gist.github.com/ojab/9134d77a5802e12e03297fa02aac352a
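(The frames in that dmesg were decoded with the kernel tree's scripts/faddr2line helper. A rough sketch of the invocation, assuming a kernel source tree at /usr/src/linux and a bcachefs.ko built with debug info; both paths are illustrative, not taken from the report:)

$ cd /usr/src/linux
$ dmesg | awk '$NF == "[bcachefs]" { print $(NF-1) }' | sort -u \
      | xargs ./scripts/faddr2line fs/bcachefs/bcachefs.ko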
$ sudo bcachefs device evacuate /dev/sdd1
Setting /dev/sdd1 readonly
0% complete: current position btree extents:1120691:8
FWIW I've run sudo bcachefs device evacuate /dev/sdc1
multiple times during that, but AFAIR it doesn't affect anything.
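(The kernel-side state below was collected while the evacuate sat at 0%. A minimal sketch of grabbing the same view in one pass; the pgrep pattern is an assumption and will match the sudo wrapper as well, while the debugfs path is the one dumped further down in this report:)

$ for pid in $(pgrep -f 'device evacuate /dev/sdd1'); do echo "== pid $pid =="; sudo cat /proc/$pid/stack; done
$ sudo cat /sys/kernel/debug/bcachefs/360fc60c-8c44-4f3e-9cc4-fbaeee9e7c3b/btree_transactions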
Generic info
Provide the output of:
$ bcachefs fs usage -h /mnt/bcachefs/
Filesystem: 360fc60c-8c44-4f3e-9cc4-fbaeee9e7c3b
Size: 34.8 TiB
Used: 17.5 TiB
Online reserved: 7.13 GiB
Data type Required/total Devices
btree: 1/3 [sde1 sdc1 sdd1] 1.50 MiB
btree: 1/3 [sda1 sde1 sdc1] 92.9 GiB
btree: 1/3 [sda1 sde1 sdd1] 89.5 GiB
user: 1/1 [sdd1] 1.34 GiB
user: 1/1 [sde1] 4.58 TiB
user: 1/2 [sde1 sdc1] 21.7 MiB
user: 1/1 [sda1] 12.2 TiB
user: 1/1 [sdc1] 1.34 GiB
user: 1/2 [sda1 sde1] 476 GiB
user: 1/2 [sde1 sdd1] 22.0 MiB
cached: 1/1 [sde1] 274 MiB
cached: 1/1 [sdd1] 5.00 MiB
cached: 1/1 [sdc1] 4.97 MiB
hdd.hdd1 (device 0): sda1 rw
data buckets fragmented
free: 0 B 11635400
sb: 3.00 MiB 7 508 KiB
journal: 4.00 GiB 8192
btree: 60.8 GiB 126138 811 MiB
user: 12.5 TiB 26112116 137 MiB
cached: 0 B 0
parity: 0 B 0
stripe: 0 B 0
need_gc_gens: 0 B 0
need_discard: 0 B 0
erasure coded: 0 B 0
capacity: 18.1 TiB 37881853
hdd.hdd2 (device 1): sde1 rw
data buckets fragmented
free: 0 B 13820422
sb: 3.00 MiB 4 1020 KiB
journal: 8.00 GiB 8192
btree: 60.8 GiB 63668 1.38 GiB
user: 4.81 TiB 5048309 49.9 MiB
cached: 274 MiB 330
parity: 0 B 0
stripe: 0 B 0
need_gc_gens: 0 B 0
need_discard: 0 B 1
erasure coded: 0 B 0
capacity: 18.1 TiB 18940926
ssd.ssd1 (device 2): sdc1 rw
data buckets fragmented
free: 0 B 406390
sb: 3.00 MiB 7 508 KiB
journal: 1.82 GiB 3726
btree: 31.0 GiB 64045 321 MiB
user: 1.35 GiB 2770 156 KiB
cached: 4.97 MiB 10
parity: 0 B 0
stripe: 0 B 0
need_gc_gens: 0 B 0
need_discard: 0 B 0
erasure coded: 0 B 0
capacity: 233 GiB 476948
ssd.ssd1 (device 3): sdd1 ro
data buckets fragmented
free: 0 B 408351
sb: 3.00 MiB 7 508 KiB
journal: 1.82 GiB 3726
btree: 29.8 GiB 62083 484 MiB
user: 1.35 GiB 2771
cached: 5.00 MiB 10
parity: 0 B 0
stripe: 0 B 0
need_gc_gens: 0 B 0
need_discard: 0 B 0
erasure coded: 0 B 0
capacity: 233 GiB 476948
$ bcachefs show-super /dev/sdc1
External UUID: 360fc60c-8c44-4f3e-9cc4-fbaeee9e7c3b
Internal UUID: bc05affd-9fd1-4eb5-b497-3f7956ac57d2
Device index: 2
Label:
Version: 1.1: snapshot_skiplists
Version upgrade complete: 1.1: snapshot_skiplists
Oldest version on disk: 0.29: snapshot_trees
Created: Fri Jun 16 22:38:16 2023
Sequence number: 247
Superblock size: 5768
Clean: 0
Devices: 4
Sections: members,replicas_v0,quota,disk_groups,clean,journal_seq_blacklist,journal_v2,counters
Features: zstd,journal_seq_blacklist_v3,new_siphash,inline_data,new_extent_overwrite,btree_ptr_v2,extents_above_btree_updates,btree_updates_journalled,new_varint,journal_no_flush,alloc_v2,extents_across_btree_nodes
Compat features: alloc_info,alloc_metadata,extents_above_btree_updates_done,bformat_overflow_done
Options:
block_size: 4.00 KiB
btree_node_size: 256 KiB
errors: continue [ro] panic
metadata_replicas: 3
data_replicas: 1
metadata_replicas_required: 1
data_replicas_required: 1
encoded_extent_max: 64.0 KiB
metadata_checksum: none crc32c [crc64] xxhash
data_checksum: none [crc32c] crc64 xxhash
compression: [none] lz4 gzip zstd
background_compression: none lz4 gzip [zstd]
str_hash: crc32c crc64 [siphash]
metadata_target: ssd
foreground_target: ssd
background_target: hdd
promote_target: ssd
erasure_code: 0
inodes_32bit: 1
shard_inode_numbers: 1
inodes_use_key_cache: 1
gc_reserve_percent: 5
gc_reserve_bytes: 0 B
root_reserve_percent: 0
wide_macs: 0
acl: 1
usrquota: 1
grpquota: 1
prjquota: 1
journal_flush_delay: 1000
journal_flush_disabled: 0
journal_reclaim_delay: 100
journal_transaction_names: 1
version_upgrade: [compatible] incompatible none
nocow: 0
members (size 232):
Device: 0
UUID: 62f3139c-4515-4e6a-9aa3-24f598263ece
Size: 18.1 TiB
Bucket size: 512 KiB
First bucket: 0
Buckets: 37881853
Last mount: Tue Jul 11 06:09:13 2023
State: rw
Label: hdd1 (1)
Data allowed: journal,btree,user
Has data: journal,btree,user
Discard: 0
Freespace initialized: 1
Device: 1
UUID: 4c1c7eff-f1e9-44b8-bcac-186fb4aa2367
Size: 18.1 TiB
Bucket size: 1.00 MiB
First bucket: 0
Buckets: 18940926
Last mount: Tue Jul 11 06:09:13 2023
State: rw
Label: hdd2 (2)
Data allowed: journal,btree,user
Has data: journal,btree,user,cached
Discard: 0
Freespace initialized: 1
Device: 2
UUID: 145ea1b5-b6c6-4d13-bdee-1d482d29758f
Size: 233 GiB
Bucket size: 512 KiB
First bucket: 0
Buckets: 476948
Last mount: Tue Jul 11 06:09:13 2023
State: rw
Label: ssd1 (7)
Data allowed: journal,btree,user
Has data: journal,btree,user,cached
Discard: 0
Freespace initialized: 1
Device: 3
UUID: 933d8443-8d62-403a-9ef2-e0a3b3fe8c42
Size: 233 GiB
Bucket size: 512 KiB
First bucket: 0
Buckets: 476948
Last mount: Tue Jul 11 06:09:13 2023
State: ro
Label: ssd1 (7)
Data allowed: journal,btree,user
Has data: btree,user,cached
Discard: 0
Freespace initialized: 1
$ cat /sys/fs/bcachefs/360fc60c-8c44-4f3e-9cc4-fbaeee9e7c3b/dev-0/alloc_debug
buckets sectors fragmented
free 11635400 0 0
sb 7 6152 1016
journal 8192 8388608 0
btree 126138 127505408 1659904
user 26112116 26738528529 280073
cached 0 0 0
parity 0 0 0
stripe 0 0 0
need_gc_gens 0 0 0
need_discard 0 0 0
ec 0
reserves:
stripe 1183834
normal 591931
copygc 28
btree 14
btree_copygc 0
reclaim 0
freelist_wait empty
open buckets allocated 1024
open buckets this dev 407
open buckets total 1024
open_buckets_wait waiting
open_buckets_btree 1016
open_buckets_user 7
buckets_to_invalidate 0
btree reserve cache 0
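(Only dev-0 is shown above; comparing allocator state across all members is just a matter of globbing the per-device directories. The dev-* glob covering the other three devices is an assumption based on the dev-0 path above:)

$ for f in /sys/fs/bcachefs/360fc60c-8c44-4f3e-9cc4-fbaeee9e7c3b/dev-*/alloc_debug; do echo "== $f =="; cat "$f"; done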
$ sudo cat /sys/kernel/debug/bcachefs/360fc60c-8c44-4f3e-9cc4-fbaeee9e7c3b/btree_transactions
771 btree_node_write_work
backtrace:
[<0>] bch2_btree_update_start+0x89a/0x9a0 [bcachefs]
[<0>] bch2_btree_split_leaf+0x56/0x1c0 [bcachefs]
[<0>] bch2_trans_commit_error+0x84/0x580 [bcachefs]
[<0>] __bch2_trans_commit+0x4ff/0x840 [bcachefs]
[<0>] __bch2_btree_node_update_key+0x47f/0x790 [bcachefs]
[<0>] bch2_btree_node_update_key+0x363/0x3c0 [bcachefs]
[<0>] bch2_btree_node_update_key_get_iter+0x170/0x180 [bcachefs]
[<0>] btree_node_write_work+0x153/0x320 [bcachefs]
[<0>] process_one_work+0x220/0x430
[<0>] worker_thread+0x4a/0x3f0
[<0>] kthread+0xf3/0x120
[<0>] ret_from_fork+0x22/0x30
986 btree_node_write_work
backtrace:
[<0>] bch2_btree_update_start+0x89a/0x9a0 [bcachefs]
[<0>] bch2_btree_split_leaf+0x56/0x1c0 [bcachefs]
[<0>] bch2_trans_commit_error+0x84/0x580 [bcachefs]
[<0>] __bch2_trans_commit+0x4ff/0x840 [bcachefs]
[<0>] __bch2_btree_node_update_key+0x47f/0x790 [bcachefs]
[<0>] bch2_btree_node_update_key+0x363/0x3c0 [bcachefs]
[<0>] bch2_btree_node_update_key_get_iter+0x170/0x180 [bcachefs]
[<0>] btree_node_write_work+0x153/0x320 [bcachefs]
[<0>] process_one_work+0x220/0x430
[<0>] worker_thread+0x4a/0x3f0
[<0>] kthread+0xf3/0x120
[<0>] ret_from_fork+0x22/0x30
1178 bch2_move_btree
backtrace:
[<0>] bch2_btree_update_start+0x89a/0x9a0 [bcachefs]
[<0>] bch2_btree_node_rewrite+0x59/0x3a0 [bcachefs]
[<0>] bch2_move_btree.constprop.0+0x373/0x550 [bcachefs]
[<0>] bch2_data_job+0xf6/0x290 [bcachefs]
[<0>] bch2_data_thread+0x2f/0x60 [bcachefs]
[<0>] kthread+0xf3/0x120
[<0>] ret_from_fork+0x22/0x30
7479 bch2_copygc_thread
backtrace:
[<0>] bch2_kthread_io_clock_wait+0xd1/0x160 [bcachefs]
[<0>] bch2_copygc_thread+0x273/0x370 [bcachefs]
[<0>] kthread+0xf3/0x120
[<0>] ret_from_fork+0x22/0x30
9952 btree_update_nodes_written
backtrace:
[<0>] bch2_btree_update_start+0x89a/0x9a0 [bcachefs]
[<0>] bch2_btree_split_leaf+0x56/0x1c0 [bcachefs]
[<0>] bch2_trans_commit_error+0x84/0x580 [bcachefs]
[<0>] __bch2_trans_commit+0x4ff/0x840 [bcachefs]
[<0>] __bch2_btree_write_buffer_flush+0x339/0x960 [bcachefs]
[<0>] bch2_trans_commit_error+0x17a/0x580 [bcachefs]
[<0>] __bch2_trans_commit+0x4ff/0x840 [bcachefs]
[<0>] btree_update_nodes_written+0x5a4/0x7a0 [bcachefs]
[<0>] btree_interior_update_work+0x55/0x60 [bcachefs]
[<0>] process_one_work+0x220/0x430
[<0>] worker_thread+0x4a/0x3f0
[<0>] kthread+0xf3/0x120
[<0>] ret_from_fork+0x22/0x30
Ok, it looks like the patch to add a reserve for reclaim was insufficient - I'll dig more.
Again, both SSDs are labeled ssd.ssd1.