
new allocator with erasure coding: "error creating stripe" on unmount; subsequent fsck runs are confused

Open · leuler opened this issue 2 years ago · 1 comment

I created a filesystem on 5 disks with erasure coding, mounted it, wrote a file to it, and then unmounted it. After the unmount, dmesg shows "error creating stripe: error writing data buckets". I ran fsck, which fixed some errors. When I ran fsck a second time, it failed an assertion: bch2_bkey_ptr_data_type: Assertion `!(ptr < s.v->ptrs || ptr >= s.v->ptrs + s.v->nr_blocks)' failed.

(The same operations work without errors without erasure coding.)

Software versions:

bcachefs commit 62cdb94 "Improve bucket_alloc tracepoints"
bcachefs-tools commit 3765483 "Update bcachefs sources to f05b3c1af9 bcachefs: Improve bucket_alloc_…"

Here's the script from the session:

root@bcachefstest:~# bcachefs format --erasure_code --compression=zstd --replicas 2 \
  --metadata_checksum=crc64 --data_checksum=crc64 \
  --label=nvme /dev/vdb1 /dev/vdc1 --label=ssd /dev/vdd1 /dev/vde1 /dev/vdf1 \
  --foreground_target=nvme --promote_target=nvme --background_target=ssd
External UUID:                  5af009ee-0e96-4ac4-94bb-aef4298c397c
Internal UUID:                  b7a4c0b8-59df-4e50-97c1-352617f11678
Device index:                   4
Label:                          
Version:                        19
Oldest version on disk:         19
Created:                        Mon Mar 14 20:32:21 2022
Sequence number:                0
Superblock size:                1144
Clean:                          0
Devices:                        5
Sections:                       members,disk_groups
Features:                       new_siphash,new_extent_overwrite,btree_ptr_v2,extents_above_btree_updates,btree_updates_journalled,new_varint,journal_no_flush,alloc_v2,extents_across_btree_nodes
Compat features:                

Options:
  block_size:                   512
  btree_node_size:              128k
  errors:                       continue [ro] panic 
  metadata_replicas:            2
  data_replicas:                2
  metadata_replicas_required:   1
  data_replicas_required:       1
  encoded_extent_max:           64k
  metadata_checksum:            none crc32c [crc64] xxhash 
  data_checksum:                none crc32c [crc64] xxhash 
  compression:                  none lz4 gzip [zstd] 
  background_compression:       [none] lz4 gzip zstd 
  str_hash:                     crc32c crc64 [siphash] 
  metadata_target:              none
  foreground_target:            nvme
  background_target:            ssd
  promote_target:               nvme
  erasure_code:                 1
  inodes_32bit:                 1
  shard_inode_numbers:          1
  inodes_use_key_cache:         1
  gc_reserve_percent:           8
  gc_reserve_bytes:             0
  root_reserve_percent:         0
  wide_macs:                    0
  acl:                          1
  usrquota:                     0
  grpquota:                     0
  prjquota:                     0
  journal_flush_delay:          1000
  journal_flush_disabled:       0
  journal_reclaim_delay:        100
  journal_transaction_names:    1

members (size 288):
  Device:                       0
    UUID:                       5af6ef3b-646d-4d2c-bf7b-1e2289b4501d
    Size:                       203M
    Bucket size:                128k
    First bucket:               0
    Buckets:                    1630
    Last mount:                 (never)
    State:                      rw
    Group:                      nvme (0)
    Data allowed:               journal,btree,user
    Has data:                   (none)
    Discard:                    0
    Freespace initialized:      0
  Device:                       1
    UUID:                       e6395e22-1999-408b-bb4a-5d6e25451ba2
    Size:                       203M
    Bucket size:                128k
    First bucket:               0
    Buckets:                    1630
    Last mount:                 (never)
    State:                      rw
    Group:                      nvme (0)
    Data allowed:               journal,btree,user
    Has data:                   (none)
    Discard:                    0
    Freespace initialized:      0
  Device:                       2
    UUID:                       aa85a2b4-9cb3-4f21-80e9-e4723381bf51
    Size:                       203M
    Bucket size:                128k
    First bucket:               0
    Buckets:                    1630
    Last mount:                 (never)
    State:                      rw
    Group:                      ssd (1)
    Data allowed:               journal,btree,user
    Has data:                   (none)
    Discard:                    0
    Freespace initialized:      0
  Device:                       3
    UUID:                       7f58bf09-457e-4171-8246-4a78e27d9422
    Size:                       203M
    Bucket size:                128k
    First bucket:               0
    Buckets:                    1630
    Last mount:                 (never)
    State:                      rw
    Group:                      ssd (1)
    Data allowed:               journal,btree,user
    Has data:                   (none)
    Discard:                    0
    Freespace initialized:      0
  Device:                       4
    UUID:                       5389e99c-5823-49e9-a615-68034d748956
    Size:                       203M
    Bucket size:                128k
    First bucket:               0
    Buckets:                    1630
    Last mount:                 (never)
    State:                      rw
    Group:                      ssd (1)
    Data allowed:               journal,btree,user
    Has data:                   (none)
    Discard:                    0
    Freespace initialized:      0
bch2_fs_open() 
bch2_read_super() 
bch2_read_super() ret 0
bch2_read_super() 
bch2_read_super() ret 0
bch2_read_super() 
bch2_read_super() ret 0
bch2_read_super() 
bch2_read_super() ret 0
bch2_read_super() 
bch2_read_super() ret 0
bch2_fs_alloc() 
bch2_fs_journal_init() 
bch2_fs_journal_init() ret 0
bch2_fs_btree_cache_init() 
bch2_fs_btree_cache_init() ret 0
bch2_fs_encryption_init() 
bch2_fs_encryption_init() ret 0
__bch2_fs_compress_init() 
__bch2_fs_compress_init() ret 0
bch2_dev_alloc() 
bch2_dev_alloc() ret 0
bch2_dev_alloc() 
bch2_dev_alloc() ret 0
bch2_dev_alloc() 
bch2_dev_alloc() ret 0
bch2_dev_alloc() 
bch2_dev_alloc() ret 0
bch2_dev_alloc() 
bch2_dev_alloc() ret 0
bch2_fs_alloc() ret 0
initializing new filesystem
going read-write
marking superblocks
initializing freespace
initializing freespace
done initializing freespace
reading snapshots table
reading snapshots done
__bch2_fs_compress_init() 
__bch2_fs_compress_init() ret 0
mounted with opts: metadata_replicas=2,data_replicas=2,metadata_checksum=crc64,data_checksum=crc64,compression=zstd,foreground_target=nvme,background_target=ssd,promote_target=nvme,erasure_code,noinodes_use_key_cache,verbose
bch2_fs_open() ret 0
shutting down
flushing journal and stopping allocators
flushing journal and stopping allocators complete
marking filesystem clean
shutdown complete
root@bcachefstest:~# mount.bcachefs.sh -o noatime 5af009ee-0e96-4ac4-94bb-aef4298c397c /mnt
root@bcachefstest:~# dd if=/dev/urandom of=/mnt/a bs=1k count=200
200+0 records in
200+0 records out
204800 bytes (205 kB, 200 KiB) copied, 0.00176666 s, 116 MB/s
root@bcachefstest:~# sync
root@bcachefstest:~# dmesg | tail
[    3.026849] snd_hda_codec_generic hdaudioC0D0:    hp_outs=0 (0x0/0x0/0x0/0x0/0x0)
[    3.026851] snd_hda_codec_generic hdaudioC0D0:    mono: mono_out=0x0
[    3.026852] snd_hda_codec_generic hdaudioC0D0:    inputs:
[    3.026853] snd_hda_codec_generic hdaudioC0D0:      Line=0x5
[    3.519263] 8139cp 0000:00:03.0 eth0: link up, 100Mbps, full-duplex, lpa 0x05E1
[   24.595810] random: crng init done
[   24.595814] random: 7 urandom warning(s) missed due to ratelimiting
[   36.522677] bcachefs (5af009ee-0e96-4ac4-94bb-aef4298c397c): recovering from clean shutdown, journal seq 13
[   36.528531] bcachefs (5af009ee-0e96-4ac4-94bb-aef4298c397c): going read-write
[   36.530281] bcachefs (5af009ee-0e96-4ac4-94bb-aef4298c397c): mounted with opts: metadata_replicas=2,data_replicas=2,metadata_checksum=crc64,data_checksum=crc64,compression=zstd,foreground_target=nvme,background_target=ssd,promote_target=nvme,erasure_code,noinodes_use_key_cache
root@bcachefstest:~# umount /mnt
root@bcachefstest:~# dmesg | tail -2
[   36.530281] bcachefs (5af009ee-0e96-4ac4-94bb-aef4298c397c): mounted with opts: metadata_replicas=2,data_replicas=2,metadata_checksum=crc64,data_checksum=crc64,compression=zstd,foreground_target=nvme,background_target=ssd,promote_target=nvme,erasure_code,noinodes_use_key_cache
[   53.261967] bcachefs (5af009ee-0e96-4ac4-94bb-aef4298c397c): error creating stripe: error writing data buckets
root@bcachefstest:~# fsck.bcachefs /dev/vd[bcdef]1
recovering from clean shutdown, journal seq 19
journal read done, 0 keys in 1 entries, seq 20
checking allocations
bucket 0:31 gen 0 data type none has wrong data_type: got 6, should be 0: fix? (y,n) y
bucket 0:31 gen 0 data type none has wrong dirty_sectors: got 256, should be 0: fix? (y,n) y
dev 0 has wrong parity buckets: got 1, should be 0: fix? (y,n) y
dev 0 has wrong parity sectors: got 256, should be 0: fix? (y,n) y
checking need_discard and freespace btrees
going read-write
journal replay done
starting fsck
mounted with opts: metadata_replicas=2,data_replicas=2,metadata_checksum=crc64,data_checksum=crc64,compression=zstd,foreground_target=nvme,background_target=ssd,promote_target=nvme,erasure_code,noinodes_use_key_cache,degraded,fsck,fix_errors
5af009ee-0e96-4ac4-94bb-aef4298c397c: errors fixed
root@bcachefstest:~# fsck.bcachefs /dev/vd[bcdef]1
recovering from clean shutdown, journal seq 22
journal read done, 0 keys in 1 entries, seq 23
checking allocations
bucket 0:31 data type user stale dirty ptr: 0 < 1
while marking u64s 11 type stripe 0:1:0 len 0 ver 0: algo 0 sectors 256 blocks 1:1 csum 5 gran 128 1:7936:256 0:7936:0: fix? (y,n) y
bcachefs: libbcachefs/extents.h:562: bch2_bkey_ptr_data_type: Assertion `!(ptr < s.v->ptrs || ptr >= s.v->ptrs + s.v->nr_blocks)' failed.
Aborted
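
A note on the numbers in the fsck output: the bucket that the first fsck run "fixed" (bucket 0:31, 256 dirty sectors) appears to be the same bucket that the stripe key in the second run points at (0:7936). The arithmetic below is only a sketch of that relationship. It assumes 512-byte sectors and that the middle field of each stripe pointer (for example 7936 in "0:7936:0") is a sector offset on the device; neither assumption is confirmed by the output itself.

```shell
#!/bin/sh
# Values copied from the format output and the fsck messages above.
sector_size=512        # bytes; assumed sector size (matches "block_size: 512")
bucket_size_kib=128    # "Bucket size: 128k"
bucket=31              # "bucket 0:31"

# One bucket expressed in sectors.
sectors_per_bucket=$(( bucket_size_kib * 1024 / sector_size ))

# Sector offset at which the given bucket starts.
bucket_start_sector=$(( bucket * sectors_per_bucket ))

echo "sectors per bucket: $sectors_per_bucket"
echo "first sector of bucket $bucket: $bucket_start_sector"
```

Under those assumptions, one bucket is 256 sectors, which matches the "dirty_sectors: got 256" and "parity sectors: got 256" messages, and bucket 31 starts at sector 7936, which matches the offset in the stripe key. So the first fsck run seems to have cleared the accounting for a bucket that the stripe still references.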

leuler, Mar 14 '22 20:03

EC is in a known bad state at the moment; Kent wants to get around to it "soon".

YellowOnion, Mar 15 '22 07:03