onlyjob
2: HDD is fine, no physical or structural errors whatsoever. I've fsck'd the ext4 file system.
```
chunk_readcrc: file:/mnt/02/ext4//AB/chunk_000000000348EF1C_00000001.mfs - wrong header
hdd_io_begin: file:/mnt/02/ext4//AB/chunk_000000000348EF1C_00000001.mfs - read error: Success (errno=0)
```
...
Just to reaffirm: side effect 3 is perfectly reproducible for me. Just today I rebooted a machine with an archival chunkserver whose storage class has some undergoal chunks. The chunkserver has around...
Thanks.
```
# hexdump -C -n 20 /mnt/02/ext4/AB/chunk_000000000348EF1C_00000001.mfs
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000014
```
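The hexdump above shows the first 20 bytes of the affected chunk file as all zeros. A minimal sketch for finding other files in the same state, assuming that an all-zero start is what triggers "wrong header" here (the function names, the 20-byte window and the `.mfs` suffix filter are illustrative choices, not anything taken from MooseFS itself):

```python
import os


def leading_bytes_are_zero(path, length=20):
    """Return True if the first `length` bytes of `path` are all 0x00."""
    with open(path, "rb") as f:
        head = f.read(length)
    # Files shorter than `length` are not reported as zeroed.
    return len(head) == length and head == bytes(length)


def find_zeroed_headers(root, length=20, suffix=".mfs"):
    """Walk `root` and return sorted paths of files whose start is all zeros."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            full = os.path.join(dirpath, name)
            try:
                if leading_bytes_are_zero(full, length):
                    hits.append(full)
            except OSError:
                # Unreadable files are skipped; they need separate attention.
                continue
    return sorted(hits)
```

Calling `find_zeroed_headers("/mnt/02/ext4")` would list candidates on that disk. It only reads the first few bytes of each file, so it is much cheaper than a full CRC check, but it says nothing about the rest of the chunk.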
An interesting question is why corrupted headers started to appear from the moment the HDD was moved. First I investigated the possibility of a hardware problem and found nothing. Time correlation suggests that the problem...
I'd like to add more info on the "wrong header" situation. Two chunkservers started to exhibit multiple "wrong header" errors on all local disks from the moment they received a...
The decommissioned (healthy) chunkserver was upgraded 3.0.104 --> 3.0.105 (--> 3.0.107 maybe) --> 3.0.109. The last upgrade was some days before decommissioning. The chunkserver was stopped once its HDDs were absorbed and manually...
Thanks for trying to replicate the problem, @chogata. I have little to add... Two archival chunkservers have around 30 million chunks each (two replicas). The number of duplicate files is around...
Slow scanning is reproducible and definitely related to duplicate chunks. On a chunkserver with 30 million chunks, a newly added HDD containing mostly duplicates scans extremely slowly. Because the chunkserver only counts non-duplicate...
Yes, but how much slower? Only 3% of scanning completed in 23 hours. While the HDD was disconnected I spread chunks from overloaded directories, so that is not the major factor in...
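To put the 3% in 23 hours figure in perspective, here is a rough extrapolation. It assumes the scan rate stays constant, which may not hold in practice, and the helper name is purely illustrative:

```python
def estimate_total_hours(elapsed_hours, fraction_done):
    """Linear extrapolation of total scan time from progress so far."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


total = estimate_total_hours(23, 0.03)  # 23 h elapsed, 3% scanned
remaining = total - 23
print(f"total: {total:.0f} h ({total / 24:.1f} days), remaining: {remaining:.0f} h")
```

At that rate a full scan of the disk would take roughly 767 hours, about 32 days.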
As if all the reported problems were not enough, here is another one: hour 168 has come and the chunkserver has finally started to remove duplicated chunks. _It went offline without activating...