mc admin heal doesn't work

Open AlexZIX opened this issue 1 year ago • 8 comments

I've replaced one broken disk with a new one and it is filling with data. In previous versions of MinIO I could review the healing progress using mc admin heal, but now it reports that there is no active healing in my cluster:

root@minio-cold-1:~# mc admin heal minio-cold
No active healing is detected for new disks.

But at the same time I can see in Grafana that healing is in progress:

image

So is this a bug, or is the new version not supposed to show the healing status in the console?

mc --version

root@minio-cold-1:~# mc --version
mc version RELEASE.2023-01-28T20-29-38Z (commit-id=2e95a70c98fb9c2629cd89817b8759bfa109a4d0)
Runtime: go1.19.4 linux/amd64

System information

Cluster: 4 nodes with 4 disks on each

root@minio-cold-1:~# uname -a
Linux minio-cold-1 5.15.0-69-generic #76-Ubuntu SMP Fri Mar 17 17:19:29 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

AlexZIX avatar Feb 26 '24 08:02 AlexZIX

So is this a bug, or is the new version not supposed to show the healing status in the console?

This is because healing.bin is missing for some reason, causing the healing state to be removed.

// cc @vadmeste this sounds like something we have now seen elsewhere, can you investigate?

harshavardhana avatar Feb 26 '24 08:02 harshavardhana

@AlexZIX newer versions do not show stats in Prometheus anymore. Are you sure the disk healing did not finish? Can you check the disk usage (df -h) and compare it with the other disks in the same erasure set?

vadmeste avatar Feb 26 '24 08:02 vadmeste

@vadmeste Healing should still be in progress, because the replaced disk holds only about 10% of the data:

image

One more question: why is the healing process so slow? I replaced this disk a week ago, but it contains only 10% of the data. If healing continues at the same speed, the total recovery time will be 10 weeks, or more than 2 months. Is that normal?
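The 10-week figure is a straight-line extrapolation from the average pace so far. A minimal sketch of that arithmetic, assuming the healing rate stays constant (which it may not); the function name is made up for illustration, and the inputs are the approximate values quoted above:

```python
def estimate_total_heal_days(elapsed_days: float, fraction_healed: float) -> float:
    """Linear extrapolation: total duration if healing keeps its average pace."""
    if not 0 < fraction_healed <= 1:
        raise ValueError("fraction_healed must be in (0, 1]")
    return elapsed_days / fraction_healed


elapsed_days = 7        # the disk was replaced about a week ago
fraction_healed = 0.10  # roughly 10% of the expected data is on the disk

total_days = estimate_total_heal_days(elapsed_days, fraction_healed)
remaining_days = total_days - elapsed_days

print(f"estimated total: {total_days:.0f} days (about {total_days / 7:.0f} weeks)")
print(f"estimated remaining: {remaining_days:.0f} days")
```

This is only a rough bound: it treats the 10% reading as exact and ignores any change in cluster load during healing.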

AlexZIX avatar Feb 26 '24 11:02 AlexZIX

Can you share all MinIO logs from node minio-cold-4?

vadmeste avatar Feb 26 '24 11:02 vadmeste

Yes, if you explain where I can find them or how to export them.

AlexZIX avatar Feb 26 '24 13:02 AlexZIX

@AlexZIX it depends on how you deployed MinIO. The logs are MinIO's standard output. If it is bare-metal, most likely journalctl -u minio will show some logs. By the way, are you using the ILM expiry feature in this cluster?

vadmeste avatar Feb 26 '24 19:02 vadmeste

@vadmeste The output from journalctl is attached: minio.log

If ILM means expiration of versioned files that were removed, then my answer is yes: we use buckets with versioning enabled and expiration settings for removed objects.

Here is the df -h output, which may help too:

root@minio-cold-4:~# df -h
Filesystem                         Size  Used Avail Use% Mounted on
tmpfs                              1.6G  1.8M  1.6G   1% /run
/dev/mapper/ubuntu--vg-ubuntu--lv   17G  5.6G   11G  35% /
tmpfs                              7.8G     0  7.8G   0% /dev/shm
tmpfs                              5.0M     0  5.0M   0% /run/lock
/dev/sda2                          1.8G  252M  1.4G  16% /boot
/dev/sda1                          952M  6.1M  946M   1% /boot/efi
hdd-pool-1                         3.2T  157G  3.1T   5% /hdd-pools/hdd-pool-1
hdd-pool-4                         3.2T  1.3T  1.9T  41% /hdd-pools/hdd-pool-4
hdd-pool-3                         3.2T  1.4T  1.9T  42% /hdd-pools/hdd-pool-3
hdd-pool-2                         3.2T  1.4T  1.9T  42% /hdd-pools/hdd-pool-2
tmpfs                              1.6G  4.0K  1.6G   1% /run/user/0
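To put a number on the comparison asked for earlier, the used space on the low-usage disk can be set against the other data disks in the df -h output above. A rough sketch with two assumptions: that hdd-pool-1 (the one with far less data) is the replaced disk, and that the other three disks on this node are fair peers, even though they are not necessarily in the same erasure set. The helper name is made up for illustration.

```python
import statistics

# df -h reports sizes in powers of 1024.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text: str) -> float:
    """Convert a df -h size such as '157G' or '1.3T' to bytes."""
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


# "Used" column for the data disks, copied from the df -h output above.
used = {
    "/hdd-pools/hdd-pool-1": "157G",
    "/hdd-pools/hdd-pool-2": "1.4T",
    "/hdd-pools/hdd-pool-3": "1.4T",
    "/hdd-pools/hdd-pool-4": "1.3T",
}

replaced = "/hdd-pools/hdd-pool-1"  # assumption: the disk with the least data
peers = [parse_size(size) for mount, size in used.items() if mount != replaced]
ratio = parse_size(used[replaced]) / statistics.median(peers)

print(f"{replaced} holds {ratio:.0%} of the median peer usage")
```

The result lines up with the roughly 10% estimate given earlier in the thread, though df -h rounds its values, so the ratio is only approximate.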

AlexZIX avatar Feb 26 '24 20:02 AlexZIX