raid-sleep Spindown / Spinup loop

Hello Thomas,

I'm using your tool on my tiny self-made NAS. It worked very well until I upgraded to bullseye. I'm not sure whether the upgrade alone causes the issue, but I did not change much else.

The tool began to spin up the disks immediately after spinning them down, so I got a spindown/spinup loop on every timeout cycle. I checked /proc/diskstats and could watch the counters change while hdparm performed the spindown. I could not figure out which service or tool caused the access; maybe it's my mdraid.
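To narrow down which device is being touched, one could diff two snapshots of /proc/diskstats. The following is only a rough sketch, assuming the usual layout of that file (major number, minor number, device name, then integer counters). The helper names read_diskstats, parse_diskstats and changed_devices are made up for illustration and are not part of raid-sleep.

```python
def parse_diskstats(text):
    """Map device name -> tuple of integer counters from /proc/diskstats text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        # fields[0] and fields[1] are major/minor numbers, fields[2] is the name
        stats[fields[2]] = tuple(int(f) for f in fields[3:])
    return stats


def read_diskstats(path="/proc/diskstats"):
    """Read and parse the current counters (Linux only)."""
    with open(path) as f:
        return parse_diskstats(f.read())


def changed_devices(before, after):
    """Return the sorted names of devices whose counters differ."""
    return sorted(dev for dev in after if before.get(dev) != after[dev])
```

Taking one read_diskstats() snapshot before the spindown and one a few seconds after it, then passing both to changed_devices(), should list the devices whose counters moved in between.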

I built a simple workaround to prevent the spindown/spinup loop by adding a spinup delay to your code. Every time the tool detects a spinup event (a diskstats change), it checks whether the spindown was performed within the configured delay time (defaulting to 3 seconds, which seems to be enough) and, if so, skips any action. It continuously reads the stats, so once the delay has passed it checks again whether diskstats changed.
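The idea can be sketched roughly like this. This is not the attached patch, only an illustration of the logic described above; the class SpindownGuard, its method names and the SPINUP_DELAY constant are hypothetical.

```python
import time

SPINUP_DELAY = 3.0  # seconds; the default delay mentioned above


class SpindownGuard:
    """Ignore diskstats changes seen shortly after a spindown (hypothetical helper)."""

    def __init__(self, delay=SPINUP_DELAY):
        self.delay = delay
        self.last_spindown = None  # monotonic timestamp of the last spindown

    def note_spindown(self, now=None):
        """Record when the disks were sent to standby."""
        self.last_spindown = time.monotonic() if now is None else now

    def is_real_activity(self, now=None):
        """Return True if a stats change at `now` should count as disk activity."""
        now = time.monotonic() if now is None else now
        if self.last_spindown is None:
            return True  # no spindown yet, so nothing to suppress
        return (now - self.last_spindown) >= self.delay
```

A polling loop would call note_spindown() right after the spindown command and, on every detected diskstats change, act only if is_real_activity() returns True. Changes inside the delay window are skipped and picked up again on a later poll if they persist.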

I'm not sure if it is a good fix, but maybe other people have the same issue and this can help them.

regards Andreas

spinup-delay.patch.txt

Feb 08 '22 21:02 lp24db

Thanks for your patch!

Unfortunately, I don't have a system to test this on at the moment, so I will just leave this issue open for others to see.

I will come back to it when I update my server to bullseye.

regards, Thomas

Feb 09 '22 22:02 thomask77

Hi,

thanks for the patch, it's working now on my system (Debian Bullseye, backport 5.16 kernel amd64).

Andreas

May 11 '22 09:05 azw71

I think the problem is that the command for spindown changes the stats. It should be sufficient to re-read the stats directly after the spindown by inserting 2 lines after power_down():

    dprint("Powering down after %s" % hms(now - last_standby))
    power_down()
    stats = diskstats()
    stats = {k: v for k, v in list(stats.items()) if k in args.disk_devs}
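Why a fresh snapshot would break the loop can be illustrated with a toy model. The sketch below is not raid-sleep code: FakeDisk, run_cycles and the assumption that the spindown command bumps exactly one counter are invented for illustration.

```python
class FakeDisk:
    """Toy disk whose counter increases when it is sent to standby (assumption)."""

    def __init__(self):
        self.counter = 0
        self.spinning = True

    def power_down(self):
        self.counter += 1  # the standby command itself shows up in the stats
        self.spinning = False

    def spin_up(self):
        self.spinning = True


def run_cycles(cycles, reread_after_spindown):
    """Count spurious wake-ups over a number of otherwise idle timeout cycles."""
    disk = FakeDisk()
    baseline = disk.counter
    spurious = 0
    for _ in range(cycles):
        # Poll: a changed counter is interpreted as disk activity.
        if disk.counter != baseline:
            baseline = disk.counter
            if not disk.spinning:
                disk.spin_up()
                spurious += 1
            continue
        # No activity during the timeout: spin down.
        if disk.spinning:
            disk.power_down()
            if reread_after_spindown:
                baseline = disk.counter  # take a fresh snapshot
    return spurious
```

In this model, run_cycles(6, reread_after_spindown=False) wakes the disk on every second cycle, while run_cycles(6, reread_after_spindown=True) never does, because the counter change caused by the spindown is already part of the baseline.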

~ Markus

Apr 22 '24 08:04 MarkusEh

The issue should have been fixed in kernel versions >= 6.0.17:

https://bugzilla.kernel.org/show_bug.cgi?id=207439
https://bugzilla.kernel.org/show_bug.cgi?id=215856

So I think we can close the bug without any changes.

Sep 18 '24 21:09 thomask77