thinkfan

run systemd service with realtime priority

Open bymoz089 opened this issue 3 years ago • 3 comments

Run the systemd service with realtime priority to ensure that thinkfan's execution is not delayed by a sudden rise in CPU load. If the fans stay off under heavy CPU load, the system could otherwise freeze.

bymoz089 avatar Sep 20 '22 22:09 bymoz089

Hi @bymoz089, thanks for this interesting contribution. I understand the theoretical reasoning behind it, but realtime scheduling is potentially dangerous, so I have some questions:

  1. Have you actually observed any situations where this was necessary, i.e. where thinkfan was not getting enough CPU time to do its job? Because it needs very little, and even on overloaded systems (i.e. where load average > cores) I've never observed a situation where fan control was negatively affected.
  2. If we're using FIFO scheduling discipline, we have to be careful not to lock up an entire core (or the whole system in the case of single-core) if some bug causes an endless loop. So shouldn't the process be given a limited CPU time budget when we do FIFO?

vmatare avatar Sep 21 '22 19:09 vmatare

Hi @vmatare, thanks for considering the PR.

  1. Yes. I made some programming mistakes (with threads and loops) that drove all CPU cores hot very quickly while thinkfan was running, and the fan did not start rotating. This resulted in a complete system freeze (twice). No hardware broke. My guess was that it froze because of insufficient cooling. I then experimented with scheduler priorities. I came up with this solution about 6 months ago and have had no issues since. All of this was on a 10-year-old ThinkPad (Intel Core i).

  2. That's correct. I chose a low priority of 20 because of that; 99 is the highest priority for FIFO. Scheduling is preemptive, so the lower-priority thread will be stalled. An additional measure would be to use the round-robin (RR) policy, which gives every thread a limited time slice to run before it is rescheduled. Even though I did not test it, it should still be possible to kill thinkfan in such a situation.
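For illustration, the RR variant mentioned in point 2 might look roughly like this in the unit's `[Service]` section. This is only a sketch and not the content of the PR: `CPUSchedulingPolicy=` and `CPUSchedulingPriority=` are the directives documented in systemd.exec, and the priority of 20 is simply the value quoted above.

```ini
[Service]
# Round-robin realtime policy instead of FIFO (assumed alternative, untested)
CPUSchedulingPolicy=rr
# Low realtime priority, as discussed above; valid realtime range is 1..99
CPUSchedulingPriority=20
```

Note that, according to sched(7), the RR time slice only rotates among threads of the same priority, so this setting alone would not hand CPU time back to ordinary (non-realtime) tasks.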

I would argue that it is more important that the fan runs (in order to prevent hardware overheating, even if the system freezes) than to prevent near-lock-ups caused by an unlikely bad thinkfan loop on production systems.

Based on this information: https://man7.org/linux/man-pages/man7/sched.7.html

bymoz089 avatar Sep 21 '22 21:09 bymoz089

In case you reject this PR, maybe it is something for the documentation.
The system admin could define a systemd service override that enables this realtime scheduling.
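A minimal sketch of such an override, assuming the unit is named `thinkfan.service` and the drop-in is created with `systemctl edit thinkfan.service` (which would typically write it to `/etc/systemd/system/thinkfan.service.d/override.conf`). The `LimitRTTIME=` line is an assumption added to address the CPU time budget concern raised earlier in the thread; the 200ms value is purely illustrative and untested.

```ini
# Hypothetical drop-in: /etc/systemd/system/thinkfan.service.d/override.conf
[Service]
# Realtime FIFO scheduling at a low priority (99 would be the highest)
CPUSchedulingPolicy=fifo
CPUSchedulingPriority=20
# Assumed safety net: caps the CPU time a realtime process may consume
# without a blocking system call (RLIMIT_RTTIME); value is illustrative
LimitRTTIME=200ms
```

After saving the drop-in, `systemctl daemon-reload` followed by `systemctl restart thinkfan.service` should apply it. Because thinkfan normally sleeps between polls, a runaway loop should in principle be the only case that reaches the limit, but this has not been verified here.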

bymoz089 avatar Sep 21 '22 21:09 bymoz089