
Support sleep/hibernate integration (generally and with systemd in particular)

Open jimklimov opened this issue 2 years ago • 1 comments

There were several discussions in the past about NUT daemons emitting messages like "Data stale" when a system wakes up after a prolonged sleep. One particular use-case is putting a system into hibernation when a UPS becomes depleted, or into sleep when on battery (hoping that the power draw of RAM alone can be sustained for hours). This is not a platform-specific issue, since anybody might do it, but it may have platform-specific optimal solutions (and/or general ones).

The messages make sense technically (a lot of time has passed since the last poll), but they confuse some users and add to a console already cluttered by messages from various subsystems after wake-up.

This issue is about exploring ways to integrate with service management frameworks like systemd so that NUT "knows" a sleep happened. Options include dealing with dependencies on/against the https://www.freedesktop.org/software/systemd/man/systemd-suspend.service.html family of units, or delivering hooks for them (example in https://blog.christophersmart.com/2016/05/11/running-scripts-before-and-after-suspend-with-systemd/ although the official documentation warns against this and deems such hooks local hacks), or exploring its "Inhibitor interface" per https://blog.christophersmart.com/2016/05/11/running-scripts-before-and-after-suspend-with-systemd/ and possibly listening to systemd D-Bus signals.

With such integration, NUT daemons would know to silently reestablish connections when the system wakes up, and maybe chirp a message about that specifically. One solution could be to use the pre/post-sleep hooks to completely stop and start NUT services and so avoid "staleness" (also possibly account for IP address changes after wake-up).

A general-purpose alternative could be to watch for system clock jumps (e.g. we have a 5 or 30 second pollfreq, and suddenly the last loop took tens of minutes or more), and/or to track "local" time (as opposed to the monotonic kernel clock) to see and react to NTP/RTC time jumps. Along with loops taking too long, such jumps can be due to sleep or winter/summer time changes. Again, the purpose would be to tailor the reaction (and the noise made) to this situation.
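
The clock-comparison idea above can be sketched as follows. This is a minimal Python illustration, not NUT code: `classify_gap`, its `slack` tolerance and the returned labels are hypothetical names. It assumes the monotonic clock does not advance while the system is suspended (typically the case for CLOCK_MONOTONIC on Linux, but platform-dependent), so a large difference between the wall-clock and monotonic deltas suggests a sleep or a clock step.

```python
import time


def classify_gap(wall_delta, mono_delta, pollfreq, slack=2.0):
    """Classify the time elapsed since the previous poll.

    wall_delta: seconds elapsed on the wall clock (time.time()).
    mono_delta: seconds elapsed on the monotonic clock (time.monotonic()).
    pollfreq:   expected seconds between polls.
    slack:      hypothetical tolerance factor before a gap counts as abnormal.
    """
    limit = pollfreq * slack
    skew = wall_delta - mono_delta
    if abs(skew) > limit:
        # The wall clock moved without the monotonic clock following:
        # possibly suspend/resume, an NTP/RTC step, or a manual change.
        return "clock-jump"
    if mono_delta > limit:
        # Both clocks agree that the loop itself took too long.
        return "slow-loop"
    return "normal"


def poll_loop(poll_once, pollfreq=5.0, iterations=3):
    """Run poll_once() periodically, reporting abnormal gaps between polls."""
    last_wall, last_mono = time.time(), time.monotonic()
    for _ in range(iterations):
        time.sleep(pollfreq)
        wall, mono = time.time(), time.monotonic()
        verdict = classify_gap(wall - last_wall, mono - last_mono, pollfreq)
        last_wall, last_mono = wall, mono
        if verdict != "normal":
            # Assumption: a real daemon would reconnect quietly here
            # instead of reporting stale data.
            print(f"abnormal gap between polls: {verdict}")
        poll_once()
```

A daemon using this approach could pick its log message (or stay silent) depending on the verdict, instead of always reporting stale data.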

Also useful would be some documentation notes (FAQ?) and/or contributed scripts/* for different platforms, specifically about requesting sleep/hibernation as a SHUTDOWNCMD. Notably, such a command could take care of stopping/starting units and init-script style services, and/or leaving flag-files so that daemons know the situation is intentional. (Service starts could be scheduled by the cron/at facility, e.g. roughly at now + 1min.)
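
As a rough sketch of such a SHUTDOWNCMD helper, the Python below leaves a flag-file and then asks systemd to sleep. It is an illustration only: the flag-file path and the function names are hypothetical and not defined by NUT, and it assumes a systemd host where `systemctl suspend` and `systemctl hibernate` are available. Stopping or rescheduling NUT units is deliberately left out.

```python
import subprocess
from pathlib import Path

# Hypothetical flag-file location; not a path that NUT itself defines.
FLAG_FILE = Path("/run/nut/sleep-requested")


def build_sleep_plan(mode="hibernate"):
    """Return the systemctl command for the requested sleep mode."""
    if mode not in ("suspend", "hibernate"):
        raise ValueError(f"unsupported sleep mode: {mode}")
    return ["systemctl", mode]


def request_sleep(mode="hibernate", flag_file=FLAG_FILE, dry_run=False):
    """Leave a flag-file for the daemons, then ask systemd to sleep.

    With dry_run=True nothing is written or executed; the command that
    would have been run is returned for inspection.
    """
    command = build_sleep_plan(mode)
    if dry_run:
        return command
    flag_file.parent.mkdir(parents=True, exist_ok=True)
    # Record which mode was requested so daemons can tell the sleep
    # was intentional when they notice the time gap later.
    flag_file.write_text(mode + "\n")
    subprocess.run(command, check=True)
    return command
```

A small wrapper script calling `request_sleep()` could then be named as the SHUTDOWNCMD in upsmon.conf; the daemons (or a post-wake hook) would be responsible for removing the flag-file.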

Enthusiasts welcome to take on this, especially those who can test it well (have systems that do go to sleep while monitoring)!

jimklimov avatar Jan 22 '23 09:01 jimklimov

Yeah, I've had the system power off after waking from sleep, at times when I know the power will be out for 10-15 minutes. That makes me inclined to stop the nut-monitor service before suspending to avoid this. The log below shows the event:

2024-06-21T13:06:11+00:00 nut-monitor[1449]: UPS nutdev1@localhost on battery
2024-06-21T13:06:11+00:00 nut-monitor[143430]: Network UPS Tools upsmon 2.8.1
2024-06-21T13:16:56+00:00 nut-server[1424]: Data for UPS [nutdev1] is stale - check driver
2024-06-21T13:16:56+00:00 nut-server[1424]: UPS [nutdev1] data is no longer stale
2024-06-21T13:16:56+00:00 nut-monitor[1449]: Poll UPS [nutdev1@localhost] failed - Write error: Broken pipe
2024-06-21T13:16:56+00:00 nut-monitor[1449]: Communications with UPS nutdev1@localhost lost
2024-06-21T13:16:56+00:00 nut-monitor[1449]: UPS [nutdev1@localhost] was last known to be not fully online and currently is not communicating, assuming dead
2024-06-21T13:16:56+00:00 nut-monitor[1449]: FSD set on UPS nutdev1@localhost failed: Driver not connected
2024-06-21T13:16:56+00:00 nut-monitor[1449]: Executing automatic power-fail shutdown
2024-06-21T13:16:56+00:00 nut-monitor[1449]: Auto logout and shutdown proceeding
2024-06-21T13:16:55+00:00 upsd[1424]: Data for UPS [nutdev1] is stale - check driver
2024-06-21T13:16:56+00:00 nut-monitor[143737]: Network UPS Tools upsmon 2.8.1
2024-06-21T13:16:55+00:00 upsd[1424]: UPS [nutdev1] data is no longer stale
2024-06-21T13:16:56+00:00 nut-monitor[143742]: Network UPS Tools upsmon 2.8.1
2024-06-21T13:17:00+00:00 nut-monitor[1449]: Network UPS Tools upsmon 2.8.1
2024-06-21T13:17:02+00:00 shutdown[144102]: Failed to set wall message, ignoring: Message recipient disconnected from message bus without replying
2024-06-21T13:17:02+00:00 shutdown[144102]: Shutdown scheduled for Fri 2024-06-21 09:17:00 AST, use 'shutdown -c' to cancel.
2024-06-21T13:17:02+00:00 nut-monitor[1438]: Network UPS Tools upsmon 2.8.1
2024-06-21T13:17:02+00:00 systemd[1]: nut-monitor.service: Deactivated successfully.

braiam avatar Jun 21 '24 13:06 braiam

After experimenting a bit, I ended up with this service file to work around the sleep issue. It stops nut.target before sleep and starts it again after wake-up:

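# /etc/systemd/system/nut-sleep.service
[Unit]
Description=NUT sleep hook
# Run ExecStart before the system reaches sleep.target
Before=sleep.target
# Stop this unit (running ExecStop) once sleep.target no longer needs it
StopWhenUnneeded=yes

[Service]
Type=oneshot
# Stay "active" after ExecStart so that ExecStop runs later, on wake-up
RemainAfterExit=yes
# Going to sleep: stop the NUT services
ExecStart=/usr/bin/systemctl stop nut.target
# Waking up: start the NUT services again without blocking
ExecStop=/usr/bin/systemctl start --no-block nut.target

[Install]
WantedBy=sleep.target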

Ropid avatar Aug 15 '24 21:08 Ropid

Looks quite neat, in fact!

Would you care to post it as a PR for better attribution in the source history?

UPDATE: Posted via PR below.

jimklimov avatar Aug 15 '24 21:08 jimklimov