server.reboot: never detects that the remote system has rebooted
Describe the bug
When using server.reboot, pyinfra reboots the target but then waits for it to come back until the timeout, even if the system comes up earlier.
To Reproduce
from pyinfra.operations import server
server.reboot(name="Reboot")
pyinfra -vvv --debug 192.168.7.2 --ssh-user root reboot.py
Result:
Traceback (most recent call last):
  File "/home/matthijs/.local/pipx/venvs/pyinfra/lib/python3.11/site-packages/pyinfra/api/operations.py", line 94, in _run_host_op
    status = command.execute(state, host, connector_arguments)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/matthijs/.local/pipx/venvs/pyinfra/lib/python3.11/site-packages/pyinfra/api/command.py", line 224, in execute
    return self.function(state, host, *self.args, **self.kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/matthijs/.local/pipx/venvs/pyinfra/lib/python3.11/site-packages/pyinfra/operations/server.py", line 101, in wait_and_reconnect
    raise Exception(
Exception: Server did not reboot in time (reboot_timeout=300s)
[192.168.7.2] Unexpected error in Python callback: Exception('Server did not reboot in time (reboot_timeout=300s)',)
Expected behavior
The server should reboot, and once it is back up pyinfra should continue (or, in this example, exit with success).
Analysis
I added some debug output, and it turns out that Host.connect is called repeatedly to try making a new connection, but host.connected is true, so no actual connection attempts are made: https://github.com/pyinfra-dev/pyinfra/blob/80aca6e3ea9e2c1e423505abf1f5ef9c2c4affdc/pyinfra/api/host.py#L365
Looking at the code, there is no way to make host.connected False again (except for creating a new Host object). So I wonder:
- If connectors (SSH in particular) have something in place to detect a disconnection.
- If server.reboot should be calling host.disconnect() to explicitly terminate the connection (because the TCP connection might otherwise linger and take some time to be detected as failed).
- Maybe host.disconnect should set connected=False?
- server.reboot already sets host.connection to None, and then uses that to check whether the connection was successful. Is this proper use of the host API, or is server.reboot messing with internals? Should server.reboot even check for a successful connection after the fact, or should it pass raise_exceptions and then detect connection success by the absence of an exception?
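To make the suspected mechanism concrete, here is a minimal toy model of it. FakeHost and the two reboot helpers are hypothetical stand-ins written for this illustration only; they are not pyinfra's Host class or its server.reboot implementation, and the real connect() may behave differently in details.

```python
class FakeHost:
    """Hypothetical stand-in for a host object (NOT pyinfra's real Host)."""

    def __init__(self):
        self.connected = False
        self.connection = None
        self.connect_attempts = 0  # counts attempts that actually do work

    def connect(self):
        # Assumed behaviour, as described in the analysis above:
        # when the flag says "connected", no new attempt is made.
        if self.connected:
            return self.connection
        self.connect_attempts += 1
        self.connection = object()  # placeholder for a real connection
        self.connected = True
        return self.connection

    def disconnect(self):
        # Proposed behaviour: drop the connection AND reset the flag.
        self.connection = None
        self.connected = False


def reconnect_clearing_connection_only(host):
    # Models clearing host.connection while host.connected stays True.
    host.connection = None
    return host.connect()


def reconnect_after_disconnect(host):
    # Models calling disconnect() first, so connect() really retries.
    host.disconnect()
    return host.connect()


stale = FakeHost()
stale.connect()
result_stale = reconnect_clearing_connection_only(stale)

fresh = FakeHost()
fresh.connect()
result_fresh = reconnect_after_disconnect(fresh)
```

In this model, result_stale stays None and stale.connect_attempts never goes past 1, which matches the observed symptom of waiting until the timeout. With the flag reset, fresh makes a second real attempt and gets a connection back.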
(And an unrelated observation: the timeout does not seem to be properly observed. Currently the timeout is divided by the interval to get the number of retries, but this assumes connection attempts take zero time, which is not true: attempts might take longer, especially while a system is rebooting.)
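As a rough illustration of the timeout observation, the sketch below contrasts a retry count derived from timeout / interval with a loop bounded by a monotonic deadline. Both functions, FakeClock and simulate are hypothetical names written for this example; the counting version is a simplified rendition of the approach described above, not code copied from pyinfra.

```python
import time


def retry_by_count(attempt, timeout, interval, sleep=time.sleep):
    # Counting approach: implicitly assumes each attempt takes zero time.
    max_retries = int(timeout / interval)
    for _ in range(max_retries):
        if attempt():
            return True
        sleep(interval)
    return False


def retry_until_deadline(attempt, timeout, interval,
                         clock=time.monotonic, sleep=time.sleep):
    # Deadline approach: time spent inside attempt() counts against the timeout.
    deadline = clock() + timeout
    while True:
        if attempt():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        sleep(min(interval, remaining))


class FakeClock:
    """Simulated clock so the comparison runs instantly."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


def simulate(strategy, use_clock):
    clock = FakeClock()

    def slow_failing_attempt():
        clock.advance(4.0)  # each attempt blocks for 4 simulated seconds
        return False

    kwargs = {"sleep": clock.advance}
    if use_clock:
        kwargs["clock"] = clock
    strategy(slow_failing_attempt, timeout=10, interval=1, **kwargs)
    return clock.now  # total simulated seconds spent waiting


elapsed_by_count = simulate(retry_by_count, use_clock=False)
elapsed_by_deadline = simulate(retry_until_deadline, use_clock=True)
```

With a nominal 10-second timeout and attempts that each take 4 simulated seconds, the counting loop keeps going for 50 simulated seconds, while the deadline loop gives up after 14 (the last attempt starts right at the deadline and is allowed to finish).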
Meta
pyinfra --support
If you are having issues with pyinfra or wish to make feature requests, please
check out the GitHub issues at https://github.com/Fizzadar/pyinfra/issues .
When adding an issue, be sure to include the following:
System: Linux
Platform: Linux-6.5.0-28-generic-x86_64-with-glibc2.38
Release: 6.5.0-28-generic
Machine: x86_64
pyinfra: v3.0b0
Executable: /home/matthijs/.local/bin/pyinfra
Python: 3.11.6 (CPython, GCC 13.2.0)
Installed via pipx.