amulet
sentry.wait is ineffective when clock skew exists between local machine and remote Juju units
Clock skew between the local machine and remote Juju units can cause amulet.sentry.wait to take an extraordinarily long time when the remote machine's clock is ahead of the local one.
Conversely, when the remote machine's clock is behind, sentry.wait may return immediately.
Either way, waiting for IDLE_THRESHOLD is unreliable, because the idle time is calculated as the difference between local machine time and remote machine time.
Example:
⟫ date && juju ssh 2 date
Thu Nov 3 21:41:33 UTC 2016
Warning: Permanently added '10.5.4.106' (ECDSA) to the list of known hosts.
Warning: Permanently added '10.5.4.108' (ECDSA) to the list of known hosts.
Thu Nov 3 21:43:13 UTC 2016
Connection to 10.5.4.108 closed.
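To put a number on the gap between the two timestamps above, here is a minimal sketch that parses the two `date` lines and subtracts them. The function name `apparent_skew_seconds` is hypothetical and not part of Amulet. Note that the result also includes however long `juju ssh` took to connect, so it likely overstates the true skew somewhat.

```python
from datetime import datetime

# Format of the default `date` output shown above, e.g. "Thu Nov 3 21:41:33 UTC 2016".
DATE_FORMAT = "%a %b %d %H:%M:%S %Z %Y"


def apparent_skew_seconds(local_date: str, remote_date: str) -> float:
    """Return remote minus local, in seconds (positive: remote clock is ahead).

    Hypothetical helper for illustration only. The value includes the time
    spent establishing the SSH connection, so it is an upper bound on the skew.
    """
    local = datetime.strptime(local_date, DATE_FORMAT)
    remote = datetime.strptime(remote_date, DATE_FORMAT)
    return (remote - local).total_seconds()


print(apparent_skew_seconds(
    "Thu Nov 3 21:41:33 UTC 2016",
    "Thu Nov 3 21:43:13 UTC 2016",
))  # prints 100.0
```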
In the code where (datetime.now() - since).total_seconds() is compared with IDLE_THRESHOLD (30 seconds), the starting total_seconds is a negative number, -90 seconds in this example (which is the future).
That means that Amulet will sit and spin for a full 2 minutes per Juju unit, since the elapsed time has to climb from -90 seconds to the 30-second threshold. In deployments with one or two dozen units, this wastes a considerable amount of test time and resources.
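The arithmetic can be sketched as follows. This is a simplified model, not Amulet's actual code: it assumes the unit went idle at the moment the wait starts, that `since` was stamped by the remote clock, and that `datetime.now()` is read from the local clock. The function name `seconds_until_idle` and its parameter are hypothetical.

```python
IDLE_THRESHOLD = 30  # seconds, the value cited in this issue


def seconds_until_idle(remote_ahead_by: float,
                       threshold: float = IDLE_THRESHOLD) -> float:
    """Local wall-clock seconds before (now - since) reaches the threshold.

    remote_ahead_by: how far the remote clock is ahead of the local clock,
    in seconds (negative when the remote clock is behind).
    Simplified model: the unit became idle exactly when the wait began.
    """
    # A remote timestamp from the "future" makes the initial elapsed time negative.
    initial_elapsed = -remote_ahead_by
    # The wait ends once elapsed time has climbed to the threshold.
    return max(0.0, threshold - initial_elapsed)


print(seconds_until_idle(90))   # remote 90 s ahead: prints 120.0
print(seconds_until_idle(0))    # clocks in sync: prints 30.0
print(seconds_until_idle(-60))  # remote 60 s behind: prints 0.0
```

Under these assumptions, a remote clock that is 90 seconds ahead turns a 30-second idle check into a 2-minute wait, while a remote clock that is at least 30 seconds behind makes the check pass immediately.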