eliminate use of `systemctl try-restart`
Today we use systemctl try-restart to attempt a service restart after applying settings. Partly this is because we process settings early in the boot, when the affected services haven't been started yet and aren't intended to start.
However, this causes trouble when changing settings at runtime, because if the service isn't running, the command will do nothing.
Services might not be running for a few reasons:
- they failed to start after bad settings were previously applied
- they are starting after new settings are applied, but aren't yet started all the way
In a host container running at boot, @vignesh-goutham discovered the following race:
- host container queries systemd for the status of
kubelet - waits for it to finish activating (
ActiveState=activeandSubState=running) - issues
apiclient setcommands to reconfigurekubelet apiserverexecutes restart commandssystemctl try-restartdoes nothingsystemdlogs the firstStarted Kubeletaround 1 second later
From this we can infer that two calls to apiclient set kubernetes.<blah> in quick succession will not always result in two kubelet restarts, leaving that service in an undefined state.
For changing settings at runtime, we really need something more like force-stop and force-start to ensure that the restart commands are fully enacted for each transaction.
@bcressey is this still needed? If so, any ideas for next steps?
This is still needed, though it's hard to assess the urgency. It's really more of a correctness concern: the promise of the settings API is that the underlying system state will be reconciled when the settings are applied, but systemctl try-restart is not sufficient to achieve this.
Next steps would be auditing existing uses of systemctl try-restart, and coming up with a set of systemctl commands that provides the desired guarantees which can be used to replace those existing uses.
Looks like at least 8 references as of today: https://github.com/bottlerocket-os/bottlerocket/search?q=try-restart
I'll leave this open and put it in the icebox until it can be prioritized and have someone work on it.
As some additional context from a user; this seems to be the root cause of some configuration issues we've been running into.
The short-version is we have a private registry and need to set credentials. We do this today by using the apiclient set command -- but this sometimes doesn't actually apply the config change as configured. From digging into this issue what appears to be happening is the following:
apiclient setis called- apiserver writes the updated config and calls
systemctl try-restart containerd try-restartfails; and config is "in" apiclient but not active on the underlying containerd
This effectively "breaks" the apiclient set API as it isn't "transactional" -- so any call to set may or may not actually apply the config (since in this situation the new value is present in apiclient get even though it isn't applied). This combined with the fact that the API has no mechanism to know whether the setting has taken effect makes it impossible to reliably set a lot of configuration options through apiclient.
As I see it there seem to only be two options here:
- Remove the use of
try-restart-- which would (presumably?) mean that theapiclient setwould remain "transactional" - Provide some insight into underlying service status through apiclient -- this way my daemonset could monitor the restart or failure of restart and re-trigger the
apiclient set
If this is better suited in another issue I'll gladly create a fresh issue for this; but this seems like its the same issue.