bottlerocket eliminate use of `systemctl try-restart`

Today we use systemctl try-restart to attempt a service restart after applying settings. Partly this is because we process settings early in the boot, when the affected services haven't been started yet and aren't intended to start.

However, this causes trouble when changing settings at runtime, because if the service isn't running, the command will do nothing.

Services might not be running for a few reasons:

they failed to start after bad settings were previously applied
they are starting after new settings are applied, but aren't yet started all the way

In a host container running at boot, @vignesh-goutham discovered the following race:

host container queries systemd for the status of kubelet
waits for it to finish activating (ActiveState=active and SubState=running)
issues apiclient set commands to reconfigure kubelet
apiserver executes restart commands
systemctl try-restart does nothing
systemd logs the first Started Kubelet around 1 second later

From this we can infer that two calls to apiclient set kubernetes.<blah> in quick succession will not always result in two kubelet restarts, leaving that service in an undefined state.

For changing settings at runtime, we really need something more like force-stop and force-start to ensure that the restart commands are fully enacted for each transaction.

Aug 14 '21 00:08 bcressey

@bcressey is this still needed? If so, any ideas for next steps?

Mar 06 '23 19:03 stmcginnis

This is still needed, though it's hard to assess the urgency. It's really more of a correctness concern: the promise of the settings API is that the underlying system state will be reconciled when the settings are applied, but systemctl try-restart is not sufficient to achieve this.

Next steps would be auditing existing uses of systemctl try-restart, and coming up with a set of systemctl commands that provides the desired guarantees which can be used to replace those existing uses.

Mar 06 '23 19:03 bcressey

Looks like at least 8 references as of today: https://github.com/bottlerocket-os/bottlerocket/search?q=try-restart

I'll leave this open and put it in the icebox until it can be prioritized and have someone work on it.

Mar 06 '23 19:03 stmcginnis

As some additional context from a user; this seems to be the root cause of some configuration issues we've been running into.

The short-version is we have a private registry and need to set credentials. We do this today by using the apiclient set command -- but this sometimes doesn't actually apply the config change as configured. From digging into this issue what appears to be happening is the following:

apiclient set is called
apiserver writes the updated config and calls systemctl try-restart containerd
try-restart fails; and config is "in" apiclient but not active on the underlying containerd

This effectively "breaks" the apiclient set API as it isn't "transactional" -- so any call to set may or may not actually apply the config (since in this situation the new value is present in apiclient get even though it isn't applied). This combined with the fact that the API has no mechanism to know whether the setting has taken effect makes it impossible to reliably set a lot of configuration options through apiclient.

As I see it there seem to only be two options here:

Remove the use of try-restart -- which would (presumably?) mean that the apiclient set would remain "transactional"
Provide some insight into underlying service status through apiclient -- this way my daemonset could monitor the restart or failure of restart and re-trigger the apiclient set

If this is better suited in another issue I'll gladly create a fresh issue for this; but this seems like its the same issue.

Apr 09 '24 20:04 thomasjackson-clari