
systemd ratelimit being hit after upgrade

ryanm101 opened this issue · 6 comments

Upgraded Flatcar from Flatcar Container Linux by Kinvolk 2512.5.0 (Oklo) 4.19.145-flatcar docker://18.6.3 to Flatcar Container Linux by Kinvolk 2905.2.3 (Oklo) 5.10.61-flatcar docker://19.3.15, since https://github.com/kinvolk/Flatcar/issues/286 solves my encryption issues. The cluster is on Kubernetes v1.17.17.

We have two systemd files that are now failing:

  • /etc/systemd/system/kubelet-prebootstrap-stop.path
[Unit]
Description=Watches kubeconfig creation, once created it stops prebootstrap kubelet instance
ConditionPathExists=!/etc/kubernetes/kubelet.conf

[Path]
PathExists=/etc/kubernetes/kubelet.conf

[Install]
WantedBy=multi-user.target
  • /etc/systemd/system/kubelet-prebootstrap-stop.service
[Unit]
Description=Stop prebootstrap kubelet, real kubelet finished bootstraping and can run static pods now
ConditionPathExists=/etc/kubernetes/kubelet.conf
[Service]
Type=oneshot
ExecStart=/usr/bin/systemctl stop kubelet-prebootstrap.service

On 2512.5.0

systemctl status kubelet-prebootstrap-stop.path
● kubelet-prebootstrap-stop.path - Watches kubeconfig creation, once created it stops prebootstrap kubelet instance
   Loaded: loaded (/etc/systemd/system/kubelet-prebootstrap-stop.path; enabled; vendor preset: enabled)
   Active: active (waiting) since Fri 2021-08-20 16:08:18 UTC; 2 weeks 6 days ago
systemctl status kubelet-prebootstrap-stop.service
● kubelet-prebootstrap-stop.service - Stop prebootstrap kubelet, real kubelet finished bootstraping and can run static pods now
   Loaded: loaded (/etc/systemd/system/kubelet-prebootstrap-stop.service; static; vendor preset: disabled)
   Active: inactive (dead) since Fri 2021-08-20 16:08:55 UTC; 2 weeks 6 days ago
  Process: 7133 ExecStart=/usr/bin/systemctl stop kubelet-prebootstrap.service (code=exited, status=0/SUCCESS)
 Main PID: 7133 (code=exited, status=0/SUCCESS)

On 2905.2.3

systemctl status kubelet-prebootstrap-stop.path
● kubelet-prebootstrap-stop.path - Watches kubeconfig creation, once created it stops prebootstrap kubelet instance
     Loaded: loaded (/etc/systemd/system/kubelet-prebootstrap-stop.path; enabled; vendor preset: enabled)
     Active: failed (Result: unit-start-limit-hit) since Fri 2021-09-10 12:39:16 UTC; 6min ago
   Triggers: ● kubelet-prebootstrap-stop.service

Sep 10 12:35:51 XXXXX systemd[1]: Started Watches kubeconfig creation, once created it stops prebootstrap kubelet instance.
Sep 10 12:39:16 XXXXX systemd[1]: kubelet-prebootstrap-stop.path: Failed with result 'unit-start-limit-hit'.
systemctl status kubelet-prebootstrap-stop.service
● kubelet-prebootstrap-stop.service - Stop prebootstrap kubelet, real kubelet finished bootstraping and can run static pods now
     Loaded: loaded (/etc/systemd/system/kubelet-prebootstrap-stop.service; static)
     Active: failed (Result: start-limit-hit) since Fri 2021-09-10 12:39:16 UTC; 7min ago
TriggeredBy: ● kubelet-prebootstrap-stop.path
    Process: 71121 ExecStart=/usr/bin/systemctl stop kubelet-prebootstrap.service (code=exited, status=0/SUCCESS)
   Main PID: 71121 (code=exited, status=0/SUCCESS)

Sep 10 12:39:16 XXXXX systemd[1]: Starting Stop prebootstrap kubelet, real kubelet finished bootstraping and can run static pods now...
Sep 10 12:39:16 XXXXX systemd[1]: kubelet-prebootstrap-stop.service: Succeeded.
Sep 10 12:39:16 XXXXX systemd[1]: Finished Stop prebootstrap kubelet, real kubelet finished bootstraping and can run static pods now.
Sep 10 12:39:16 XXXXX systemd[1]: kubelet-prebootstrap-stop.service: Start request repeated too quickly.
Sep 10 12:39:16 XXXXX systemd[1]: kubelet-prebootstrap-stop.service: Failed with result 'start-limit-hit'.
Sep 10 12:39:16 XXXXX systemd[1]: Failed to start Stop prebootstrap kubelet, real kubelet finished bootstraping and can run static pods now.

In both cases, systemctl status kubelet-prebootstrap.service shows:

● kubelet-prebootstrap.service - Temp instance of kubelet, just so that starts static pods
     Loaded: loaded (/etc/systemd/system/kubelet-prebootstrap.service; enabled; vendor preset: enabled)
     Active: inactive (dead) since Fri 2021-09-10 12:36:17 UTC; 13min ago
    Process: 3688 ExecStart=/opt/bin/kubelet --healthz-port=0 --pod-infra-container-image=eu.gcr.io/k8s-artifacts-prod/pause:3.1 --port=8888 --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manife>
   Main PID: 3688 (code=exited, status=0/SUCCESS)

I've tried tweaking both StartLimitBurst and StartLimitIntervalSec, with no difference.
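For what it's worth, on recent systemd these two settings are [Unit]-section options (they moved out of [Service] around v229, with the old names kept only as compat aliases), which is one common reason tweaking them appears to do nothing. An untested drop-in that disables the rate limit entirely would look like this (the drop-in path is illustrative):

```ini
# /etc/systemd/system/kubelet-prebootstrap-stop.service.d/10-limits.conf
# StartLimitIntervalSec=0 switches start rate limiting off for this unit.
# Note this setting belongs in [Unit], not [Service], on recent systemd.
[Unit]
StartLimitIntervalSec=0
```

Even if this silences the error, though, a path unit that keeps re-triggering its service would then just loop silently, so it only papers over the underlying re-triggering.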

PS: I'm no systemd expert; I'm more of an init person and only starting to look at systemd in any real anger, so it is possible (likely, even) that I'm just misunderstanding something or plain wrong on some level.

EDIT: Just to add a datapoint: this has happened on all my 'worker' nodes but not on my 'master' nodes.

ryanm101 · Sep 10 '21

@ryanm101 Commenting per your init remark as I have felt your pain. 😂

Don't have a solution, but maybe a thought or two: systemd has targets to avoid race conditions, so that may be one thing for you.

In addition, there are different service types. Type=oneshot, for example, would negate having to run a systemctl stop foo; I also tried doing that, but it never worked reliably for me.

Last but not least, something we have to use frequently are overrides (drop-ins). Maybe you can fix another unit to create the directory in an ExecStartPre=.

Anyhow, can you explain what prompts you to create the directory in the first place? That seems like a small hack, and maybe there is a different way to achieve it?

till · Sep 12 '21

Honestly, I'm not 100% sure why it was done (the previous team wrote this piece of code before leaving), so I've inherited it.

From the descriptions it looks like (again, I'm not hugely familiar with systemd) the first unit does nothing but set up some sort of watcher for that file to appear, which it will once our bootstrapping is completed. That allows the second unit to trigger, which then stops the kubelet instance used for bootstrapping the static pods so that our real kubelet can start properly.

All I can see is this change of behaviour when we jumped Flatcar versions. I've already had to sort out a systemd DNS issue due to our massive jump, so I'm hoping this is a similar change. It's only made more frustrating by my lack of systemd knowledge.

ryanm101 · Sep 13 '21

@ryanm101 Hmm, learned something new and it's just Monday: It seems like your .service should only run once .path reaches the desired state. There's already some sort of inotify thing running. But are you sure you need to stop the .path unit at all? It doesn't seem like that is necessary.

The other thing I would look into is whether the .path unit needs to wait for something else. Maybe the device is not ready for the directory to be created, or maybe the directory is already there and it never runs? If journald doesn't provide more info, maybe add more verbose output via: https://kinvolk.io/docs/flatcar-container-linux/latest/setup/debug/reading-the-system-log/#debugging-journald

till · Sep 13 '21

@till So I never actually stop the .path; it requires ConditionPathExists=!/etc/kubernetes/kubelet.conf to be met to run. On the old nodes it shows something like 'condition not met' in its event log.

There's already some sort of inotify thing running. But are you sure you need to stop the .path unit at all? It doesn't seem like that is necessary.

I think it is needed, else it might never run? I'm not sure how often the .service 'tries' compared to the .path. In terms of logic I'm with you, though: I don't think it should be needed, given it has the same condition as the stop service.

It's not a directory it's looking for; it's a file called kubelet.conf.

We don't update Flatcar or Kubernetes by 'upgrading'; we rebuild our servers from scratch in a rolling rebuild, so when we 'upgrade' a server this happens on first boot. After a reboot it works as expected (because the file exists, so the condition is not met).

I'll take a look at that extra debug stuff, thanks
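One pattern that may be worth testing here (a sketch, under the assumption that the loop comes from re-triggering): a Type=oneshot service drops back to "inactive (dead)" the moment it finishes, and because /etc/kubernetes/kubelet.conf still exists, the PathExists= watch can trigger it again as soon as it deactivates, which matches the 'start request repeated too quickly' loop in the logs above. RemainAfterExit=yes keeps the oneshot in "active (exited)" after success, so the path unit sees its triggered service as still running:

```ini
# /etc/systemd/system/kubelet-prebootstrap-stop.service.d/10-remain.conf
# Hypothetical drop-in: keep the oneshot "active (exited)" after success so a
# still-satisfied PathExists= watch does not re-trigger it in a tight loop.
[Service]
RemainAfterExit=yes
```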

ryanm101 · Sep 13 '21

Looks like you relied on some undefined behaviour that is no longer supported? I would recommend rewriting the units. E.g., either use a target as suggested, or turn this into a single service that acts as a dependency for running the static pods, modelled so that you use systemd-notify to mark the service ready when the prebootstrap is done. A simple shell script should be able to spawn kubelet … in the background, kill it again when the file is created, and issue a systemd-notify.
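A rough sketch of that wrapper idea (all names and paths here are illustrative, not from the thread; the unit running it would use Type=notify with NotifyAccess=all so that systemd-notify from the script counts):

```shell
#!/usr/bin/env bash
# Sketch of the wrapper suggested above: spawn the prebootstrap kubelet in
# the background, block until the kubeconfig appears, then stop kubelet and
# signal readiness to systemd.

# Poll until a file exists. inotify tooling is not guaranteed to be present
# on Flatcar, so plain polling keeps the sketch dependency-free.
wait_for_file() {
  local path=$1
  while [ ! -e "$path" ]; do
    sleep 0.2
  done
}

# run_prebootstrap CMD KUBECONFIG NOTIFY
#   CMD        - kubelet command line to run temporarily (word-split on purpose)
#   KUBECONFIG - file whose creation means bootstrapping is finished
#   NOTIFY     - readiness command, e.g. "systemd-notify --ready"
run_prebootstrap() {
  local cmd=$1 kubeconfig=$2 notify=$3
  $cmd &                        # temporary kubelet in the background
  local pid=$!
  wait_for_file "$kubeconfig"   # bootstrap has dropped the kubeconfig
  kill "$pid" 2>/dev/null || true
  wait "$pid" 2>/dev/null || true
  $notify                       # tell systemd this unit is now "ready"
}

# In the unit, ExecStart= would end with something like:
#   run_prebootstrap "/opt/bin/kubelet <flags>" /etc/kubernetes/kubelet.conf "systemd-notify --ready"
```

With this, the .path unit and the -stop.service could both go away: a single Type=notify service replaces the watch-then-stop pair.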

pothos · Sep 13 '21

- name: kubelet-prebootstrap.service
  enable: true
  contents: |
    [Unit]
    Description=Temp instance of kubelet, just so that starts static pods
    Wants=network-online.target prefetch-docker-images.service docker.service unpack-kubeadm-assets.service
    After=network-online.target prefetch-docker-images.service docker.service unpack-kubeadm-assets.service
    # If kubelet was already bootstrapped, then it starts normally and doesn't need this
    # apiserver-proxy - it starts a permanent one as a static pod
    ConditionPathExists=!/etc/kubernetes/kubelet.conf
    [Service]
    Type=notify
    ExecStart={{bin_dir}}/kubelet --healthz-port=0 --pod-infra-container-image={{pod_infra_image}} --port=8888 --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manifests --cgroup-driver={{cgroup_driver}}
    [Install]
    WantedBy=multi-user.target

is the service that does the bootstrapping

I wonder: if I added After= to the .path and the -stop.service, would that help force an order and timing?
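If that experiment is worth a try, an ordering drop-in for the watcher might look like this (untested sketch; note that After= only orders unit start-up, it does not by itself delay the inotify watch relative to when the file is created):

```ini
# /etc/systemd/system/kubelet-prebootstrap-stop.path.d/10-order.conf
# Hypothetical drop-in: start the watcher only after the prebootstrap
# kubelet unit has been started.
[Unit]
After=kubelet-prebootstrap.service
```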

ryanm101 · Sep 13 '21

@ryanm101 Were you able to test that out?

krishjainx · Jun 02 '23

Closing this issue. Feel free to re-open if required.

sayanchowdhury · Sep 08 '23