aws-node-termination-handler icon indicating copy to clipboard operation
aws-node-termination-handler copied to clipboard

`securityContext` incorrect when deployed to a Windows node

Open benpettman opened this issue 3 years ago • 6 comments

Describe the bug We are deploying the termination handler on windows spot using the example included in the repo

Steps to reproduce Build an EKS cluster, create a windows nodegroup, deploy the termination handler using helm

Expected outcome I expected the daemon set to deploy and run as the linux one does

Actual outcome Pod errors with the following:

MountVolume.SetUp failed for volume "kube-api-access-nbmft" : chown c:\var\lib\kubelet\pods\1780462d-caeb-4b6d-8425-4d8bc0585ee7\volumes\kubernetes.io~projected\kube-api-access-nbmft..2022_02_24_14_33_26.442943168\token: not supported by windows

benpettman avatar Feb 24 '22 14:02 benpettman

https://github.com/kubernetes/kubernetes/issues/102849

Further investigations lead to that the template for windows has the runAsUser set on, removing this causes the pod to land and run perfectly

benpettman avatar Feb 28 '22 11:02 benpettman

Glad you were able to resolve your issue! Out of curiosity, which version of NTH are you running? Are you using IMDS mode?

snay2 avatar Mar 01 '22 21:03 snay2

Version is 1.15.0, but tried it with others. And yes IMDS mode

benpettman avatar Mar 01 '22 21:03 benpettman

I have just run into this issue on the latest version of the chart where deploying for both linux and windows. The chart should be better at handling this situation and not simply copy the securityContext directly from values for both nodes. They need to be different depending on the target Daemonset. Please consider reopening this issue for a proper fix.

pmcenery-bl avatar Jul 08 '22 14:07 pmcenery-bl

@pmcenery-bl Good idea. Can you share the configuration you used for Windows and make a recommendation about how the chart should be different from the linux version?

snay2 avatar Jul 20 '22 22:07 snay2

@pmcenery-bl Good idea. Can you share the configuration you used for Windows and make a recommendation about how the chart should be different from the linux version?

@snay2 Thanks for re-opening. For a mixed Linux and Windows cluster I have deployed by using the chart as follows:

Create a manifest as follows:

helm fetch \
--repo https://aws.github.io/eks-charts \
--version 0.18.5 \
--untar --untardir charts \
aws-node-termination-handler

helm template aws-node-termination-handler charts/aws-node-termination-handler \
--namespace aws-node-handler \
--values charts/aws-node-termination-handler/values.yaml \
--create-namespace \
--set targetNodeOs="linux windows" \
> aws-node-termination-handler.yaml

rm -rf charts

Modify with kustomize as follows:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

[...]

patches:
- target:
    kind: DaemonSet
    name: aws-node-termination-handler-win
  patch: |-
    - op: remove
      path: /spec/template/spec/containers/0/securityContext/runAsUser

Essentially, Windows doesn't understand the runAsUser, and looking at the template for this here:

https://github.com/aws/aws-node-termination-handler/blob/main/config/helm/aws-node-termination-handler/templates/daemonset.windows.yaml#L55-L58

The values.securityContext map looks to be copied verbatim to securityContext in the daemonset. This appears to be identical for both Windows and Linux. But would need to be slightly different for Windows to exclude runAsUser which is not understood.

If I can come up with a solution I'll submit a PR. Maybe someone else has a better idea on how to handle this?

Many thanks, Paul.

pmcenery-bl avatar Jul 26 '22 11:07 pmcenery-bl

This has been released in NTH app version v1.17.0. Helm chart version v0.19.0.

snay2 avatar Aug 18 '22 16:08 snay2