aws-node-termination-handler
aws-node-termination-handler copied to clipboard
`securityContext` incorrect when deployed to a Windows node
Describe the bug We are deploying the termination handler on windows spot using the example included in the repo
Steps to reproduce Build an EKS cluster, create a windows nodegroup, deploy the termination handler using helm
Expected outcome I expected the daemon set to deploy and run as the linux one does
Actual outcome Pod errors with the following:
MountVolume.SetUp failed for volume "kube-api-access-nbmft" : chown c:\var\lib\kubelet\pods\1780462d-caeb-4b6d-8425-4d8bc0585ee7\volumes\kubernetes.io~projected\kube-api-access-nbmft..2022_02_24_14_33_26.442943168\token: not supported by windows
https://github.com/kubernetes/kubernetes/issues/102849
Further investigations lead to that the template for windows has the runAsUser set on, removing this causes the pod to land and run perfectly
Glad you were able to resolve your issue! Out of curiosity, which version of NTH are you running? Are you using IMDS mode?
Version is 1.15.0, but tried it with others. And yes IMDS mode
I have just run into this issue on the latest version of the chart where deploying for both linux and windows. The chart should be better at handling this situation and not simply copy the securityContext directly from values for both nodes. They need to be different depending on the target Daemonset. Please consider reopening this issue for a proper fix.
@pmcenery-bl Good idea. Can you share the configuration you used for Windows and make a recommendation about how the chart should be different from the linux version?
@pmcenery-bl Good idea. Can you share the configuration you used for Windows and make a recommendation about how the chart should be different from the linux version?
@snay2 Thanks for re-opening. For a mixed Linux and Windows cluster I have deployed by using the chart as follows:
Create a manifest as follows:
helm fetch \
--repo https://aws.github.io/eks-charts \
--version 0.18.5 \
--untar --untardir charts \
aws-node-termination-handler
helm template aws-node-termination-handler charts/aws-node-termination-handler \
--namespace aws-node-handler \
--values charts/aws-node-termination-handler/values.yaml \
--create-namespace \
--set targetNodeOs="linux windows" \
> aws-node-termination-handler.yaml
rm -rf charts
Modify with kustomize as follows:
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
[...]
patches:
- target:
kind: DaemonSet
name: aws-node-termination-handler-win
patch: |-
- op: remove
path: /spec/template/spec/containers/0/securityContext/runAsUser
Essentially, Windows doesn't understand the runAsUser, and looking at the template for this here:
https://github.com/aws/aws-node-termination-handler/blob/main/config/helm/aws-node-termination-handler/templates/daemonset.windows.yaml#L55-L58
The values.securityContext map looks to be copied verbatim to securityContext in the daemonset. This appears to be identical for both Windows and Linux. But would need to be slightly different for Windows to exclude runAsUser which is not understood.
If I can come up with a solution I'll submit a PR. Maybe someone else has a better idea on how to handle this?
Many thanks, Paul.