terraform-aws-eks

Bottlerocket - SelfManaged NodeGroup - extra parameter issue

Open adrianmiron opened this issue 1 year ago • 1 comment

Hi,

I am facing a weird issue when trying to add the following parameters to Bottlerocket via bootstrap_extra_args.

This works fine:

      bootstrap_extra_args = <<-EOT
        [settings.host-containers.admin]
        enabled = true

        [settings.host-containers.control]
        enabled = true

        [settings.kubernetes.node-labels]
        "nodegroup" = "stable"
        "eks-cluster-name" = "tf-eks-devops-ontario"

        [settings.kubernetes.kube-reserved]
        cpu = "100m"
        memory = "300Mi"
        ephemeral-storage = "1Gi"

        [settings.kubernetes.system-reserved]
        cpu = "100m"
        ephemeral-storage = "1Gi"
        memory = "100Mi"

        [settings.kubernetes.eviction-hard]
        "memory.available" = "200Mi"
        "nodefs.available" = "5%"

      EOT

Adding the last argument block, as seen in the feature PR ( https://github.com/bottlerocket-os/bottlerocket/pull/2930 ), causes kubelet to fail: the node does not join the cluster and SSM does not start, so I can't get onto the node to debug the issue.

     bootstrap_extra_args = <<-EOT
        [settings.host-containers.admin]
        enabled = true

        [settings.host-containers.control]
        enabled = true

        [settings.kubernetes.node-labels]
        "nodegroup" = "stable"
        "eks-cluster-name" = "tf-eks-devops-ontario"

        [settings.kubernetes.kube-reserved]
        cpu = "100m"
        memory = "300Mi"
        ephemeral-storage = "1Gi"

        [settings.kubernetes.system-reserved]
        cpu = "100m"
        ephemeral-storage = "1Gi"
        memory = "100Mi"

        [settings.kubernetes.eviction-hard]
        "memory.available" = "200Mi"
        "nodefs.available" = "5%"

        [settings.kubernetes]
        "shutdown-grace-period" = "60s"

      EOT

If I start the node without that argument and run apiclient set settings.kubernetes.shutdown-grace-period=60s, it accepts the command.

This only affects self-managed node groups; Karpenter-managed nodes with this setting work fine.

Anyone else seen this? I have no clue what crazy magic is causing this...

adrianmiron avatar Jul 12 '24 13:07 adrianmiron

OK, I got it. It was because of the LaunchTemplate, which already adds some default fields under [settings.kubernetes]:

[settings.kubernetes]
"cluster-name" = "****"
"api-server" = "*********eks.amazonaws.com"
"cluster-certificate" = "*******"
"cluster-dns-ip" = ["10.100.0.10"]

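So adding another [settings.kubernetes] block lower in the template caused it to fail, since TOML does not allow the same table to be defined twice. Putting the keys at the very top of bootstrap_extra_args, so that they land in the [settings.kubernetes] block the template has already opened, worked though: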

     bootstrap_extra_args = <<-EOT
        "shutdown-grace-period" = "60s"
        "shutdown-grace-period-for-critical-pods" = "30s"

        [settings.host-containers.admin]
        enabled = true

        [settings.host-containers.control]
        enabled = true

        [settings.kubernetes.node-labels]
        "nodegroup" = "stable"
        "eks-cluster-name" = "tf-eks-devops-ontario"

        [settings.kubernetes.kube-reserved]
        cpu = "100m"
        memory = "300Mi"
        ephemeral-storage = "1Gi"

        [settings.kubernetes.system-reserved]
        cpu = "100m"
        ephemeral-storage = "1Gi"
        memory = "100Mi"

        [settings.kubernetes.eviction-hard]
        "memory.available" = "200Mi"
        "nodefs.available" = "5%"

      EOT

adrianmiron avatar Jul 12 '24 14:07 adrianmiron

This issue has been automatically marked as stale because it has been open 30 days with no activity. Remove stale label or comment or this issue will be closed in 10 days

github-actions[bot] avatar Aug 12 '24 00:08 github-actions[bot]

This issue was automatically closed because of stale in 10 days

github-actions[bot] avatar Aug 23 '24 00:08 github-actions[bot]

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions[bot] avatar Sep 22 '24 02:09 github-actions[bot]