
AWS LBC blocks hostPort usage of later-installed Pods


Describe the bug The AWS Load Balancer Controller prevents other deployments from using hostPort. Pods with hostNetwork: true and a webhook hostPort set to 10261, 10271, or any port in between cannot be used. If I delete the AWS LBC, the other Pods become ready.

Steps to reproduce

Install the AWS LBC Helm chart with the following values:

serviceAccount:
  create: true
  name: aws-lbc
  annotations:
    "eks.amazonaws.com/role-arn" : "arn:aws:iam::12345:role/aws-lbs-irsa"
securityContext:
  capabilities:
    drop:
    - ALL
  readOnlyRootFilesystem: true
  runAsNonRoot: true
  allowPrivilegeEscalation: false
resources:
  limits:
    cpu: 100m
    memory: 128Mi
  requests:
    cpu: 100m
    memory: 128Mi
clusterName: test
region: eu-central-1
vpcId: vpc-123456778
hostNetwork: true
dnsPolicy: ClusterFirstWithHostNet
serviceMonitor:
  enabled: true
  namespace: monitoring
enableServiceMutatorWebhook: false
ingressClassConfig:
  default: true
disableRestrictedSecurityGroupRules: true
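One detail worth noting here: with hostNetwork: true, the scheduler accounts every containerPort a pod declares as a claim on that node's host port. A hedged sketch, assuming the chart's webhookBindPort value (default 9443) behaves as documented, of moving the controller's own webhook listener out of the way of other host-network pods:

```yaml
# Sketch, not from the issue: relocate the AWS LBC webhook bind port
# so it does not collide with other hostNetwork pods on the same node.
hostNetwork: true
dnsPolicy: ClusterFirstWithHostNet
webhookBindPort: 9443   # pick any port no other hostNetwork pod claims
```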

Then install the ExternalSecrets Operator with the values:

concurrent: 5
serviceAccount:
  name: external-secrets
  create: true
  annotations: 
    "eks.amazonaws.com/role-arn" : "arn:aws:iam::12345:role/external-secrets-irsa"
resources:
  requests:
    cpu: 128m
    memory: 32Mi
  limits:
    cpu: 512m
    memory: 128Mi
serviceMonitor:
  enabled: true
webhook:
  hostNetwork: true
  port: 10261
  resources:
    requests:
      cpu: 128m
      memory: 32Mi
    limits:
      cpu: 512m
      memory: 128Mi
certController:
  resources:
    requests:
      cpu: 128m
      memory: 32Mi
    limits:
      cpu: 512m
      memory: 128Mi
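If the conflict comes down to which port is already taken on the node, one workaround sketch is to move the ESO webhook to a different host port via the same webhook.port value shown above (9449 here is a hypothetical free port, not from the thread):

```yaml
# Sketch: choose a webhook host port that no other hostNetwork pod claims.
webhook:
  hostNetwork: true
  port: 9449   # hypothetical free port; verify it is unused on the nodes
```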

Expected outcome

The ExternalSecrets webhook Pod becomes ready.

Environment

  • EKS 1.27
  • OS: linux (amd64)
  • OS Image: Bottlerocket OS 1.14.3 (aws-k8s-1.27)
  • Kernel version: 5.15.117
  • Container runtime: containerd://1.6.20+bottlerocket
  • Kubelet version: v1.27.3-eks-6f07bbc

  • AWS Load Balancer controller version: Helm chart "1.6.0", controller v2.6

  • Kubernetes version: EKS 1.27

Additional context: When I install ExternalSecrets first and, after installing the AWS LBC, restart the external-secrets webhook pod, everything becomes ready.

david-freistrom avatar Sep 13 '23 09:09 david-freistrom

@david-freistrom, can you please set hostNetwork: false?

oliviassss avatar Sep 13 '23 22:09 oliviassss

I gave it a try on the AWS LBC, since I had disabled the mutator webhook anyway. I actually use Cilium CNI without ENI mode or chaining, and the comments in values.yaml say hostNetwork has to be set to true in that case. So I can't set it to false for External-Secrets without losing reachability from the EKS control plane.

That works so far for my use case, but I don't understand why the AWS LBC blocks a whole range of host ports.

david-freistrom avatar Sep 14 '23 11:09 david-freistrom

Now I get instead:

Failed deploy model due to Internal error occurred: failed calling webhook "mtargetgroupbinding.elbv2.k8s.aws": failed to call webhook: Post "https://aws-load-balancer-webhook-service.kube-system.svc:443/mutate-elbv2-k8s-aws-v1beta1-targetgroupbinding?timeout=10s": Address is not allowed

I was hoping I would not hit that issue, so I need to switch hostNetwork back to true.

And then the external-secrets webhook again reports:

0/2 nodes are available: 2 node(s) didn't have free ports for the requested pod ports. preemption: 0/2 nodes are available: 2 No preemption victims found for incoming pod..
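The scheduler message above means some pod on each node already claims the requested port. Not part of the thread, just a diagnostic sketch: walking the pod specs from the standard `kubectl get pods -A -o json` output shows which pod holds which host port, counting every containerPort of a hostNetwork pod as a host-port claim, which matches how the scheduler accounts for them.

```python
from collections import defaultdict

def host_ports_by_node(pod_list: dict) -> dict:
    """Map node name -> {port: pod name} for every host-port claim.

    A pod claims a host port either via an explicit hostPort, or
    implicitly for every containerPort when hostNetwork is true.
    """
    claims = defaultdict(dict)
    for pod in pod_list.get("items", []):
        spec = pod.get("spec", {})
        node = spec.get("nodeName", "<unscheduled>")
        name = pod["metadata"]["name"]
        host_net = spec.get("hostNetwork", False)
        for container in spec.get("containers", []):
            for port in container.get("ports", []):
                if "hostPort" in port:
                    claims[node][port["hostPort"]] = name
                elif host_net and "containerPort" in port:
                    claims[node][port["containerPort"]] = name
    return dict(claims)
```

Feed it the parsed JSON from `kubectl get pods -A -o json` and print the result per node to spot the pod occupying the contested port.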

david-freistrom avatar Sep 14 '23 14:09 david-freistrom

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jan 28 '24 09:01 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Feb 27 '24 10:02 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Mar 28 '24 11:03 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

[quote of the k8s-triage-robot comment above]

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Mar 28 '24 11:03 k8s-ci-robot