aws-load-balancer-controller
aws-load-balancer-controller copied to clipboard
Helm update to version 1.7.x resulting in pods Readiness probe failed: HTTP probe failed with statuscode: 404
Helm update to version 1.7.x resulting in Pod: aws-load-balancer-controller Readiness probe failed: HTTP probe failed with statuscode: 404
helm versions 1.4.1 through 1.6.2 working as expected
1.7.x pod logs:
{"level":"info","ts":1708126775.380832,"msg":"version","GitVersion":"v2.4.6","GitCommit":"a92e689dfe464f5b24784f398947e0fef31dc470","BuildDate":"2023-01-12T06:29:16+0000"}
{"level":"info","ts":1708126775.410416,"logger":"controller-runtime.metrics","msg":"metrics server is starting to listen","addr":":8080"}
{"level":"info","ts":1708126775.4145405,"logger":"setup","msg":"adding health check for controller"}
{"level":"info","ts":1708126775.414611,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/mutate-v1-pod"}
{"level":"info","ts":1708126775.4146783,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/mutate-elbv2-k8s-aws-v1beta1-targetgroupbinding"}
{"level":"info","ts":1708126775.414728,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/validate-elbv2-k8s-aws-v1beta1-targetgroupbinding"}
{"level":"info","ts":1708126775.4147651,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/validate-networking-v1-ingress"}
{"level":"info","ts":1708126775.4148157,"logger":"setup","msg":"starting podInfo repo"}
{"level":"info","ts":1708126777.415071,"msg":"starting metrics server","path":"/metrics"}
I0216 23:39:37.415030 1 leaderelection.go:243] attempting to acquire leader lease kube-system/aws-load-balancer-controller-leader...
{"level":"info","ts":1708126777.415177,"logger":"controller-runtime.webhook.webhooks","msg":"starting webhook server"}
{"level":"info","ts":1708126777.4158258,"logger":"controller-runtime.certwatcher","msg":"Updated current TLS certificate"}
{"level":"info","ts":1708126777.415922,"logger":"controller-runtime.webhook","msg":"serving webhook server","host":"","port":9443}
{"level":"info","ts":1708126777.4159951,"logger":"controller-runtime.certwatcher","msg":"Starting certificate watcher"}
+1 started seeing this today
This is known issue. Can you confirm that you are using latest version for controller also. If you are updating chart version, you need also to use v2.7.0 controller version
Helm chart reporting App Version v 2.7.1 Also, in the helm chart 1.7.1 it set to public.ecr.aws/eks/aws-load-balancer-controller:v2.4.6 whatever version is in that container the one is running in chart 1.7.1...
image:
pullPolicy: IfNotPresent
repository: public.ecr.aws/eks/aws-load-balancer-controller
tag: v2.4.6
https://github.com/aws/eks-charts/blob/master/stable/aws-load-balancer-controller/values.yaml. Here is the most recent helm chart, which links to the right controller version. Can you tell us from where did you see the v2.4.6.
https://github.com/aws/eks-charts/blob/master/stable/aws-load-balancer-controller/values.yaml. Here is the most recent helm chart, which links to the right controller version. Can you tell us from where did you see the v2.4.6.
Manual image tagging in the published helm chart resolving the issue.
I am using Lens to update helm chart deployments. For some reason Lens is thinking that Image tag is user supplied value and keeping it unchanged.
additionalLabels: {}
affinity: {}
autoscaling:
enabled: false
maxReplicas: 5
minReplicas: 1
targetCPUUtilizationPercentage: 80
awsApiEndpoints: null
awsApiThrottle: null
awsMaxRetries: null
backendSecurityGroup: null
cluster:
dnsDomain: cluster.local
clusterName: production
clusterSecretsPermissions:
allowAllSecrets: false
configureDefaultAffinity: true
controllerConfig:
featureGates: {}
createIngressClassResource: true
defaultSSLPolicy: null
defaultTags: {}
defaultTargetType: instance
deploymentAnnotations: {}
disableIngressClassAnnotation: null
disableIngressGroupNameAnnotation: null
disableRestrictedSecurityGroupRules: null
dnsPolicy: null
enableBackendSecurityGroup: null
enableCertManager: false
enableEndpointSlices: null
enablePodReadinessGateInject: null
enableServiceMutatorWebhook: true
enableShield: null
enableWaf: null
enableWafv2: null
env: null
externalManagedTags: []
extraVolumeMounts: null
extraVolumes: null
fullnameOverride: ""
hostNetwork: false
image:
pullPolicy: IfNotPresent
repository: public.ecr.aws/eks/aws-load-balancer-controller
tag: v2.4.6
imagePullSecrets: []
ingressClass: alb
ingressClassConfig:
default: false
ingressClassParams:
create: true
name: null
spec: {}
ingressMaxConcurrentReconciles: null
keepTLSSecret: true
livenessProbe:
failureThreshold: 2
httpGet:
path: /healthz
port: 61779
scheme: HTTP
initialDelaySeconds: 30
timeoutSeconds: 10
logLevel: null
metricsBindAddr: ""
nameOverride: ""
nodeSelector: {}
objectSelector:
matchExpressions: null
matchLabels: null
podAnnotations: {}
podDisruptionBudget: {}
podLabels: {}
podSecurityContext:
fsGroup: 65534
priorityClassName: system-cluster-critical
rbac:
create: true
readinessProbe:
failureThreshold: 2
httpGet:
path: /readyz
port: 61779
scheme: HTTP
initialDelaySeconds: 10
successThreshold: 1
timeoutSeconds: 10
region: ca-central-1
replicaCount: 2
resources: {}
revisionHistoryLimit: 10
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
runAsNonRoot: true
serviceAccount:
annotations: {}
automountServiceAccountToken: true
create: false
imagePullSecrets: null
name: aws-load-balancer-controller
serviceAnnotations: {}
serviceMaxConcurrentReconciles: null
serviceMonitor:
additionalLabels: {}
enabled: false
interval: 1m
namespace: null
serviceTargetENISGTags: null
syncPeriod: null
targetgroupbindingMaxConcurrentReconciles: null
targetgroupbindingMaxExponentialBackoffDelay: null
terminationGracePeriodSeconds: 10
tolerations: []
topologySpreadConstraints: {}
updateStrategy: {}
vpcId: ""
watchNamespace: null
webhookBindPort: null
webhookNamespaceSelectors: null
webhookTLS:
caCert: null
cert: null
key: null
For this time, I changed the readiness probe health check back to "healthz", and its worked properly.
@davidkl97 Please check whether it is an issue or if any workaround needs to be done here.
Hey, @ravikyada which image version you are using? please note that readiness probe change requires newer images as suggested here due to the the fact that readyz endpoint registered in newer version(image version 2.7.0)
ok thank you @davidkl97 for the help!!!