Support zero-downtime rollout restarts of target deployments.
What happened?
Doing a rollout restart of the verify service results in a small window of downtime.
What did you expect to happen?
Rollout restarts of a targeted application should result in zero failed requests.
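For reference, a common Kubernetes-side mitigation for rollout blips is a preStop delay on the target pods, so the old endpoints keep serving while the proxy reconciles the new endpoint set. A sketch for the verify deployment's pod template (untested here; the container name, timings, and availability of sleep in the image are all assumptions):

spec:
  template:
    spec:
      terminationGracePeriodSeconds: 45
      containers:
        - name: verify
          lifecycle:
            preStop:
              exec:
                # Keep the old pod serving briefly so the ingress
                # controller can observe the endpoint change first.
                command: ["sleep", "15"]

Even with something like this in place, the ext_authz_denied failure shown below suggests the window is on the proxy side rather than the upstream side, so a controller-level fix still seems necessary.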
How'd it happen?
$ kubectl rollout restart -n ingress deploy/verify &&
  while sleep .25; do
    curl -sv https://verify.example.com/healthcheck 2>&1 | grep -E '^< '
  done
deployment.apps/verify restarted
< HTTP/2 302
< date: Mon, 13 Feb 2023 22:55:54 GMT
< content-type: text/html; charset=UTF-8
< content-length: 1411
< location: https://sso.example.com/.pomerium/sign_in?pomerium_expiry=1676329254&pomerium_idp_id=C3HESNcvnqS4eqcjUna9rVax8spGEWe2sC8F65GTt2ip&pomerium_issued=1676328954&pomerium_redirect_uri=https%3A%2F%2Fverify.example.com%2Fhealthcheck&pomerium_signature=3-t-KBAxkDVUDmG29m5yUifNAnatGs_qwvB5cmnl58w%3D
< x-pomerium-intercepted-response: true
< x-frame-options: SAMEORIGIN
< x-xss-protection: 1; mode=block
< server: envoy
< x-request-id: 6eda7b91-d88f-46e0-be2c-b265fcbebe88
<
< HTTP/2 404
< date: Mon, 13 Feb 2023 22:55:54 GMT
< content-type: text/html; charset=UTF-8
< content-length: 1414
< x-pomerium-intercepted-response: true
< x-frame-options: SAMEORIGIN
< x-xss-protection: 1; mode=block
< server: envoy
< x-request-id: b53bb012-9725-4174-b7a1-b566934fd3a4
<
< HTTP/2 302
< date: Mon, 13 Feb 2023 22:55:55 GMT
< content-type: text/html; charset=UTF-8
< content-length: 1411
< location: https://sso.example.com/.pomerium/sign_in?pomerium_expiry=1676329255&pomerium_idp_id=C3HESNcvnqS4eqcjUna9rVax8spGEWe2sC8F65GTt2ip&pomerium_issued=1676328955&pomerium_redirect_uri=https%3A%2F%2Fverify.example.com%2Fhealthcheck&pomerium_signature=NsZdyX7CDZJjKGpro7tebeQoGmrI2r53jZzSn_av2Dc%3D
< x-pomerium-intercepted-response: true
< x-frame-options: SAMEORIGIN
< x-xss-protection: 1; mode=block
< server: envoy
< x-request-id: 17ab805d-9fac-4eb5-8e3a-8608dc855d36
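To quantify the blip, a variant of the loop above can count non-2xx/3xx responses over a fixed window (a sketch; the 60-second window is an assumption, and the 302 redirect to SSO counts as healthy here):

# Restart, then probe for 60 seconds and count failed responses.
kubectl rollout restart -n ingress deploy/verify
end=$((SECONDS + 60))
fails=0
while [ "$SECONDS" -lt "$end" ]; do
  code=$(curl -s -o /dev/null -w '%{http_code}' https://verify.example.com/healthcheck)
  case $code in
    2*|3*) ;;                                    # healthy (incl. SSO redirect)
    *) fails=$((fails + 1)); echo "failed: HTTP $code" ;;
  esac
  sleep .25
done
echo "failed requests: $fails"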
What's your environment like?
$ kubectl -n ingress get deploy/pomerium -o yaml | yq '.spec.template.spec.containers[0].image'
pomerium/ingress-controller:sha-cdc389c
$ kubectl get nodes -o yaml | yq '.items[-1] | .status.nodeInfo'
architecture: arm64
bootID: 247a7d1c-b579-4f89-b1b6-2b98883e4150
containerRuntimeVersion: docker://20.10.17
kernelVersion: 5.4.226-129.415.amzn2.aarch64
kubeProxyVersion: v1.21.14-eks-fb459a0
kubeletVersion: v1.21.14-eks-fb459a0
machineID: ec2d3bd9b9ea252972a242d7da68e233
operatingSystem: linux
osImage: Amazon Linux 2
systemUUID: ec2d3bd9-b9ea-2529-72a2-42d7da68e233
What's your config.yaml?
apiVersion: ingress.pomerium.io/v1
kind: Pomerium
metadata:
  name: global
spec:
  authenticate:
    url: https://sso.example.com
  certificates:
    - ingress/tls-wildcards
  identityProvider:
    provider: google
    secret: ingress/google-idp-creds
  secrets: ingress/bootstrap
status:
  ingress:
    ingress/verify:
      observedAt: "2023-02-13T22:55:54Z"
      observedGeneration: 2
      reconciled: true
    sandbox/example-ingress:
      observedAt: "2023-02-13T22:28:14Z"
      observedGeneration: 6
      reconciled: true
  settingsStatus:
    observedAt: "2023-02-10T15:33:57Z"
    observedGeneration: 5
    reconciled: true
    warnings:
      - 'storage: please specify a persistent storage backend, please see https://www.pomerium.com/docs/topics/data-storage#persistence'
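As an aside on the storage warning above: if persistence is desired, the Pomerium CRD accepts a storage stanza along these lines (the secret name here is hypothetical; the referenced secret carries the PostgreSQL connection string):

spec:
  storage:
    postgres:
      # Secret containing a "connection" key with the PostgreSQL DSN.
      secret: ingress/postgres-connection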
What did you see in the logs?
{
  "level": "info",
  "service": "envoy",
  "upstream-cluster": "",
  "method": "GET",
  "authority": "verify.example.com",
  "path": "/healthcheck",
  "user-agent": "curl/7.81.0",
  "referer": "",
  "forwarded-for": "71.254.0.45,10.123.60.7",
  "request-id": "fb2011be-f4d0-47bd-9094-8ad12f583009",
  "duration": 14.398539,
  "size": 1414,
  "response-code": 404,
  "response-code-details": "ext_authz_denied",
  "time": "2023-02-14T01:34:20Z",
  "message": "http-request"
}
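The response-code-details value of ext_authz_denied means Envoy's ext_authz filter, i.e. Pomerium's authorization check, denied the request, which matches the intercepted 404 seen above. While reproducing, those entries can be isolated with a filter along these lines (assumes the deployment's log stream is one JSON object per line):

# Stream Pomerium's logs and keep only requests denied by ext_authz.
kubectl logs -n ingress deploy/pomerium -f \
  | jq -c 'select(."response-code-details" == "ext_authz_denied")'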