
Termination of TLS in Load Balancer with Helm chart config

Open brsolomon-deloitte opened this issue 4 years ago • 15 comments

https://kubernetes.github.io/ingress-nginx/deploy/#aws shows the direct application of a YAML manifest using kubectl apply -f. It is confusing / unspecified whether applying this file will function as a patch on existing resources that have been deployed with the Helm chart.

What is the suggested workflow for configuring termination of TLS in the Load Balancer, when the ingress-nginx Helm chart is used? What does an MCVE config set of values for Helm look like?

There are several scattered threads such as https://github.com/kubernetes/ingress-nginx/issues/1624#issuecomment-342680491 that allude to this, but https://github.com/kubernetes/ingress-nginx/pull/5374 and https://github.com/kubernetes/ingress-nginx/pull/5360 contain very little prescriptive guidance on the proper configuration with the ingress-nginx Helm chart.

Looking at https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.1.0/deploy/static/provider/aws/deploy-tls-termination.yaml, it looks like that has been generated with something like helm template in the first place, so it would be useful to see what inputs resulted in that output.

brsolomon-deloitte avatar Dec 06 '21 23:12 brsolomon-deloitte

There is also no mention of the change to a port named tohttps on 2443, which seems like it must be specified in .Values.controller.containerPort.
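A minimal sketch of what that override would look like (an assumption at this point; the generator script quoted later in this thread uses these same three ports):

controller:
  containerPort:
    http: 80
    https: 80
    tohttps: 2443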

brsolomon-deloitte avatar Dec 06 '21 23:12 brsolomon-deloitte

Edit: Just a little scary to see it is actually done through a shell script

brsolomon-deloitte avatar Dec 07 '21 00:12 brsolomon-deloitte

Recommend https://kubernetes.github.io/ingress-nginx/deploy/#tls-termination-in-aws-load-balancer-nlb

/remove-kind bug
/kind documentation
/area docs

longwuyuan avatar Dec 07 '21 03:12 longwuyuan

@longwuyuan your reply includes the first link used in my original question, which is the very part of the documentation that this issue is about in the first place.

brsolomon-deloitte avatar Dec 07 '21 14:12 brsolomon-deloitte

Need some clarification here.

These words

It is confusing / unspecified whether applying this file will function as a patch on existing resources that have been deployed with the Helm chart.

seem to imply that the yaml file might act as a patch of sorts, applied on top of and after a Helm-based installation of the ingress-nginx controller?

If I am reading it wrong, I apologise. But otherwise, just to state the obvious: installing using helm and installing using the yaml file are independent, alternative ways to install the ingress-nginx controller, and there is no patching of resources.

You can run helm template against the chart to get a view of the manifests for the resources involved in running the ingress-controller, or peruse the yaml file I linked to get the same view.
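A sketch of that inspection (repo alias taken from later comments in this thread):

$ helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
$ helm template ingress-nginx ingress-nginx/ingress-nginx -n ingress-nginx > rendered.yaml
$ less rendered.yaml   # compare by eye, or diff against deploy-tls-termination.yaml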

Hope there is a solution to this issue soon.

longwuyuan avatar Dec 07 '21 14:12 longwuyuan

seem to imply that the yaml file might act as a patch of sorts, applied on top of and after a Helm-based installation of the ingress-nginx controller?

Correct. So I now understand that it is not intended to function as an idempotent patch. Thank you for clarifying.

Considering that information, it seems like the suggested approach is not idiomatic and falls victim to several anti-patterns:

  • Using the already-templated deploy-tls-termination.yaml is incompatible with an existing Helm-based deployment
  • The docs suggest / imply that I should manually find/replace a couple of specific parameters in the template, such as "XX.XX.XX.XX/XX". This is not the programmatic way that I would hope to use Kubernetes/Helm
  • The docs leave out mention entirely of what Helm chart parameters would be required to achieve this same effect - namely, those found halfway down a shell script (https://github.com/kubernetes/ingress-nginx/blob/987a721723d9a7849aa25a40e48bd6cad5ac2dc7/hack/generate-deploy-scripts.sh) that itself is not mentioned at all in the docs
  • Last but not least, that script contains the following comment: "legacy in-tree service load balancer controller for AWS NLB that has been phased out from Kubernetes mainline". It's unclear what this means. What is an "in-tree service"? Is this approach considered deprecated and should I prefer the AWS Load Balancer Controller instead?

brsolomon-deloitte avatar Dec 07 '21 15:12 brsolomon-deloitte

The recommendation begins and ends with that yaml. That's the curated part and it works.

Users with insight opt to not use that curated yaml. I don't see ambiguity yet but please feel free to submit a PR.

Thanks, Long

longwuyuan avatar Dec 07 '21 15:12 longwuyuan

I'm also confused by the structure of the manual. I'm also trying to set up TLS termination in AWS load balancer, but currently, the documentation seems to read as if the only option is downloading and editing a YAML file.

gijsdpg avatar Dec 14 '21 06:12 gijsdpg

Ok, I think I figured it out. I can replicate the deploy-tls-termination.yaml configuration with helm with the following procedure:

$ helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
$ helm repo update
$ helm template -n ingress-nginx --create-namespace ingress-nginx -f values.yaml ingress-nginx/ingress-nginx > manifest.yaml

where values.yaml is a file containing:

controller:
  service:
    type: "LoadBalancer"
    externalTrafficPolicy: "Local"
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: '60'
      service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: 'true'
      service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:aws:acm:us-west-2:XXXXXXXX:certificate/XXXXXX-XXXXXXX-XXXXXXX-XXXXXXXX
      service.beta.kubernetes.io/aws-load-balancer-ssl-ports: https
      service.beta.kubernetes.io/aws-load-balancer-type: nlb
    ports:
      http: 80
      https: 443
    targetPorts:
      http: tohttps
      https: http
  config:
    http-snippet: |
      server {
        listen 2443;
        return 308 https://$host$request_uri;
      }
    proxy-real-ip-cidr: XXX.XXX.XXX/XX
    use-forwarded-headers: 'true'

You can use the values file to set your appropriate values.
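For reference, a sketch of installing straight from the same values file instead of templating (release and namespace names assumed from the commands above):

$ helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx \
    -n ingress-nginx --create-namespace -f values.yaml
$ kubectl -n ingress-nginx get svc ingress-nginx-controller -o yaml   # verify annotations and targetPorts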

gijsdpg avatar Dec 14 '21 07:12 gijsdpg

@gijsdpg if you check out the shell script that is used to generate deploy-tls-termination.yaml, you will see a few very slight differences:

controller:
  service:
    type: LoadBalancer
    externalTrafficPolicy: Local
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
      service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "https"
      service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "arn:aws:acm:us-west-2:XXXXXXXX:certificate/XXXXXX-XXXXXXX-XXXXXXX-XXXXXXXX"
      service.beta.kubernetes.io/aws-load-balancer-type: nlb
      service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "60"
    targetPorts:
      http: tohttps
      https: http
  # Configures the ports the nginx-controller listens on
  containerPort:
    http: 80
    https: 80
    tohttps: 2443
  config:
    proxy-real-ip-cidr: XXX.XXX.XXX/XX
    use-forwarded-headers: "true"
    http-snippet: |
      server {
        listen 2443;
        return 308 https://\$host\$request_uri;
      }

(Note the backslashes are only there because they are used in a Bash heredoc and should go away if you are editing YAML directly.)
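A quick sketch for checking that the tohttps port actually lands in the rendered Deployment (the grep pattern is only illustrative):

$ helm template ingress-nginx ingress-nginx/ingress-nginx -f values.yaml \
    | grep -A2 containerPort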

What threw us off was the need to use an NLB rather than an ALB for this.

brsolomon-deloitte avatar Dec 14 '21 14:12 brsolomon-deloitte

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Apr 10 '22 02:04 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar May 10 '22 03:05 k8s-triage-robot

/remove-lifecycle rotten

mtb-xt avatar May 12 '22 05:05 mtb-xt

Ok, I think I figured it out. I can replicate the deploy-tls-termination.yaml configuration with helm with the following procedure: [...]

THANK YOU! I've been wrestling with this all night (complicated by the fact that we needed ARM64 support, which slightly breaks the manual install methods) and this finally got us there :)

dwelch2344 avatar Jun 10 '22 07:06 dwelch2344

I don't understand why a working version of SSL termination with helm was removed from the README during the migration from the old repo. This is the only thing that I got working:

https://github.com/helm/charts/blob/master/stable/nginx-ingress/README.md#aws-l4-nlb-with-ssl-redirection

controller:
  config:
    ssl-redirect: "false" # we use `special` port to control ssl redirection
    server-snippet: |
      listen 8000;
      if ( $server_port = 80 ) {
         return 308 https://$host$request_uri;
      }
  containerPort:
    http: 80
    https: 443
    special: 8000
  service:
    targetPorts:
      http: http
      https: special
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "tcp"
      service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443"
      service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "your-arn"
      service.beta.kubernetes.io/aws-load-balancer-type: "nlb"

and installed with: helm upgrade --install -n ingress-nginx --create-namespace ingress-nginx -f values.yaml ingress-nginx/ingress-nginx

The current docs with L7 and SSL termination did not work, only the above. With the L7 solution in the current README, I was only led to more 308 redirects. Hope this helps someone else who got stuck.
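A sketch of how the result can be spot-checked once the NLB is provisioned (the hostname below is a placeholder):

$ kubectl -n ingress-nginx get svc ingress-nginx-controller \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
$ curl -sI http://<nlb-hostname>/    # expect a 308 redirect to https
$ curl -sI https://<nlb-hostname>/   # expect the backend response, not another redirect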

alanlekah avatar Aug 24 '22 22:08 alanlekah

  • I think we were/are convinced that the changed deployment docs were good

  • That change happened months ago, so this is the first report in months related to the AWS deployment docs

  • Can you confirm that the second section of the AWS deployment docs https://kubernetes.github.io/ingress-nginx/deploy/#aws , the one about terminating TLS on the NLB instead of on the controller, does not work as documented?

  • If you are specifically reporting that the helm chart installation does not work for termination of TLS on the NLB, then it seems obvious to almost everyone that the volume of customization involved is just too much, considering the number of providers multiplied by the number of unique annotations or configurations possible.

  • Additionally, in gitops circles you will find that a raw yaml manifest adheres to single-source-of-truth, a best practice preferred over maintaining helm versions in addition to git versions

  • There have been several users of helm reporting similar needs for other providers, like Digital-Ocean, as well. So if you want, you can submit a PR to add a section to the deployment docs for helm usage with specific providers. But there is a lack of resources, and such docs (and even code) do not get the maintenance attention that is due to them

longwuyuan avatar Aug 24 '22 23:08 longwuyuan

  • Right, I'm confirming that I tried the installation doc above in AWS (https://kubernetes.github.io/ingress-nginx/deploy/#aws) with no luck. I was stuck in the repetitive 308 redirect cycle that was reported before and closed.

  • I completely understand there's way too much customization based on providers.

  • I wouldn't have used helm if the raw yaml manifest worked as-is, but that old README was the only way I could make this work in my configuration, even after trying every change recommended in https://github.com/kubernetes/ingress-nginx/issues/2724 (as well as the deploy.yml in the article above with no changes other than the cert ARN and VPC CIDR of my EKS cluster)

  • I would happily create a PR for the documentation above if you think that would make sense here

alanlekah avatar Aug 24 '22 23:08 alanlekah

@alanlekah now someone has to test it, which involves an AWS account, and I don't know if the free tier works for this, etc. But what is critical now is:

  • The project can not be publishing a broken AWS install procedure
  • Based on the above, since you have graciously offered to help fix the current doc with a PR, please go ahead with that
  • But the PR must fix the static yaml manifest first; adding an alternative method using helm is left to you as an option
  • Please check /hack/generate-deploy-scripts.sh and, more importantly, the values file at /hack/manifest-templates/provider/aws/nlb-with-tls-termination/values.yaml in the project
  • The script uses the values file to generate provider-specific manifests (a sketch of the equivalent command follows)
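A sketch of roughly what that generation amounts to, assuming the script templates the in-repo chart with that values file (an assumption about the script's internals, not a quote from it):

$ helm template ingress-nginx charts/ingress-nginx \
    --namespace ingress-nginx \
    --values hack/manifest-templates/provider/aws/nlb-with-tls-termination/values.yaml \
    > deploy/static/provider/aws/nlb-with-tls-termination/deploy.yaml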

Once again, thank you very much for reporting this, and thank you very much for helping fix the doc.

I will take your word for it and label the issue, as well as your PR, accordingly. But I still don't understand what precisely is broken in the current NLB termination doc. If you could describe that in a way that any reader can understand, it will help the much-needed stabilization work in progress for the project

/triage accepted
/area stabilization
/area docs
/priority important-soon
/kind bug
/assign @alanlekah

longwuyuan avatar Aug 25 '22 00:08 longwuyuan

@tao12345666333 @rikatz @strongjz this latest comment by @alanlekah says our deployment doc for AWS is broken. Do we just use our own personal accounts to test on AWS?

longwuyuan avatar Aug 25 '22 00:08 longwuyuan

@longwuyuan I'm not suggesting it's the deployment doc that's broken. I may be suggesting that the deploy.yaml at:

https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.3.0/deploy/static/provider/aws/nlb-with-tls-termination/deploy.yaml

may still have issues. I'm not terribly familiar with the difference between the helm chart that generates the working raw yaml and the file at the link above, but here's what I noticed and what worked for me:

  • I didn't use the proxy-real-ip-cidr as suggested; I got rid of it entirely
  • I replaced the following http-snippet:
http-snippet: |
    server {
      listen 2443;
      return 308 https://$host$request_uri;
    }

with this one:

server-snippet: |
  listen 8000;
  if ( $server_port = 80 ) {
     return 308 https://$host$request_uri;
  }

effectively adding an if condition and changing the port to listen on from 2443 to 8000

  • Added in the same data section (as the http-snippet) the disabling of ssl redirects:
ssl-redirect: "false"
  • Helm doesn't set externalTrafficPolicy: Local like the deploy yaml above does; Cluster, the default value, seems to work for me

I generated the working yaml with the following helm template command: helm template -n ingress-nginx --create-namespace ingress-nginx -f values.yaml ingress-nginx/ingress-nginx

The problem:

I created a simple API with a helm deployment listening on port 5000. The helm chart created an ingress with the nginx annotations and a ClusterIP service (also on port 5000 for each). When using the installation guide as-is, I get a 308 permanent redirect error in Postman.

When using the values.yaml and the helm template generated from it, as mentioned in the comment above (https://github.com/kubernetes/ingress-nginx/issues/8017#issuecomment-1226542186), I had no issues accessing the endpoint over HTTPS, and the HTTP redirect was working correctly as well. I have no idea why.

I'm going to test each of those changes against the main yaml in this repo and see which one actually fixes the issue.

alanlekah avatar Aug 25 '22 00:08 alanlekah

I know this may be cliche, but I don't have access or ideas on how AWS works. Can someone jump into our next call to show the problem? Maybe we can discuss what can be done there :)

rikatz avatar Aug 25 '22 01:08 rikatz

I know this may be cliche, but I don't have access or ideas on how AWS works. Can someone jump into our next call to show the problem? Maybe we can discuss what can be done there :)

Frankly, I've tested all the combinations of changes that I've made from this values.yaml applied against the stock deploy.yaml, and I don't know where the issue is in the file from the repo. I just know that what helm generates works.

alanlekah avatar Aug 25 '22 01:08 alanlekah

Attaching the working yml for comparison (against the non-working one):

deploy.yml.zip

alanlekah avatar Aug 25 '22 01:08 alanlekah

With the latest updates from @alanlekah I think we have some idea of the next action item. We need to diff the yaml that works for him against the manifest we publish. Maybe we paid attention to the vanilla aws manifest and messed up the nlb-termination one, as it is in a subdir.

@alanlekah that descriptive update from you was helpful, and as I was typing this, I saw you posted the manifest you used. Thanks. Please just clarify this: you said you did not use proxy-real-ip-cidr, and you also said you replaced that snippet. I want to confirm that you first generated a yaml with helm template and then deleted the proxy-real-ip-cidr annotation, modified the snippet, etc. in that generated file.
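A sketch of that diff (URLs and file names assumed from earlier comments in the thread):

$ curl -sLo published.yaml https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.3.0/deploy/static/provider/aws/nlb-with-tls-termination/deploy.yaml
$ helm template ingress-nginx ingress-nginx/ingress-nginx \
    -n ingress-nginx -f values.yaml > working.yaml
$ diff working.yaml published.yaml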

longwuyuan avatar Aug 25 '22 01:08 longwuyuan

Something seems very off. For example: https://github.com/kubernetes/ingress-nginx/blob/main/deploy/static/provider/aws/nlb-with-tls-termination/deploy.yaml#L473

See this section in controller-v1.3.0/deploy/static/provider/aws/nlb-with-tls-termination/deploy.yaml:

        ports:
        - containerPort: 80
          name: http
          protocol: TCP
        - containerPort: 80
          name: https
          protocol: TCP

Looks like that was fixed in the parent one but not in the TLS termination yaml?
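For comparison, a sketch of what the corresponding ports block presumably looks like in the parent aws/deploy.yaml, where the https containerPort was fixed to 443 (reconstructed for illustration, not quoted):

        ports:
        - containerPort: 80
          name: http
          protocol: TCP
        - containerPort: 443
          name: https
          protocol: TCP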

To create the working yaml, I only did the following (I did not replace or add anything). I created the values.yaml:

controller:
  config:
    ssl-redirect: "false" # we use `special` port to control ssl redirection
    server-snippet: |
      listen 8000;
      if ( $server_port = 80 ) {
         return 308 https://$host$request_uri;
      }
  containerPort:
    http: 80
    https: 443
    special: 8000
  service:
    targetPorts:
      http: http
      https: special
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "tcp"
      service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443"
      service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "<arn>"
      service.beta.kubernetes.io/aws-load-balancer-type: "nlb"

helm upgrade --install -n ingress-nginx --create-namespace ingress-nginx -f values.yaml ingress-nginx/ingress-nginx

That comparison was just a rough diff I did on the files.

alanlekah avatar Aug 25 '22 02:08 alanlekah

I was about to post the diff, so thanks, that is a great catch.

This confirms that our generate-deploy-script needs improvement. I suspect the root cause is the subdir: the script traverses parent dirs but not subdirs.

We will fix it. Thanks again for reporting this, and if you find other differences, please let us know.

@Volatus when you get a chance, can you please ping me on Slack about this? I think we missed traversing the subdir in generate-deploy-script because aws is a unique case in the sense of having a subdir. We can talk about options like making two providers for aws or adding a special if condition for the aws provider.

longwuyuan avatar Aug 25 '22 02:08 longwuyuan

@alanlekah Does port 8443 ring any bells?

I don't recall seeing port 8000 before, but I must have missed things, so just asking. We will check anyway.

longwuyuan avatar Aug 25 '22 02:08 longwuyuan

@longwuyuan Yes it does. Port 8000 is something I changed (from 2443). I'm not sure if it makes any difference though.

See the old docs for nginx: https://github.com/helm/charts/tree/master/stable/nginx-ingress#aws-l4-nlb-with-ssl-redirection

8443 is the webhook one AFAIK. I believe the special: 8000 is the same as the current tohttps: 2443 that's in the repo and serves the same purpose.

alanlekah avatar Aug 25 '22 02:08 alanlekah

Ok, thanks. Once again, good pointer on the old doc. The generate-deploy-script went through 3-4 iterations, so digging through that history would be a rabbit hole. I think we are better off basing our fix on the old doc for sanity's sake.

longwuyuan avatar Aug 25 '22 02:08 longwuyuan

Something interesting is that turning the helm values.yaml file into this also works without issue, so the port's name and number don't seem to make a difference:

controller:
  config:
    ssl-redirect: "false" # we use `special` port to control ssl redirection
    server-snippet: |
      listen 2443;
      if ( $server_port = 80 ) {
         return 308 https://$host$request_uri;
      }
  containerPort:
    http: 80
    https: 443
    tohttps: 2443
  service:
    targetPorts:
      http: http
      https: tohttps
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "tcp"
      service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443"
      service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "<arn>"
      service.beta.kubernetes.io/aws-load-balancer-type: "nlb"

alanlekah avatar Aug 25 '22 02:08 alanlekah