Lua: Fix `limit_except` returning 503.

Open ZJfans opened this issue 1 year ago • 24 comments

What this PR does / why we need it:

Using `limit_except GET { deny all; }` together with `location = / { return 403; }` results in a 503 instead of a 403 when the request method is POST. The issue involves NGINX's ngx_http_core_module.c: when the `limit_except` directive is used, variables set within the location block cannot be accessed. Currently, `balancer.rewrite()` in nginx.conf relies on the `proxy_upstream_name` variable, so it fails to obtain the balancer and returns a 503 error, whereas the request should be answered with a 403. If you need more details, please let me know.

Types of changes

  • [x] Bug fix (non-breaking change which fixes an issue)
  • [ ] New feature (non-breaking change which adds functionality)
  • [ ] CVE Report (Scanner found CVE and adding report)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] Documentation only

Which issue/s this PR fixes

fixes #11742

How Has This Been Tested?

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: foo-ingress
  namespace: default
  annotations:
    nginx.ingress.kubernetes.io/configuration-snippet: limit_except GET { deny all; }
    nginx.ingress.kubernetes.io/server-snippet: |
      location = / {
        return 403;
      }
spec:
  ingressClassName: nginx
  rules:
  - host:
    http:
      paths:
      - path: /foo
        pathType: Prefix
        backend:
          service:
            name: foo-service
            port:
              number: 8080

curl -X POST http://1xx.xxx.xxx.xx:32080/foo

<html>
<head><title>503 Service Temporarily Unavailable</title></head>
<body>
<center><h1>503 Service Temporarily Unavailable</h1></center>
<hr><center>nginx</center>
</body>
</html>

Checklist:

  • [ ] My change requires a change to the documentation.
  • [ ] I have updated the documentation accordingly.
  • [ ] I've read the CONTRIBUTION guide
  • [ ] I have added unit and/or e2e tests to cover my changes.
  • [ ] All new and existing tests passed.

ZJfans avatar Aug 24 '24 18:08 ZJfans

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ZJfans
Once this PR has been reviewed and has the lgtm label, please assign rikatz for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar Aug 24 '24 18:08 k8s-ci-robot

CLA Signed

The committers listed above are authorized under a signed CLA.

  • :white_check_mark: login: ZJfans / name: zhangjie (8104914fc71a701a06aef7ab04d8f1730b9e8d98, 72ff20ff478923b98d1c439dd569ae6745e88095)

Welcome @ZJfans!

It looks like this is your first PR to kubernetes/ingress-nginx 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/ingress-nginx has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. :smiley:

k8s-ci-robot avatar Aug 24 '24 18:08 k8s-ci-robot

Hi @ZJfans. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar Aug 24 '24 18:08 k8s-ci-robot

Deploy Preview for kubernetes-ingress-nginx canceled.

Name Link
Latest commit 8104914fc71a701a06aef7ab04d8f1730b9e8d98
Latest deploy log https://app.netlify.com/sites/kubernetes-ingress-nginx/deploys/6735276dfe580a00080d7fba

netlify[bot] avatar Aug 24 '24 18:08 netlify[bot]

We are currently trying to move away from snippets and/or Lua altogether, so I'm not sure it makes sense to further extend these parts of the codebase.

@strongjz @rikatz @tao12345666333 WDYT?

Gacko avatar Aug 25 '24 06:08 Gacko

Sorry, I didn't realize that Lua is being removed at the moment. If this is not necessary, we can close this request.

ZJfans avatar Aug 25 '24 10:08 ZJfans

Lua is not being removed, but reduced, and we'd like to replace it with NJS (https://nginx.org/en/docs/njs) or something similar in the long run, if it is even required.

Currently I cannot tell whether your PR is a fix or a feature, as I'm not that deep into Lua yet. I can only say that, as you already noticed, we'd like to reduce Lua and, for security reasons, get rid of snippets. That brings me back to the "feature or fix" question, as this issue only occurs in conjunction with snippets, right?

Gacko avatar Aug 25 '24 10:08 Gacko

It's a fix. My analysis is here; it is not written in English, so it will need translation: https://zjfans.github.io/2024/08/25/Problems%20with%20ingress-nginx%20using%20limit_except/

ZJfans avatar Aug 25 '24 12:08 ZJfans

Do I understand correctly, that in case of a forbidden request method NGINX does not enter the location block and therefore does not set the proxy upstream name, but still enters the get_balancer function? Because then we should rather just do the same nil check as a few lines below and also return nil instead of answering the request with 403.

At least this Lua function does not seem to be the right place for generating the final response to me.

Gacko avatar Aug 26 '24 21:08 Gacko

/triage accepted
/kind bug
/priority backlog
/hold

Gacko avatar Aug 26 '24 21:08 Gacko

/cherry-pick release-1.10

Gacko avatar Aug 26 '24 21:08 Gacko

/cherry-pick release-1.11

Gacko avatar Aug 26 '24 21:08 Gacko

@Gacko: once the present PR merges, I will cherry-pick it on top of release-1.10 in a new PR and assign it to you.

In response to this:

/cherry-pick release-1.10

@Gacko: once the present PR merges, I will cherry-pick it on top of release-1.11 in a new PR and assign it to you.

In response to this:

/cherry-pick release-1.11

/ok-to-test

Gacko avatar Aug 26 '24 21:08 Gacko

I would also like an e2e test that verifies the correct behavior in this situation.

rikatz avatar Aug 26 '24 23:08 rikatz

Do I understand correctly, that in case of a forbidden request method NGINX does not enter the location block and therefore does not set the proxy upstream name, but still enters the get_balancer function? Because then we should rather just do the same nil check as a few lines below and also return nil instead of answering the request with 403.

At least this Lua function does not seem to be the right place for generating the final response to me.

Exactly, for the reasons you mentioned. I will add tests, but I'm not very familiar with this, so it will take me some time.

ZJfans avatar Aug 27 '24 02:08 ZJfans

@Gacko I have a question: the e2e HTTP test client does not seem to offer a POST method, but I need to test both GET and POST. Can you give me some advice?

f.HTTPTestClient().
	POST("/").
	WithHeader("Host", host).
	Expect().
	Status(http.StatusForbidden)

run golangci-lint
  Running [/home/runner/golangci-lint-1.56.2-linux-amd64/golangci-lint run] in [/home/runner/work/ingress-nginx/ingress-nginx] ...
  test/e2e/annotations/affinity.go:1: : # k8s.io/ingress-nginx/test/e2e/annotations
  Error: test/e2e/annotations/limitexcept.go:60:7: f.HTTPTestClient().POST undefined (type *httpexpect.HTTPRequest has no field or method POST)
  Error: test/e2e/annotations/limitexcept.go:72:7: f.HTTPTestClient().POST undefined (type *httpexpect.HTTPRequest has no field or method POST) (typecheck)
  /*
  Error: test/e2e/e2e.go:33:4: could not import k8s.io/ingress-nginx/test/e2e/annotations (-: # k8s.io/ingress-nginx/test/e2e/annotations
  Error: test/e2e/annotations/limitexcept.go:60:7: f.HTTPTestClient().POST undefined (type *httpexpect.HTTPRequest has no field or method POST)
  Error: test/e2e/annotations/limitexcept.go:72:7: f.HTTPTestClient().POST undefined (type *httpexpect.HTTPRequest has no field or method POST)) (typecheck)
  	_ "k8s.io/ingress-nginx/test/e2e/annotations"
  	  ^
  
  Error: issues found
  Ran golangci-lint in 86874ms

ZJfans avatar Aug 30 '24 13:08 ZJfans

We had a chat about this PR in our Community Meeting and came to the conclusion that this is not the ideal place to add this kind of error handling.

The function you're adding it to is meant for determining the upstream to use and not for making a final decision about the HTTP status code, especially not a 403.

I cannot tell whether there is a more proper way to handle this. Off the top of my head, I don't know whether this issue is more about the order in which NGINX executes Lua and/or plain directives in nginx.conf, or whether it could be solved by simply not returning a 503 here, or at least not when the backend name is undefined or set to - from the very beginning.

Gacko avatar Aug 31 '24 07:08 Gacko

/cc @strongjz WDYT?

Gacko avatar Aug 31 '24 07:08 Gacko

It's great that you discussed this issue. I also think this fix is not good enough, but I can't think of a better solution. The good news is that the impact of this error is small: the expected 403 simply becomes a 503.

I think the order is:

  1. If a request is forbidden, nginx switches from the location block's configuration to the limit_except block's configuration (which excludes the location's own settings).
  2. Lua tries to read the variable set in the location block (this is where the error occurs).
  3. The limit_except logic is executed.

I think it would be better for nginx to merge the location block configuration into the limit_except configuration instead of simply replacing it. Perhaps nginx assumes that, since the request is forbidden and will return 403 anyway, there is no need to consult the original location block configuration.

My analysis may be wrong. This is an interesting problem, and I look forward to a good solution. More importantly, limit_except should be used with caution in the future so that it does not interfere with other location directives.

ZJfans avatar Aug 31 '24 11:08 ZJfans

This is stale, but we won't close it automatically. Just bear in mind that the maintainers may be busy with other tasks and will reach your issue ASAP. If you have any questions or a request to prioritize this, please reach #ingress-nginx-dev on Kubernetes Slack.

github-actions[bot] avatar Oct 16 '24 02:10 github-actions[bot]

The lifecycle/frozen label can not be applied to PRs.

This bot removes lifecycle/frozen from PRs because:

  • Commenting /lifecycle frozen on a PR has not worked since March 2021
  • PRs that remain open for >150 days are unlikely to be easily rebased

You can:

  • Rebase this PR and attempt to get it merged
  • Close this PR with /close

Please send feedback to sig-contributor-experience at kubernetes/community.

/remove-lifecycle frozen

k8s-triage-robot avatar Oct 16 '24 04:10 k8s-triage-robot