
Ingress resources with long regular expressions break the external authentication

Open maxlaverse opened this issue 2 years ago • 8 comments

What happened: We're using an Ingress resource with a regular expression that looks like /base-path/(pattern1|pattern2|pattern3|...)(/.*)?. In some environments, authentication is configured on top of it (nginx.ingress.kubernetes.io/auth-url); in others, the endpoints are publicly available.

When adding more patterns to the regular expression, querying those URLs eventually starts failing with 500 Internal Server Error in the environments where authentication is enabled. Splitting the regular expression into two different Ingress paths solves the problem.

What you expected to happen: We would expect the Ingress resource to be routed to the right backend, regardless of the path's length and of whether authentication is enabled.

What do you think went wrong? Looking at the logs around a 500, we can see that Nginx fails to reach the authentication URL:

2023/01/05 09:28:04 [error] 2293#2293: *115146047 auth request unexpected status: 404 while sending to client, client: 77.20.222.149, server: some-server, request: "GET /my/url HTTP/2.0", host: "some-server"

Within its configuration, ingress-nginx uses a location block to expose the authentication backend internally under the same domain:

location = /_external-auth-L2Jhc2UtcGF0aC8ocGF0dGVybjF8cGF0dGVybjJ8cGF0dGVybjN8Li4uKSgvLiopPw-Prefix {
...
}

This location is later referenced in the location block corresponding to the Ingress resource:

auth_request        /_external-auth-L2Jhc2UtcGF0aC8ocGF0dGVybjF8cGF0dGVybjJ8cGF0dGVybjN8Li4uKSgvLiopPw-Prefix;

This long string is the base64 URL-encoded value of the Ingress resource's path, built in buildLocation. The longer the path, the longer this "external-auth" location name becomes.

We created ticket #2435 in Nginx's issue tracker because we believe there is a bug where Nginx doesn't handle long location names properly. It causes external authentication to break as soon as the base64-encoded string grows longer than 255 characters.
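
For reference, the length of the generated location name can be estimated from the Ingress path alone. Here is a minimal shell sketch, assuming the name is built roughly as "/_external-auth-" + base64url(path) + "-Prefix" as in the snippet above (the authoritative construction lives in buildLocation):

PATH_REGEX='/base-path/(pattern1|pattern2|pattern3|...)(/.*)?'
# URL-safe base64 of the path, with padding and newlines stripped
ENCODED=$(printf '%s' "$PATH_REGEX" | base64 | tr -d '=\n' | tr '+/' '-_')
# Length of the resulting internal auth location name; anything above ~255 is the danger zone
printf '%s' "/_external-auth-${ENCODED}-Prefix" | wc -c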

We're creating this GitHub issue to keep track of it and give people a chance to find it. Ideally, Ingress-Nginx would prevent such Ingress resources from being created in the first place, through the admission controller. Alternatively, we could also wait for Nginx to publish a new release.

NGINX Ingress controller version:

NGINX Ingress controller
Release:    v1.1.3
Build:      9d3a285f19a704524439c75b947e2189406565ab
Repository: https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.19.10


Kubernetes version (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.1", GitCommit:"e4d4e1ab7cf1bf15273ef97303551b279f0920a9", GitTreeState:"clean", BuildDate:"2022-09-14T19:40:59Z", GoVersion:"go1.19.1", Compiler:"gc", Platform:"darwin/arm64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"21+", GitVersion:"v1.21.14-eks-fb459a0", GitCommit:"b07006b2e59857b13fe5057a956e86225f0e82b7", GitTreeState:"clean", BuildDate:"2022-10-24T20:32:54Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}

Environment: Please let me know if this is relevant and I'll add the information

How to reproduce this issue:

Install minikube/kind

  • ~~Minikube https://minikube.sigs.k8s.io/docs/start/~~
  • ~~Kind https://kind.sigs.k8s.io/docs/user/quick-start/~~

I used Docker Desktop. Let me know if you can't reproduce with Minikube or Kind.

Install the ingress controller

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/deploy/static/provider/baremetal/deploy.yaml

Install an application that will act as the default backend (it is just an echo app)

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/docs/examples/http-svc.yaml

Note: the Pod was crashing with Lua JIT errors. I replaced the image of the http-svc Deployment with k8s.gcr.io/defaultbackend-arm64:1.5.
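
If you hit the same crash, the image can be swapped with a command along these lines (a sketch only; it assumes the container inside the http-svc Deployment is also named http-svc):

kubectl set image deployment/http-svc http-svc=k8s.gcr.io/defaultbackend-arm64:1.5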

Create the Ingress resources

---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/configuration-snippet: |2-
      return 200;
  name: fake-oauth
spec:
  rules:
  - http:
      paths:
      - backend:
          service:
            name: http-svc
            port:
              number: 80
        path: /oauth2/auth
        pathType: ImplementationSpecific
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/auth-url: http://localhost/oauth2/auth
    nginx.ingress.kubernetes.io/use-regex: "true"
  name: working-short-regexp-with-auth
spec:
  rules:
  - http:
      paths:
      - backend:
          service:
            name: http-svc
            port:
              number: 80
        path: /base-path1/(pattern1|pattern2|pattern3)(/.*)?
        pathType: ImplementationSpecific
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/use-regex: "true"
  name: working-long-regexp-without-auth
spec:
  rules:
  - http:
      paths:
      - backend:
          service:
            name: http-svc
            port:
              number: 80
        path: /base-path2/(pattern1|pattern2|pattern3|pattern1|pattern2|pattern3|pattern1|pattern2|pattern3|pattern1|pattern2|pattern3|pattern1|pattern2|pattern3|pattern1|pattern2|pattern3|pattern1|pattern2|pattern3|pattern1|pattern2|pattern3|pattern1|pattern2|pattern3|)(/.*)?
        pathType: ImplementationSpecific
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/auth-url: http://localhost/oauth2/auth
    nginx.ingress.kubernetes.io/use-regex: "true"
  name: broken-long-regexp-with-auth
spec:
  rules:
  - http:
      paths:
      - backend:
          service:
            name: http-svc
            port:
              number: 80
        path: /base-path3/(pattern1|pattern2|pattern3|pattern1|pattern2|pattern3|pattern1|pattern2|pattern3|pattern1|pattern2|pattern3|pattern1|pattern2|pattern3|pattern1|pattern2|pattern3|pattern1|pattern2|pattern3|pattern1|pattern2|pattern3|pattern1|pattern2|pattern3|)(/.*)?
        pathType: ImplementationSpecific
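
Apply the manifests (assuming they were saved to a local file; the name ingress-repro.yaml is arbitrary):

kubectl apply -f ingress-repro.yaml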

Make the requests

POD_NAME=$(kubectl get pods -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx -o name)
kubectl exec -it -n ingress-nginx $POD_NAME -- curl http://localhost/base-path1/pattern1
kubectl exec -it -n ingress-nginx $POD_NAME -- curl http://localhost/base-path2/pattern1
kubectl exec -it -n ingress-nginx $POD_NAME -- curl http://localhost/base-path3/pattern1

The third call returns:

<html>
<head><title>500 Internal Server Error</title></head>
<body>
<center><h1>500 Internal Server Error</h1></center>
<hr><center>nginx</center>
</body>
</html>

In the logs we can see:

127.0.0.1 - - [06/Jan/2023:12:47:57 +0000] "GET /oauth2/auth HTTP/1.1" 200 0 "-" "curl/7.83.1" 365 0.000 [default-http-svc-80] [] - - - - 73da2754b4edb6628fb9b5b66798c376
127.0.0.1 - - [06/Jan/2023:12:47:57 +0000] "GET /base-path1/pattern1 HTTP/1.1" 200 0 "-" "curl/7.83.1" 0 0.010 [default-http-svc-80] [] 127.0.0.1:80 0 0.001 200 73da2754b4edb6628fb9b5b66798c376
127.0.0.1 - - [06/Jan/2023:12:47:57 +0000] "GET /base-path1/pattern1 HTTP/1.1" 404 21 "-" "curl/7.83.1" 92 0.010 [default-http-svc-80] [] 10.1.0.51:8080 21 0.000 404 73da2754b4edb6628fb9b5b66798c376
127.0.0.1 - - [06/Jan/2023:12:48:00 +0000] "GET /base-path2/pattern1 HTTP/1.1" 404 21 "-" "curl/7.83.1" 92 0.000 [default-http-svc-80] [] 10.1.0.51:8080 21 0.000 404 7e80e00d455fae7de191fb446e770b9c
127.0.0.1 - - [06/Jan/2023:12:48:01 +0000] "GET /base-path3/pattern1 HTTP/1.1" 500 170 "-" "curl/7.83.1" 92 0.000 [upstream-default-backend] [] - - - - 4fd93a98fd163caa3d517301c5ef0a48
2023/01/06 12:48:01 [error] 1596#1596: *40548 auth request unexpected status: 404 while sending to client, client: 127.0.0.1, server: _, request: "GET /base-path3/pattern1 HTTP/1.1", host: "localhost"
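
To confirm that the generated auth location names are what grow with the path, you can also inspect the rendered configuration inside the controller pod and print the length of each one (a rough diagnostic; the exact names will differ in your setup):

kubectl exec -n ingress-nginx $POD_NAME -- grep -o '_external-auth-[^ ]*' /etc/nginx/nginx.conf | awk '{ print length, $0 }'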

Anything else we need to know:

maxlaverse commented Jan 06 '23 12:01

/remove-kind bug
/kind feature

@maxlaverse, are you 100% certain that it's the length that is the problem, and not the return URL or some other configuration? The reason for asking is your observation that splitting an Ingress resource into multiple resources makes the problem go away. That information could also be interpreted as combined rules in one Ingress stomping on each other, while dedicated Ingress resources per rule allow an accurate match of request to rule.

longwuyuan commented Jan 06 '23 13:01

@longwuyuan 100% is very strong, but here is a wrap-up of the points that make me relatively confident:

  1. I was initially able to reproduce the problem simply by extending or reducing the regular expression of an existing Ingress, for which the resulting "auth location" would be around 255 characters. Splitting the Ingress is only a workaround.
  2. There is definitely a buffer overflow in Nginx when the location name is more than 255 characters long (#2435).
  3. If you follow the reproduction steps one Ingress at a time, I assume this "combining" behavior doesn't come into play, and the third case still triggers a 500.

If I may, as a user I wouldn't have expected the issue to be classified as a feature. The resolution might require a new feature, but the problem is still a bug in my opinion. ingress-nginx normally prevents broken configuration from reaching the cluster, and with this problem (if you confirm it), anybody can break a working production setup by extending the path of an Ingress resource by a few characters.

The resolution could simply be documenting the problem, but as of now, ingress-nginx doesn't behave the way you would expect it to after reading ingress-path-matching and oauth-external-auth.

maxlaverse commented Jan 06 '23 14:01

@maxlaverse thank you for the info on the 255-character length.

There has been at least one precedent involving a 255-character field length in the recent past. I will have to search for the issue, but this makes sense now.

Let's wait for an update on that upstream nginx issue. Once that is done, kindly ping here, or better, in the ingress-nginx-dev channel on Kubernetes Slack. Just FYI, the project uses OpenResty plus patches, and there are plans to make changes around that too. So we can revisit this after that upstream issue is closed.

/priority important-longterm
/triage accepted

longwuyuan commented Jan 06 '23 15:01

This is stale, but we won't close it automatically; just bear in mind that the maintainers may be busy with other tasks and will get to your issue as soon as possible. If you have any questions or want to request prioritization, please reach out in #ingress-nginx-dev on Kubernetes Slack.

github-actions[bot] commented Mar 14 '23 01:03

Quick update: a patch has been merged upstream in Nginx. The next steps are for Nginx to release a new version, and for ingress-nginx to use it.

maxlaverse commented Mar 14 '23 07:03

Quick update: a new version of Nginx, 1.23.4, has been released, which fixes this bug.

Upgrading the nginx version embedded in ingress-nginx would solve the issue reported here.

maxlaverse commented Apr 03 '23 14:04

https://github.com/kubernetes/ingress-nginx/issues/9771#issuecomment-1477725312

longwuyuan commented Apr 03 '23 15:04

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

k8s-triage-robot commented Apr 02 '24 15:04

There has been no update for over a year, and the Nginx version has been bumped more than once. As such, there is no action item being tracked here for the project.

The project needs to focus on the Gateway API implementation, security, and reducing features that are not implied by the Ingress API. So this issue is not tracking any action item for the project and is just adding to the tally of open issues. Hence closing this issue now.

/close

longwuyuan commented Sep 09 '24 13:09

@longwuyuan: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot commented Sep 09 '24 13:09