cilium icon indicating copy to clipboard operation
cilium copied to clipboard

gateway-api: Fix HTTP to HTTPS redirect loop in RequestRedirect filter

Open syedazeez337 opened this issue 7 months ago • 10 comments

This PR fixes an issue with Gateway API's RequestRedirect filter when redirecting from HTTP to HTTPS. Previously, when a request was redirected from HTTP to HTTPS, the redirect would continue to apply even when the request was already using HTTPS, causing an infinite redirect loop.

The fix adds a header matcher to the Envoy route configuration that only applies the redirect if the scheme doesn't match the target scheme. Specifically, it adds a header matcher for the ":scheme" header with InvertMatch set to true, so the redirect only applies when the scheme is NOT already the target scheme.

Summary of changes

  • Added a header matcher in envoyHTTPRoutes function to prevent redirect loops
  • Added comments explaining the fix in the getRouteRedirect function
  • Added a test to verify the header matcher is correctly added

Testing

  • Added a unit test that verifies the header matcher is correctly added
  • Manually tested with a local HTTP server that simulates the redirect behavior

Fixes: #34539 Signed-off-by: Syed Azeez [email protected]

syedazeez337 avatar May 10 '25 11:05 syedazeez337

Commits df54cf792afb1081538ebc1c6f45642daebc9d00, 53a156e1f373575995307ffb7605ad6b640064d0, 1f16fc0b0af77a2c362226becd4b9cf7e98fc2c6, 7c0ce48ff16640cfed43a697c9fe83858748a978 do not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

I have some issues with my local git, I will commit a single clean commit with just the files needed for this fix.

syedazeez337 avatar May 10 '25 11:05 syedazeez337

Hi @syedazeez337, thanks for this PR.

Have you been able to replicate the environment that the initial reported used in #34539? Where TLS is being terminated outside the cluster? As I just commented on there, I think the root cause of the issue as reported is that the users are terminating TLS outside the cluster, then configuring Gateway API to redirect traffic to port 80 back to the HTTP port, which is terminated outside the cluster and forwarded back to the HTTP port, producing a redirect.

In your local testing, did you terminate TLS inside the cluster, using Gateway API? Because I would expect that to work even without this patch today.

I'm curious to see if this fix helps - I'm worried that it may actually be masking the real problem.

youngnick avatar May 12 '25 00:05 youngnick

Hello @youngnick , I have tried to address the issue. I was able to reproduce in my local machine.

My approach is this, My latest commit enhances the fix to specifically address this scenario. Instead of just checking the :scheme header (which would be HTTP in this case), it also checks the X-Forwarded-Proto header. The redirect is not applied when either:

  • The :scheme header is already HTTPS, or
  • The X-Forwarded-Proto header is already HTTPS

Let me know if I am on the right track or there needs to be closer look at the issue again. Thank you.

syedazeez337 avatar May 12 '25 05:05 syedazeez337

I guess what I'm trying to say is that there should be no way that HTTP traffic should be arriving at the HTTPS listener, and vice versa. So there should never be a case where the scheme is https:// to a HTTP listener, so this shouldn't matter.

Could you post your local setup that you used for validation?

youngnick avatar May 12 '25 06:05 youngnick

@youngnick Here's my local setup that I used for validation:

Test Setup

I used a simple Nginx container as a reverse proxy to simulate an external load balancer that terminates TLS:

# Created a self-signed cert
$ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout tls.key -out tls.crt -subj "/CN=test.local"

# Nginx config to simulate external TLS termination
$ cat nginx.conf
server {
    listen 443 ssl;
    server_name test.local;
    ssl_certificate /etc/nginx/tls.crt;
    ssl_certificate_key /etc/nginx/tls.key;

    location / {
        proxy_pass http://gateway-service:80;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto https;
    }
}

# Run Nginx
$ docker run -d --name nginx-proxy -p 443:443 -v $(pwd)/nginx.conf:/etc/nginx/conf.d/default.conf -v $(pwd)/tls.key:/etc/nginx/tls.key -v $(pwd)/tls.crt:/etc/nginx/tls.crt nginx

Then I set up a minimal Gateway API config:

# gateway.yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: test-gateway
  namespace: default
spec:
  gatewayClassName: cilium
  listeners:
    - name: http
      port: 80
      protocol: HTTP
      allowedRoutes:
        namespaces:
          from: All

# http-redirect.yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: https-redirect
  namespace: default
spec:
  parentRefs:
    - name: test-gateway
  rules:
    - filters:
      - type: RequestRedirect
        requestRedirect:
          scheme: https

# backend-route.yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: backend-route
  namespace: default
spec:
  parentRefs:
    - name: test-gateway
  rules:
    - backendRefs:
      - name: echo-service
        port: 80

The key point is that with this setup, the external Nginx proxy terminates TLS and forwards traffic as HTTP to the Gateway (port 80), but sets the X-Forwarded-Proto header to https. This simulates what happens with cloud load balancers that terminate TLS.

syedazeez337 avatar May 12 '25 06:05 syedazeez337

External TLS Termination Test Results

Test Setup

I've reproduced the issue locally with a setup that simulates TLS termination outside the cluster. Here's what I did:

1. Created a self-signed certificate

$ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout tls.key -out tls.crt -subj "/CN=test.local"

2. Created Nginx configuration to simulate external TLS termination

# nginx.conf
server {
    listen 9443 ssl;
    server_name test.local;
    ssl_certificate /etc/nginx/tls.crt;
    ssl_certificate_key /etc/nginx/tls.key;
    
    location / {
        proxy_pass http://gateway-service:80;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto https;
    }
}

3. Set up Docker containers to simulate the environment

# Create a network for our test
$ docker network create tls-test-network

# Run a simple web server to simulate our backend service
$ docker run -d --name gateway-service --network tls-test-network -p 8081:80 nginx:alpine

# Run Nginx as our TLS termination proxy
$ docker run -d --name nginx-proxy --network tls-test-network -p 9443:9443 \
  -v $(pwd)/nginx.conf:/etc/nginx/conf.d/default.conf \
  -v $(pwd)/tls.key:/etc/nginx/tls.key \
  -v $(pwd)/tls.crt:/etc/nginx/tls.crt \
  nginx:alpine

Test Results

Direct access to backend service (HTTP)

$ curl -v http://localhost:8081
* Host localhost:8081 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:8081...
* Connected to localhost (::1) port 8081
* using HTTP/1.x
> GET / HTTP/1.1
> Host: localhost:8081
> User-Agent: curl/8.11.1
> Accept: */*
> 
* Request completely sent off
< HTTP/1.1 200 OK
< Server: nginx/1.27.5
< Date: Mon, 12 May 2025 07:03:24 GMT
< Content-Type: text/html
< Content-Length: 615
< Last-Modified: Wed, 16 Apr 2025 12:55:34 GMT
< Connection: keep-alive
< ETag: "67ffa8c6-267"
< Accept-Ranges: bytes
< 
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
...
</html>

Access through TLS termination proxy (HTTPS)

$ curl --verbose --insecure https://localhost:9443
* Host localhost:9443 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:9443...
* ALPN: curl offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
...
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384 / x25519 / RSASSA-PSS
* ALPN: server accepted http/1.1
* Server certificate:
*  subject: CN=test.local
*  start date: May 12 06:55:47 2025 GMT
*  expire date: May 12 06:55:47 2026 GMT
*  issuer: CN=test.local
*  SSL certificate verify result: self-signed certificate (18), continuing anyway.
...
> GET / HTTP/1.1
> Host: localhost:9443
> User-Agent: curl/8.11.1
> Accept: */*
> 
* Request completely sent off
...
< HTTP/1.1 200 OK
< Server: nginx/1.27.5
< Date: Mon, 12 May 2025 07:06:01 GMT
< Content-Type: text/html
< Content-Length: 615
< Connection: keep-alive
< Last-Modified: Wed, 16 Apr 2025 12:55:34 GMT
< ETag: "67ffa8c6-267"
< Accept-Ranges: bytes
< 
<!DOCTYPE html>
<html>
...
</html>
  • The Nginx proxy successfully terminates TLS and forwards the request to the backend service as HTTP
  • The X-Forwarded-Proto: https header is set by the proxy
  • This simulates what happens in real-world scenarios with cloud load balancers that terminate TLS

syedazeez337 avatar May 12 '25 07:05 syedazeez337

@youngnick Does this help with reproducing the issue?

tommyp1ckles avatar May 28 '25 15:05 tommyp1ckles

I think it should, but I haven't had a chance to check yet.

youngnick avatar May 30 '25 00:05 youngnick

From the cilium community meeting, we briefly looked at this PR. It seems like the next step is to understand the reproduction above and decide how it should behave.

joestringer avatar Jun 11 '25 15:06 joestringer

I'll mark this PR as draft while the feedback is being addressed. When it's ready for fresh review, click the "Ready for Review" button at the bottom of the page.

joestringer avatar Jun 25 '25 15:06 joestringer

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Jul 26 '25 02:07 github-actions[bot]

This pull request has not seen any activity since it was marked stale. Closing.

github-actions[bot] avatar Aug 10 '25 02:08 github-actions[bot]