
Nil pointer exception when enabling http3

Open alviss7 opened this issue 9 months ago • 23 comments

Description: When enabling HTTP/3 through the creation of a ClientTrafficPolicy object, an error occurs in Envoy Gateway, preventing the configuration from being applied. In addition to the error message observed in the logs, Envoy Gateway stops applying any new configuration unless HTTP/3 is disabled.

Repro steps:

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: ClientTrafficPolicy
metadata:
  name: shared-gateway
spec:
  http3: {}
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: shared-gateway

Environment: v1.3.2

Logs:

runtime/debug.Stack()
    /opt/hostedtoolcache/go/1.23.7/x64/src/runtime/debug/stack.go:26 +0x64
github.com/envoyproxy/gateway/internal/message.handleWithCrashRecovery[...].func1()
    /home/runner/work/gateway/gateway/internal/message/watchutil.go:52 +0x154
panic({0x27b5200?, 0x9f2a750?})
    /opt/hostedtoolcache/go/1.23.7/x64/src/runtime/panic.go:791 +0x124
github.com/envoyproxy/gateway/internal/xds/translator.addXdsTLSInspectorFilter(0x0)
    /home/runner/work/gateway/gateway/internal/xds/translator/listener.go:639 +0x20
github.com/envoyproxy/gateway/internal/xds/translator.addServerNamesMatch(0x0, 0x40029af180, {0x4002a50390, 0x1, 0x1})
    /home/runner/work/gateway/gateway/internal/xds/translator/listener.go:504 +0xc0
github.com/envoyproxy/gateway/internal/xds/translator.(*Translator).addHCMToXDSListener(0x400121d8e8, 0x0, 0x400262d700, 0x28cb?, 0x0, 0x1, 0x0)
    /home/runner/work/gateway/gateway/internal/xds/translator/listener.go:413 +0xed8
github.com/envoyproxy/gateway/internal/xds/translator.(*Translator).processHTTPListenerXdsTranslation(0x400121d8e8, 0x40022f3180, {0x4002a502d0, 0x2, 0x400121d448?}, 0x4002a58120, 0x0, 0x0)
    /home/runner/work/gateway/gateway/internal/xds/translator/translator.go:375 +0x37c
github.com/envoyproxy/gateway/internal/xds/translator.(*Translator).Translate(0x400121d8e8, 0x4002a4bb00)
    /home/runner/work/gateway/gateway/internal/xds/translator/translator.go:92 +0x68
github.com/envoyproxy/gateway/internal/xds/translator/runner.(*Runner).subscribeAndTranslate.func1({{0x4001b669e0?, 0x7f0e8?}, 0x2?, 0x4002a4bb00?}, 0x400052a310)
    /home/runner/work/gateway/gateway/internal/xds/translator/runner/runner.go:83 +0x210
github.com/envoyproxy/gateway/internal/message.handleWithCrashRecovery[...](0x400121df88?, {{0x4001b669e0, 0x2?}, 0x38?, 0x4002a4bb00?}, {{0x2e01d1a?, 0x0?}, {0x2dedd68?, 0x23a94?}}, 0x400052a310?)
    /home/runner/work/gateway/gateway/internal/message/watchutil.go:58 +0xdc
github.com/envoyproxy/gateway/internal/message.HandleSubscription[...]({{0x2e01d1a, 0x7dcd020?}, {0x2dedd68?, 0x400098b788?}}, 0x4002488380?, 0x400121df88)
    /home/runner/work/gateway/gateway/internal/message/watchutil.go:97 +0x4c0
github.com/envoyproxy/gateway/internal/xds/translator/runner.(*Runner).subscribeAndTranslate(0x40025e3100, {0x7dcd020?, 0x4000f80050?})
    /home/runner/work/gateway/gateway/internal/xds/translator/runner/runner.go:53 +0x84
created by github.com/envoyproxy/gateway/internal/xds/translator/runner.(*Runner).Start in goroutine 36
    /home/runner/work/gateway/gateway/internal/xds/translator/runner/runner.go:46 +0x1d4
", "error": "runtime error: invalid memory address or nil pointer dereference"}
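
In the trace above, `addXdsTLSInspectorFilter` and `addServerNamesMatch` are both shown with `0x0` as their first argument, which suggests that a nil pointer was passed down the call chain and then dereferenced. The following is a minimal Go sketch of the defensive pattern that avoids this kind of panic. It is not code from Envoy Gateway: the `listener` type, `addTLSInspector`, and `errNilListener` are hypothetical names invented for the illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// listener is a hypothetical stand-in for an xDS listener.
// It is NOT the real Envoy or Envoy Gateway type.
type listener struct {
	name            string
	listenerFilters []string
}

// errNilListener is a hypothetical sentinel error for this sketch.
var errNilListener = errors.New("listener is nil")

// addTLSInspector appends a TLS inspector filter at most once.
// It returns an error instead of panicking when given a nil listener.
func addTLSInspector(l *listener) error {
	if l == nil {
		return errNilListener
	}
	for _, f := range l.listenerFilters {
		if f == "tls_inspector" {
			return nil // already present, nothing to do
		}
	}
	l.listenerFilters = append(l.listenerFilters, "tls_inspector")
	return nil
}

func main() {
	// A nil listener yields an error the caller can report.
	fmt.Println(addTLSInspector(nil))

	l := &listener{name: "tcp-10443"}
	fmt.Println(addTLSInspector(l), l.listenerFilters)
}
```

Returning an error lets the caller report the bad input for one listener rather than relying on crash recovery for the whole translation pass.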

alviss7 avatar Apr 03 '25 22:04 alviss7

Hello

It's not fixed yet. Here is the new stack trace:

goroutine 205 [running]:
runtime/debug.Stack()
  /opt/hostedtoolcache/go/1.24.2/x64/src/runtime/debug/stack.go:26 +0x5e
github.com/envoyproxy/gateway/internal/message.handleWithCrashRecovery[...].func1()
  /home/runner/work/gateway/gateway/internal/message/watchutil.go:54 +0x1fe
panic({0x337cd60?, 0xb160740?})
  /opt/hostedtoolcache/go/1.24.2/x64/src/runtime/panic.go:792 +0x132
github.com/envoyproxy/gateway/internal/xds/translator.(*Translator).addHCMToXDSListener(0xc0034378e8, 0x0, 0xc0039e3b00, 0x28cb?, 0x0, 0x1, 0x0)
  /home/runner/work/gateway/gateway/internal/xds/translator/listener.go:439 +0x1431
github.com/envoyproxy/gateway/internal/xds/translator.(*Translator).processHTTPListenerXdsTranslation(0xc0034378e8, 0xc000637ce0, {0xc000b71ae8, 0x3, 0x3ab5b71?}, 0xc000bfb620, 0x0, 0x0)
  /home/runner/work/gateway/gateway/internal/xds/translator/translator.go:336 +0x4e5
github.com/envoyproxy/gateway/internal/xds/translator.(*Translator).Translate(0xc0034378e8, 0xc000dc2d20)
  /home/runner/work/gateway/gateway/internal/xds/translator/translator.go:97 +0x125
github.com/envoyproxy/gateway/internal/xds/translator/runner.(*Runner).subscribeAndTranslate.func1({{0xc0017d4930?, 0x53c9627a16aafdf9?}, 0x80?, 0xc000dc2d20?}, 0xc003702ee0)
  /home/runner/work/gateway/gateway/internal/xds/translator/runner/runner.go:90 +0x2ca
github.com/envoyproxy/gateway/internal/message.handleWithCrashRecovery[...](0xc003437fa0?, {{0xc0017d4930, 0x0?}, 0x0?, 0xc000dc2d20?}, {{0x3ab5b71, 0xe}, {0x3aa10a6, 0x6}}, 0xc003702ee0?)
  /home/runner/work/gateway/gateway/internal/message/watchutil.go:60 +0x137
github.com/envoyproxy/gateway/internal/message.HandleSubscription[...]({{0x3ab5b71, 0x0?}, {0x3aa10a6?, 0x0?}}, 0xc0036d8770?, 0xc003437fa0)
  /home/runner/work/gateway/gateway/internal/message/watchutil.go:99 +0x7b0
github.com/envoyproxy/gateway/internal/xds/translator/runner.(*Runner).subscribeAndTranslate(0xc002738510, 0xc000bb1d60?)
  /home/runner/work/gateway/gateway/internal/xds/translator/runner/runner.go:59 +0x50
created by github.com/envoyproxy/gateway/internal/xds/translator/runner.(*Runner).Start in goroutine 112
  /home/runner/work/gateway/gateway/internal/xds/translator/runner/runner.go:52 +0x2df

nfarhadian avatar May 16 '25 17:05 nfarhadian

@nfarhadian which version are you on?

arkodg avatar May 16 '25 18:05 arkodg

@arkodg I see the same thing. I'm on version 1.4.0.

alviss7 avatar May 16 '25 18:05 alviss7

ptal @Xunzhuo

arkodg avatar May 16 '25 19:05 arkodg

Any updates about this issue?

nfarhadian avatar Jul 02 '25 19:07 nfarhadian

I cannot reproduce this with the main branch. Can you share a repro YAML file?

My configuration is as follows:

apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: merged-eg
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
  parametersRef:
    group: gateway.envoyproxy.io
    kind: EnvoyProxy
    name: merged-gateway
    namespace: envoy-gateway-system
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: merged-gateway
  namespace: envoy-gateway-system
spec:
  mergeGateways: true
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: shared-gtw
spec:
  gatewayClassName: merged-eg
  listeners:
    - name: http
      protocol: HTTP
      port: 80
    - name: https
      protocol: HTTPS
      port: 443
      tls:
        mode: Terminate
        certificateRefs:
          - kind: Secret
            group: ""
            name: example-cert
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: ClientTrafficPolicy
metadata:
  name: shared-gtw
spec:
  http3: {}
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: shared-gtw
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: backend
spec:
  parentRefs:
    - name: shared-gtw
  rules:
    - backendRefs:
        - group: ""
          kind: Service
          name: infra-backend-v1
          port: 8080
      matches:
        - path:
            type: PathPrefix
            value: /

zirain avatar Jul 28 '25 01:07 zirain

Here is my config:

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: main-gateway-config
  namespace: main-gateway
spec:
  provider:
    type: Kubernetes
    kubernetes:
      envoyDeployment:
        replicas: 2
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: ClientTrafficPolicy
metadata:
  name: enable-http3
  namespace: main-gateway
spec:
  http3: {}
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: eg
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: eg
  namespace: main-gateway
  annotations:
    cert-manager.io/issuer: letsencrypt
spec:
  gatewayClassName: eg
  infrastructure:
    parametersRef:
      group: gateway.envoyproxy.io
      kind: EnvoyProxy
      name: main-gateway-config
  listeners:
  - name: http
    protocol: HTTP
    port: 80
    allowedRoutes:
      namespaces:
        from: All
  - name: example-com-listener
    protocol: HTTPS
    hostname: 'example.com'
    port: 443
    tls:
      mode: Terminate
      certificateRefs:
      - kind: Secret
        name: example-com-cert
    allowedRoutes:
      namespaces:
        from: All
  - name: wildcard-example-com-listener
    protocol: HTTPS
    hostname: '*.example.com'
    port: 443
    tls:
      mode: Terminate
      certificateRefs:
      - kind: Secret
        name: example-com-cert
    allowedRoutes:
      namespaces:
        from: All
  - name: wildcard-subdomain-example-com-listener
    protocol: HTTPS
    hostname: '*.subdomain.example.com'
    port: 443
    tls:
      mode: Terminate
      certificateRefs:
      - kind: Secret
        name: subdomain-example-com-cert
    allowedRoutes:
      namespaces:
        from: All
  - name: tls-passthrough
    port: 443
    protocol: TLS
    tls:
      mode: Passthrough
    allowedRoutes:
      namespaces:
        from: All
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: sample-http-route
  namespace: sample-app
spec:
  parentRefs:
  - name: eg
    namespace: main-gateway
  hostnames:
  - "*.example.com"
  - "example.com"
  rules:
  - backendRefs:
    - name: sample-service
      port: 80
    filters:
    - type: ResponseHeaderModifier
      responseHeaderModifier:
        set:
          - name: server
            value: "envoy"

nfarhadian avatar Jul 28 '25 06:07 nfarhadian

@nfarhadian I cannot repro the issue. Can you retry with the main branch (latest version) and share a full log of the Envoy Gateway controller from when it happens?

zirain avatar Jul 28 '25 08:07 zirain

The nil pointer exception is gone, but now I get the following error:

2025-07-28T10:04:36.430Z ERROR xds-server cache/snapshotcache.go:346 Envoy rejected the last update with code 13 and message Error adding/updating listener(s) main-gateway/eg/example-com-listener-quic: error adding listener '0.0.0.0:10443': filter chain 'main-gateway/eg/wildcard-example-com-listener' has the same matching rules defined as 'main-gateway/eg/example-com-listener'. duplicate matcher is: {}

nfarhadian avatar Jul 28 '25 10:07 nfarhadian

@zhaohuabing I think your QUIC listener fix cannot handle this scenario

nfarhadian avatar Jul 28 '25 10:07 nfarhadian

I think I've found the root cause: if there are multiple listeners on the same port and only one of the listeners enables HTTP3, we may hit this issue.
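
As a rough illustration of the condition described here, the Go sketch below groups listeners by port and reports the ports where HTTP/3 is enabled on some listeners but not others. The `gwListener` type and `mixedHTTP3Ports` function are hypothetical and not part of Envoy Gateway. The sample data also assumes that a TLS passthrough listener does not get HTTP/3 even when the policy targets the whole Gateway.

```go
package main

import (
	"fmt"
	"sort"
)

// gwListener is a hypothetical, simplified view of a Gateway listener.
type gwListener struct {
	Name     string
	Port     int
	Protocol string // "HTTP", "HTTPS", or "TLS"
	HTTP3    bool   // whether HTTP/3 ends up enabled for this listener
}

// mixedHTTP3Ports returns, in ascending order, the ports on which some
// listeners have HTTP/3 enabled and others do not.
func mixedHTTP3Ports(ls []gwListener) []int {
	type seen struct{ with, without bool }
	byPort := map[int]*seen{}
	for _, l := range ls {
		s, ok := byPort[l.Port]
		if !ok {
			s = &seen{}
			byPort[l.Port] = s
		}
		if l.HTTP3 {
			s.with = true
		} else {
			s.without = true
		}
	}
	var ports []int
	for p, s := range byPort {
		if s.with && s.without {
			ports = append(ports, p)
		}
	}
	sort.Ints(ports)
	return ports
}

func main() {
	// Assumption: the passthrough listener shares port 443 but has no HTTP/3.
	ls := []gwListener{
		{Name: "http", Port: 80, Protocol: "HTTP"},
		{Name: "example-com-listener", Port: 443, Protocol: "HTTPS", HTTP3: true},
		{Name: "tls-passthrough", Port: 443, Protocol: "TLS"},
	}
	fmt.Println(mixedHTTP3Ports(ls))
}
```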

zhaohuabing avatar Jul 28 '25 12:07 zhaohuabing

@zirain I'm unassigning you and assigning myself, since I've been working on the xDS translator recently and this issue is related.

zhaohuabing avatar Jul 28 '25 12:07 zhaohuabing

> nil pointer exception is gone, but now I get following error: 2025-07-28T10:04:36.430Z ERROR xds-server cache/snapshotcache.go:346 Envoy rejected the last update with code 13 and message Error adding/updating listener(s) main-gateway/eg/example-com-listener-quic: error adding listener '0.0.0.0:10443': filter chain 'main-gateway/eg/wildcard-example-com-listener' has the same matching rules defined as 'main-gateway/eg/example-com-listener'. duplicate matcher is: {}

@nfarhadian the nil pointer panic should have been fixed in https://github.com/envoyproxy/gateway/pull/6584. Did you hit this new error after that fix?

zhaohuabing avatar Jul 28 '25 23:07 zhaohuabing

Thanks. I added https://github.com/envoyproxy/gateway/issues/2423 to v1.5.0, hoping this kind of regression can be avoided with it.

arkodg avatar Jul 29 '25 00:07 arkodg

> thanks, added #2423 to the v1.5.0, hoping this regression can be avoided with it

Let me add an e2e test based on the doc.

zirain avatar Jul 29 '25 00:07 zirain

> > nil pointer exception is gone, but now I get following error: 2025-07-28T10:04:36.430Z ERROR xds-server cache/snapshotcache.go:346 Envoy rejected the last update with code 13 and message Error adding/updating listener(s) main-gateway/eg/example-com-listener-quic: error adding listener '0.0.0.0:10443': filter chain 'main-gateway/eg/wildcard-example-com-listener' has the same matching rules defined as 'main-gateway/eg/example-com-listener'. duplicate matcher is: {}
>
> @nfarhadian the null pointer should have been fixed in #6584. You found this one after that fix?

@zhaohuabing Yes, I built envoy-gateway today (commit 182b761ad) and got this error. It's no longer a nil pointer exception, but HTTP3 still does not work.

nfarhadian avatar Jul 29 '25 01:07 nfarhadian

@nfarhadian Thanks. I already reproduced this. I should be able to fix it in v1.5.0.

zhaohuabing avatar Jul 29 '25 01:07 zhaohuabing

For HTTP3/UDP, since TLSInspector doesn't work, we should collapse multiple filter chains on different hosts into a single filter chain.

zhaohuabing avatar Jul 29 '25 02:07 zhaohuabing

Today I disabled the TLS passthrough listener, so there shouldn't be a TLSInspector, but I still got the same error. The error says the matching rules are the same for wildcard-example-com-listener and example-com-listener.

nfarhadian avatar Jul 29 '25 17:07 nfarhadian

HTTP3 listeners create multiple filter chains without matching rules; this was introduced in https://github.com/envoyproxy/gateway/pull/5671. We should merge multiple filter chains for HTTP3 into one, similar to HTTP listeners.
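
To make the proposed merge concrete, here is a minimal Go sketch that folds several filter chains without match rules into a single chain carrying the union of their hostnames and certificates. It is only an illustration under simplifying assumptions, not the actual fix: the `filterChain` type and `mergeChains` function are hypothetical, and a real xDS filter chain holds far more than names and certificate references.

```go
package main

import "fmt"

// filterChain is a hypothetical, simplified filter chain: a name, the
// hostnames it serves, and the names of its TLS certificates.
type filterChain struct {
	Name      string
	Hostnames []string
	Certs     []string
}

// mergeChains combines all chains into one chain with the given name.
// Hostnames are concatenated in order; certificates are de-duplicated
// while preserving first-seen order.
func mergeChains(name string, chains []filterChain) filterChain {
	merged := filterChain{Name: name}
	seen := map[string]bool{}
	for _, c := range chains {
		merged.Hostnames = append(merged.Hostnames, c.Hostnames...)
		for _, cert := range c.Certs {
			if !seen[cert] {
				seen[cert] = true
				merged.Certs = append(merged.Certs, cert)
			}
		}
	}
	return merged
}

func main() {
	// Listener and secret names are taken from the Gateway shared in this thread.
	chains := []filterChain{
		{Name: "example-com-listener", Hostnames: []string{"example.com"}, Certs: []string{"example-com-cert"}},
		{Name: "wildcard-example-com-listener", Hostnames: []string{"*.example.com"}, Certs: []string{"example-com-cert"}},
		{Name: "wildcard-subdomain-example-com-listener", Hostnames: []string{"*.subdomain.example.com"}, Certs: []string{"subdomain-example-com-cert"}},
	}
	m := mergeChains("quic-443", chains) // "quic-443" is an arbitrary name for the sketch
	fmt.Println(m.Name, m.Hostnames, m.Certs)
}
```

With a single chain there is no second matcher to collide with, so the "same matching rules" rejection cannot arise; per-host routing would then have to happen at the HTTP layer.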

zhaohuabing avatar Jul 30 '25 07:07 zhaohuabing

Would it be possible to add server name match rules for each filter chain (without merging chains, so allowing for multiple hosts/certificates on the same QUIC socket), using Listener's filter_chain_matcher rather than TLSInspector and per-chain filter_chain_match -- similar to what https://github.com/envoyproxy/envoy/issues/39932#issuecomment-2982604203 suggests?
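
The per-chain selection this question describes can be sketched in Go as a lookup from the requested server name to a chain: an exact name wins, otherwise the most specific `*.suffix` wildcard. This is a conceptual illustration only. It does not use or reproduce Envoy's `filter_chain_matcher` API, and `selectChain` and the chain names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// selectChain returns the chain name registered for sni.
// chains maps a server name pattern ("example.com" or "*.example.com")
// to a chain name. An exact match wins; otherwise the longest matching
// "*.suffix" wildcard is used. In this sketch a wildcard matches any
// number of leading labels.
func selectChain(chains map[string]string, sni string) (string, bool) {
	if name, ok := chains[sni]; ok {
		return name, true
	}
	// Strip one leading label at a time and try the wildcard form:
	// a.b.example.com -> *.b.example.com -> *.example.com -> *.com
	rest := sni
	for {
		i := strings.IndexByte(rest, '.')
		if i < 0 {
			return "", false
		}
		rest = rest[i+1:]
		if name, ok := chains["*."+rest]; ok {
			return name, true
		}
	}
}

func main() {
	// Hostnames are taken from the Gateway shared in this thread.
	chains := map[string]string{
		"example.com":             "example-com-listener",
		"*.example.com":           "wildcard-example-com-listener",
		"*.subdomain.example.com": "wildcard-subdomain-example-com-listener",
	}
	for _, sni := range []string{"example.com", "www.example.com", "a.subdomain.example.com", "other.org"} {
		name, ok := selectChain(chains, sni)
		fmt.Println(sni, name, ok)
	}
}
```

Because every chain is keyed by a distinct server name, each host can keep its own certificate on the same socket without two chains sharing an empty matcher.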

0xADD1E avatar Oct 15 '25 12:10 0xADD1E

This issue has been automatically marked as stale because it has not had activity in the last 30 days.

github-actions[bot] avatar Dec 05 '25 04:12 github-actions[bot]

Not stale. I have this issue too

Jean-Daniel avatar Dec 11 '25 22:12 Jean-Daniel