nginx-gateway-fabric
zone "invalid-backend-ref" is too small
Describe the bug
The invalid-backend-ref zone size of 32k is too small for some environments, which prevents the Gateway from being programmed.
To Reproduce
- Provision a k8s cluster (I used k3s with a pretty bare configuration):
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="server --disable=traefik" K3S_NODE_NAME="${MASTER_NODE_NAME}" sh -s - --flannel-backend=wireguard-native --token ${TOKEN} --write-kubeconfig-mode 600
- Install the Gateway API resources to the cluster:
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.0.0/standard-install.yaml
- Install NGF via Helm (also tried with manifests directly), with no modifications:
helm install nginx-gateway oci://ghcr.io/nginxinc/charts/nginx-gateway-fabric --create-namespace -n nginx-gateway
kubectl describe gtw gateway
...
Message: The Gateway is not programmed due to a failure to reload nginx with the configuration. Please see the nginx container logs for any possible configuration issues
kubectl logs <ngf-pod> -n nginx-gateway -c nginx
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: info: /etc/nginx/conf.d/default.conf is not a file or does not exist
/docker-entrypoint.sh: Sourcing /docker-entrypoint.d/15-local-resolvers.envsh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh
/docker-entrypoint.sh: Configuration complete; ready for start up
2024/04/01 20:41:52 [notice] 20#20: using the "epoll" event method
2024/04/01 20:41:52 [notice] 20#20: nginx/1.25.4
2024/04/01 20:41:52 [notice] 20#20: built by gcc 12.2.1 20220924 (Alpine 12.2.1_git20220924-r10)
2024/04/01 20:41:52 [notice] 20#20: OS: Linux 6.6.20+rpt-rpi-2712
2024/04/01 20:41:52 [notice] 20#20: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2024/04/01 20:41:52 [notice] 20#20: start worker processes
2024/04/01 20:41:52 [notice] 20#20: start worker process 39
2024/04/01 20:41:52 [notice] 20#20: start worker process 40
2024/04/01 20:41:52 [notice] 20#20: start worker process 41
2024/04/01 20:41:52 [notice] 20#20: start worker process 42
2024/04/01 20:41:52 [notice] 20#20: signal 1 (SIGHUP) received from 7, reconfiguring
2024/04/01 20:41:52 [notice] 20#20: reconfiguring
2024/04/01 20:41:52 [emerg] 20#20: zone "invalid-backend-ref" is too small in /etc/nginx/conf.d/http.conf:19
2024/04/01 20:42:52 [notice] 20#20: signal 1 (SIGHUP) received from 7, reconfiguring
2024/04/01 20:42:52 [notice] 20#20: reconfiguring
2024/04/01 20:42:52 [emerg] 20#20: zone "invalid-backend-ref" is too small in /etc/nginx/conf.d/http.conf:19
Expected behavior
Either the smallest possible zone size is determined and implemented at runtime, or something like 128k - 512k is used instead here: https://github.com/nginxinc/nginx-gateway-fabric/blob/03e24fed91d9a39a626bdfaa83108a89824a3d6e/internal/mode/static/nginx/config/upstreams.go#L25
Your environment
- Version of the NGINX Gateway Fabric - 1.2.0
- Version of Kubernetes - v1.28.8+k3s1
- Kubernetes platform (e.g. Mini-kube or GCP) - k3s
- Details on how you expose the NGINX Gateway Fabric Pod (e.g. Service of type LoadBalancer or port-forward) - LoadBalancer (default)
- Logs of NGINX container:
kubectl -n nginx-gateway logs -l app=nginx-gateway -c nginx
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: info: /etc/nginx/conf.d/default.conf is not a file or does not exist
/docker-entrypoint.sh: Sourcing /docker-entrypoint.d/15-local-resolvers.envsh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh
/docker-entrypoint.sh: Configuration complete; ready for start up
2024/04/01 20:41:52 [notice] 20#20: using the "epoll" event method
2024/04/01 20:41:52 [notice] 20#20: nginx/1.25.4
2024/04/01 20:41:52 [notice] 20#20: built by gcc 12.2.1 20220924 (Alpine 12.2.1_git20220924-r10)
2024/04/01 20:41:52 [notice] 20#20: OS: Linux 6.6.20+rpt-rpi-2712
2024/04/01 20:41:52 [notice] 20#20: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2024/04/01 20:41:52 [notice] 20#20: start worker processes
2024/04/01 20:41:52 [notice] 20#20: start worker process 39
2024/04/01 20:41:52 [notice] 20#20: start worker process 40
2024/04/01 20:41:52 [notice] 20#20: start worker process 41
2024/04/01 20:41:52 [notice] 20#20: start worker process 42
2024/04/01 20:41:52 [notice] 20#20: signal 1 (SIGHUP) received from 7, reconfiguring
2024/04/01 20:41:52 [notice] 20#20: reconfiguring
2024/04/01 20:41:52 [emerg] 20#20: zone "invalid-backend-ref" is too small in /etc/nginx/conf.d/http.conf:19
2024/04/01 20:42:52 [notice] 20#20: signal 1 (SIGHUP) received from 7, reconfiguring
2024/04/01 20:42:52 [notice] 20#20: reconfiguring
2024/04/01 20:42:52 [emerg] 20#20: zone "invalid-backend-ref" is too small in /etc/nginx/conf.d/http.conf:19
- NGINX Configuration:
kubectl -n nginx-gateway exec <gateway-pod> -c nginx -- nginx -T
2024/04/01 20:43:24 [emerg] 50#50: zone "invalid-backend-ref" is too small in /etc/nginx/conf.d/http.conf:19
nginx: [emerg] zone "invalid-backend-ref" is too small in /etc/nginx/conf.d/http.conf:19
nginx: configuration file /etc/nginx/nginx.conf test failed
command terminated with exit code 1
Additional context
Using a shell session, vi, and nginx -T, I was able to determine that the smallest zone size it would accept was 128k.
The current workaround is to set nginxGateway.image.tag to 1.1.0.
Hi @LongStoryMedia! Welcome to the project! 🎉
Thanks for opening this issue! Be sure to check out our Contributing Guidelines and the Issue Lifecycle while you wait for someone on the team to take a look at this.
Hi @LongStoryMedia !
Couple of clarifying points if you don't mind:
- You state the NGF version is 1.0.0 but your workaround is to pin the image to 1.1.0 - can I just confirm that you are only seeing the issue on v1.2.0?
- Could you give any more details about your underlying infrastructure? We haven't run into this issue in any of our test environments and I'd love to understand a bit more about what's different about your setup!
Thanks so much!
Hey @ciarams87 ,
- Yes, that is correct, the issue is on v1.2.0. Apologies. I've updated the comment.
- I'm seeing this on my small home cluster, which is 3 Raspberry Pi 5s booting from M.2 NVMe SSDs. Kubernetes is deployed using k3s with the default ingress controller stripped out, so pretty much just vanilla k8s. They are all running Raspberry Pi OS Lite (latest). Specs: System: 64-bit; Kernel version: 6.6; Debian version: 12 (bookworm)
Some implementation notes:
If there are 0 Endpoints, NGF will generate:
upstream default_coffee_80 {
    random two least_conn;
    zone default_coffee_80 512k;
    server unix:/var/lib/nginx/nginx-502-server.sock;
}
If the backend doesn't exist (the Service doesn't exist), NGF will generate (for all such cases):
upstream invalid-backend-ref {
    random two least_conn;
    zone invalid-backend-ref 32k;
    server unix:/var/lib/nginx/nginx-500-server.sock;
}
So upstream invalid-backend-ref is only used once.
One approach to avoid the too-small zone is to simply not specify one. In that case, NGINX will not share the upstream's state across worker processes, which is OK for that upstream, since we only proxy to one destination.
https://nginx.org/en/docs/http/ngx_http_upstream_module.html#zone
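As a rough sketch of that approach, the invalid-backend-ref block with the zone directive dropped could look like the following. This is an assumption about what the generated config might be, not confirmed NGF output; only the zone line is removed from the block shown earlier.

```nginx
# Hypothetical output: the invalid-backend-ref upstream without a shared memory zone.
# Without `zone`, each worker process keeps its own copy of this group's state,
# which is acceptable here because the group has a single server.
upstream invalid-backend-ref {
    random two least_conn;
    server unix:/var/lib/nginx/nginx-500-server.sock;
}
```

The other option raised in this issue is to keep the zone and raise its size; 128k was the smallest value the reporter's environment accepted.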