ngx_upstream_jdomain

Stale DNS Lookup Issue

Open sg3-141-592 opened this issue 2 years ago • 1 comments

We are currently using ngx_upstream_jdomain release 1.4.0 as a forwarding proxy to talk to some downstream services that have rotating IP addresses. We run this in Kubernetes clusters in multiple cloud regions. When IP addresses rotate for a subset of our downstreams (for example, today in 2 of 3 regions, in 5 of 24 pods), we see:

2023/08/29 08:32:14 [error] 12#12: ngx_http_upstream_jdomain_module: resolver failed, "www.example.com" (110: Operation timed out)

The issue does not recover on its own, and we stay stuck on a stale IP address for the service. Our jdomain config looks like this:

resolver 8.8.8.8 8.8.4.4;

upstream example {
        keepalive 32;
        keepalive_requests 100;
        keepalive_timeout 60s;
        jdomain www.example.com port=443 interval=60;
}

We've experienced the issue on both nginx-1.20.1 and 1.23.3.

Any help with recommended next steps or debugging would be appreciated. The error is a timeout when talking to the DNS server, so I wonder: does ngx_upstream_jdomain try to re-establish a connection to the DNS servers when there is a connection issue?
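
As a debugging aid, here is a minimal sketch for checking from inside an affected pod whether the configured resolvers answer plain UDP DNS queries at all. It uses only the Python standard library and builds the query by hand. The function names `build_a_query` and `probe` are hypothetical helpers, not part of nginx or ngx_upstream_jdomain, and this does not reproduce what the nginx resolver does internally; it only tests basic reachability of the DNS server.

```python
import socket
import struct


def build_a_query(hostname: str, txid: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet for an A record (RFC 1035 wire format)."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def probe(server: str, hostname: str, timeout: float = 5.0) -> str:
    """Send one UDP query to `server` and summarise the outcome as a string."""
    txid = 0x1234
    query = build_a_query(hostname, txid)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            data, _ = sock.recvfrom(512)
        except socket.timeout:
            return "timeout"
        except OSError as exc:
            return f"error: {exc}"
    if len(data) < 12:
        return "short response"
    resp_id, flags, _, ancount, _, _ = struct.unpack("!HHHHHH", data[:12])
    if resp_id != txid:
        return "mismatched transaction id"
    # The low 4 bits of the flags field hold the response code (0 = no error).
    return f"rcode={flags & 0x000F} answers={ancount}"
```

Calling `probe("8.8.8.8", "www.example.com")` and `probe("8.8.4.4", "www.example.com")` in a loop from a pod that is logging the timeout, and from one that is not, should show whether the timeouts are specific to the pod's network path or to the module.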

sg3-141-592, Aug 29 '23 10:08

I haven't seen it.

And I've used this module at multiple organisations. I'm not sure how resolver failures are handled, however; likely similarly to how nginx handles them when resolving dynamically.

splitice, Sep 04 '23 09:09