ngx_upstream_jdomain
Stale DNS Lookup Issue
We are currently using ngx_upstream_jdomain release 1.4.0 as a forwarding proxy for talking to some downstream services that have rotating IP addresses. We run this in Kubernetes clusters in multiple cloud regions. When IP addresses rotate for a subset of our downstreams (for example, today in 2/3 regions, in 5/24 pods), we see:
2023/08/29 08:32:14 [error] 12#12: ngx_http_upstream_jdomain_module: resolver failed, "www.example.com" (110: Operation timed out)
This issue does not recover by itself, and we stay stuck on a stale IP address for the service. Our jdomain configs look like:
resolver 8.8.8.8 8.8.4.4;
upstream example {
    keepalive 32;
    keepalive_requests 100;
    keepalive_timeout 60s;
    jdomain www.example.com port=443 interval=60;
}
We've experienced the issue on both nginx-1.20.1 and 1.23.3.
Any help with recommended next steps or debugging would be appreciated. The error is a timeout when talking to the DNS server, so I wondered: does ngx_upstream_jdomain try to re-establish a connection to the DNS servers when there's a connection issue?
I haven't seen it.
And I've used this module at multiple organisations. I'm not sure how resolver failures are handled, however; likely similarly to how nginx handles them when resolving dynamically.
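For comparison, here is a minimal sketch of what "resolving dynamically" looks like in stock nginx without jdomain. It is only an illustration: the listen port, the `$backend` variable name, and the `valid=` and `resolver_timeout` values are assumptions, not settings from the setup above, and it says nothing about how jdomain handles resolver failures internally.

```nginx
# Sketch only: stock nginx, no jdomain. Port, variable name and timings are illustrative assumptions.
resolver 8.8.8.8 8.8.4.4 valid=60s;   # override the DNS TTL; cache answers for 60 seconds
resolver_timeout 5s;                  # stop waiting on a slow lookup after 5 seconds

server {
    listen 8080;

    location / {
        # Because proxy_pass contains a variable, nginx looks the name up at
        # request time through the resolver above instead of once at startup.
        set $backend "www.example.com";
        proxy_pass https://$backend:443;
        proxy_ssl_server_name on;        # send SNI to the TLS backend
        proxy_set_header Host $backend;
    }
}
```

The trade-off is that this form bypasses the upstream block, so the keepalive settings from the original config would not apply. It can still be useful as a side-by-side test to see whether the resolver timeouts also appear without the module.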