stubby icon indicating copy to clipboard operation
stubby copied to clipboard

Stubby(v0.2.6)DNS resolvers are changed again and again even if the round_robin_upstreams='0'

Open sangs303 opened this issue 1 year ago • 4 comments

(OpenWrt v18.06.8, r7989-82fbd85747)

root@OpenWrt:~# uci show stubby stubby.global=stubby stubby.global.manual='0' stubby.global.trigger='wan' stubby.global.tls_query_padding_blocksize='128' stubby.global.appdata_dir='/var/lib/stubby' stubby.global.edns_client_subnet_private='1' stubby.global.idle_timeout='10000' stubby.global.listen_address='127.0.0.1@5453' '0::1@5453' stubby.global.round_robin_upstreams='0' stubby.global.tls_authentication='GETDNS_AUTHENTICATION_NONE' stubby.global.log_level='7' stubby.global.dns_transport='GETDNS_TRANSPORT_TLS' 'GETDNS_TRANSPORT_TCP' 'GETDNS_TRANSPORT_UDP' stubby.@resolver[0]=resolver stubby.@resolver[0].address='44.194.59.83' stubby.@resolver[0].tls_auth_name='yyy' stubby.@resolver[1]=resolver stubby.@resolver[1].address='100.25.249.20' stubby.@resolver[1].tls_auth_name='xxx' stubby.@resolver[2]=resolver stubby.@resolver[2].address='34.195.80.81' stubby.@resolver[2].tls_auth_name='zzz' stubby.@resolver[3]=resolver stubby.@resolver[3].address='10.12.158.254' stubby.@resolver[3].tls_auth_name='xxx' stubby.@resolver[4]=resolver stubby.@resolver[4].address='10.176.254.254' stubby.@resolver[4].tls_auth_name='xxx' stubby.@resolver[5]=resolver stubby.@resolver[5].address='10.180.254.254' stubby.@resolver[5].tls_auth_name='xxx'

root@OpenWrt:~# stubby -V Stubby 0.2.6

Executed the below command from the OpenWrt terminal,

while true; do dig @127.0.0.1 -p 5453 google.com; sleep 11; done

Logs:



Thu Aug 31 06:53:47 2023 daemon.err stubby[2897]: [06:53:47.130219] STUBBY: 34.195.80.81                             : Conn closed: TLS - Resps=     1, Timeouts  =     0, Curr_auth =Success, Keepalive(ms)= 10000
Thu Aug 31 06:53:47 2023 daemon.err stubby[2897]: [06:53:47.130254] STUBBY: 34.195.80.81                             : Upstream   : TLS - Resps=    27, Timeouts  =     2, Best_auth =Success
Thu Aug 31 06:53:47 2023 daemon.err stubby[2897]: [06:53:47.130267] STUBBY: 34.195.80.81                             : Upstream   : TLS - Conns=    18, Conn_fails=     0, Conn_shuts=     18, Backoffs     =     0
Thu Aug 31 06:53:50 2023 daemon.err stubby[2897]: [06:53:50.142211] STUBBY: 44.194.59.83                             : Conn opened: TLS - Opportunistic Profile
Thu Aug 31 06:53:59 2023 daemon.err stubby[2897]: [06:53:59.100064] STUBBY: 44.194.59.83                             : Conn closed: TLS - Resps=     1, Timeouts  =     0, Curr_auth =Success, Keepalive(ms)= 10000
Thu Aug 31 06:53:59 2023 daemon.err stubby[2897]: [06:53:59.100098] STUBBY: 44.194.59.83                             : Upstream   : TLS - Resps=    33, Timeouts  =     4, Best_auth =Success
Thu Aug 31 06:53:59 2023 daemon.err stubby[2897]: [06:53:59.100112] STUBBY: 44.194.59.83                             : Upstream   : TLS - Conns=    23, Conn_fails=     0, Conn_shuts=     19, Backoffs     =     0
Thu Aug 31 06:54:00 2023 daemon.err stubby[2897]: [06:54:00.483966] STUBBY: 100.25.249.20                            : Conn opened: TLS - Opportunistic Profile
Thu Aug 31 06:54:10 2023 daemon.err stubby[2897]: [06:54:10.354949] STUBBY: 100.25.249.20                            : Conn closed: TLS - Resps=     3, Timeouts  =     0, Curr_auth =Success, Keepalive(ms)= 10000
Thu Aug 31 06:54:10 2023 daemon.err stubby[2897]: [06:54:10.354983] STUBBY: 100.25.249.20                            : Upstream   : TLS - Resps=    35, Timeouts  =     5, Best_auth =Success
Thu Aug 31 06:54:10 2023 daemon.err stubby[2897]: [06:54:10.354996] STUBBY: 100.25.249.20                            : Upstream   : TLS - Conns=    21, Conn_fails=     0, Conn_shuts=     19, Backoffs     =     0
Thu Aug 31 06:54:13 2023 daemon.err stubby[2897]: [06:54:13.365225] STUBBY: 100.25.249.20                            : Conn opened: TLS - Opportunistic Profile
Thu Aug 31 06:54:22 2023 daemon.err stubby[2897]: [06:54:22.275697] STUBBY: 100.25.249.20                            : Conn closed: TLS - Resps=     1, Timeouts  =     0, Curr_auth =Success, Keepalive(ms)= 10000
Thu Aug 31 06:54:22 2023 daemon.err stubby[2897]: [06:54:22.275730] STUBBY: 100.25.249.20                            : Upstream   : TLS - Resps=    36, Timeouts  =     5, Best_auth =Success
Thu Aug 31 06:54:22 2023 daemon.err stubby[2897]: [06:54:22.275744] STUBBY: 100.25.249.20                            : Upstream   : TLS - Conns=    22, Conn_fails=     0, Conn_shuts=     20, Backoffs     =     0
Thu Aug 31 06:54:25 2023 daemon.err stubby[2897]: [06:54:25.286466] STUBBY: 34.195.80.81                             : Conn opened: TLS - Opportunistic Profile
Thu Aug 31 06:54:34 2023 daemon.err stubby[2897]: [06:54:34.220355] STUBBY: 34.195.80.81                             : Conn closed: TLS - Resps=     1, Timeouts  =     0, Curr_auth =Success, Keepalive(ms)= 10000
Thu Aug 31 06:54:34 2023 daemon.err stubby[2897]: [06:54:34.220390] STUBBY: 34.195.80.81                             : Upstream   : TLS - Resps=    28, Timeouts  =     2, Best_auth =Success
Thu Aug 31 06:54:34 2023 daemon.err stubby[2897]: [06:54:34.220403] STUBBY: 34.195.80.81                             : Upstream   : TLS - Conns=    19, Conn_fails=     0, Conn_shuts=     19, Backoffs     =     0
Thu Aug 31 06:54:37 2023 daemon.err stubby[2897]: [06:54:37.233295] STUBBY: 44.194.59.83                             : Conn opened: TLS - Opportunistic Profile
Thu Aug 31 06:54:46 2023 daemon.err stubby[2897]: [06:54:46.224247] STUBBY: 44.194.59.83                             : Conn closed: TLS - Resps=     1, Timeouts  =     0, Curr_auth =Success, Keepalive(ms)= 10000
Thu Aug 31 06:54:46 2023 daemon.err stubby[2897]: [06:54:46.224281] STUBBY: 44.194.59.83                             : Upstream   : TLS - Resps=    34, Timeouts  =     4, Best_auth =Success
Thu Aug 31 06:54:46 2023 daemon.err stubby[2897]: [06:54:46.224294] STUBBY: 44.194.59.83                             : Upstream   : TLS - Conns=    24, Conn_fails=     0, Conn_shuts=     20,

sangs303 avatar Aug 31 '23 08:08 sangs303

I have tried the with the Cloudflare server and observed the same behavior.

root@OpenWrt:~# uci show stubby
stubby.global=stubby
stubby.global.manual='0'
stubby.global.trigger='wan'
stubby.global.tls_query_padding_blocksize='128'
stubby.global.appdata_dir='/var/lib/stubby'
stubby.global.edns_client_subnet_private='1'
stubby.global.idle_timeout='10000'
stubby.global.listen_address='127.0.0.1@5453' '0::1@5453'
stubby.global.round_robin_upstreams='0'
stubby.global.log_level='7'
stubby.global.dns_transport='GETDNS_TRANSPORT_TLS' 'GETDNS_TRANSPORT_UDP' 'GETDNS_TRANSPORT_TCP'
stubby.@resolver[0]=resolver
stubby.@resolver[0].address='1.1.1.1'
stubby.@resolver[0].tls_auth_name='cloudflare-dns.com'
stubby.@resolver[1]=resolver
stubby.@resolver[1].address='1.0.0.1'
stubby.@resolver[1].tls_auth_name='cloudflare-dns.com'

sangs303 avatar Aug 31 '23 12:08 sangs303

As per the document,

* round_robin_upstreams: Round-robin queries across all the configured upstream servers. Without this option, Stubby will use each upstream server sequentially until it becomes unavailable and then move on to use the next.

But it does not behave like mentioned. why? Please suggest.

root@OpenWrt:~# logread -f | grep stubby
Thu Aug 31 12:46:40 2023 daemon.err stubby[9661]: [12:46:40.746907] STUBBY: 1.0.0.1                                  : Conn closed: TLS - Resps=     1, Timeouts  =     0, Curr_auth =Success, Keepalive(ms)= 10000
Thu Aug 31 12:46:40 2023 daemon.err stubby[9661]: [12:46:40.746941] STUBBY: 1.0.0.1                                  : Upstream   : TLS - Resps=    11, Timeouts  =     0, Best_auth =Success
Thu Aug 31 12:46:40 2023 daemon.err stubby[9661]: [12:46:40.746954] STUBBY: 1.0.0.1                                  : Upstream   : TLS - Conns=    11, Conn_fails=     0, Conn_shuts=      9, Backoffs     =     0
Thu Aug 31 12:46:41 2023 daemon.err stubby[9661]: [12:46:41.757714] STUBBY: 1.1.1.1                                  : Conn opened: TLS - Opportunistic Profile
Thu Aug 31 12:46:41 2023 daemon.err stubby[9661]: [12:46:41.856423] STUBBY: 1.1.1.1                                  : Verify passed : TLS
Thu Aug 31 12:46:51 2023 daemon.err stubby[9661]: [12:46:51.883197] STUBBY: 1.1.1.1                                  : Conn closed: TLS - Resps=     1, Timeouts  =     0, Curr_auth =Success, Keepalive(ms)= 10000
Thu Aug 31 12:46:51 2023 daemon.err stubby[9661]: [12:46:51.883233] STUBBY: 1.1.1.1                                  : Upstream   : TLS - Resps=    10, Timeouts  =     0, Best_auth =Success
Thu Aug 31 12:46:51 2023 daemon.err stubby[9661]: [12:46:51.883246] STUBBY: 1.1.1.1                                  : Upstream   : TLS - Conns=    10, Conn_fails=     0, Conn_shuts=      9, Backoffs     =     0
Thu Aug 31 12:46:52 2023 daemon.err stubby[9661]: [12:46:52.894108] STUBBY: 1.0.0.1                                  : Conn opened: TLS - Opportunistic Profile
Thu Aug 31 12:47:03 2023 daemon.err stubby[9661]: [12:47:03.013683] STUBBY: 1.0.0.1                                  : Conn closed: TLS - Resps=     1, Timeouts  =     0, Curr_auth =Success, Keepalive(ms)= 10000
Thu Aug 31 12:47:03 2023 daemon.err stubby[9661]: [12:47:03.013718] STUBBY: 1.0.0.1                                  : Upstream   : TLS - Resps=    12, Timeouts  =     0, Best_auth =Success
Thu Aug 31 12:47:03 2023 daemon.err stubby[9661]: [12:47:03.013731] STUBBY: 1.0.0.1                                  : Upstream   : TLS - Conns=    12, Conn_fails=     0, Conn_shuts=     10, Backoffs     =     0
Thu Aug 31 12:47:04 2023 daemon.err stubby[9661]: [12:47:04.024454] STUBBY: 1.1.1.1                                  : Conn opened: TLS - Opportunistic Profile
Thu Aug 31 12:47:14 2023 daemon.err stubby[9661]: [12:47:14.189701] STUBBY: 1.1.1.1                                  : Conn closed: TLS - Resps=     1, Timeouts  =     0, Curr_auth =Success, Keepalive(ms)= 10000
Thu Aug 31 12:47:14 2023 daemon.err stubby[9661]: [12:47:14.189735] STUBBY: 1.1.1.1                                  : Upstream   : TLS - Resps=    11, Timeouts  =     0, Best_auth =Success
Thu Aug 31 12:47:14 2023 daemon.err stubby[9661]: [12:47:14.189749] STUBBY: 1.1.1.1                                  : Upstream   : TLS - Conns=    11, Conn_fails=     0, Co
```nn_shuts=     10, Backoffs     =     0

sangs303 avatar Aug 31 '23 12:08 sangs303

I think the issue here is the idle_timeout of 10000, which is long enough that the upstream is actually shutting the connection each time (instead of stubby) - see the Conn_shuts= value in the logs. This triggers the unavailable criteria for stubby and causes it to move to the next upstream. You want this idle_timeout value as high as possible for efficiency but some resolvers apply limits server side e.g. Cloudflare

If you look in the config file you will see this comment:

## Cloudflare 1.1.1.1 and 1.0.0.1
## (NOTE: recommend reducing idle_timeout to 9000 if using Cloudflare)

Which is because of this behaviour. I see the following log if I do reduce idle_timeout to 9000:

[16:47:32.623206] STUBBY: 1.1.1.1                                  : Conn opened: TLS - Strict Profile
[16:47:32.882675] STUBBY: 1.1.1.1                                  : Verify passed : TLS
[16:47:41.912256] STUBBY: 1.1.1.1                                  : Conn closed: TLS - Resps=     1, Timeouts  =     0, Curr_auth =Success, Keepalive(ms)=  9000
[16:47:41.912272] STUBBY: 1.1.1.1                                  : Upstream   : TLS - Resps=     1, Timeouts  =     0, Best_auth =Success
[16:47:41.912275] STUBBY: 1.1.1.1                                  : Upstream   : TLS - Conns=     1, Conn_fails=     0, Conn_shuts=      0, Backoffs     =     0
[16:47:43.924166] STUBBY: 1.1.1.1                                  : Conn opened: TLS - Strict Profile
[16:47:43.990580] STUBBY: 1.1.1.1                                  : Verify passed : TLS
[16:47:53.060844] STUBBY: 1.1.1.1                                  : Conn closed: TLS - Resps=     1, Timeouts  =     0, Curr_auth =Success, Keepalive(ms)=  9000
[16:47:53.060860] STUBBY: 1.1.1.1                                  : Upstream   : TLS - Resps=     2, Timeouts  =     0, Best_auth =Success
[16:47:53.060863] STUBBY: 1.1.1.1                                  : Upstream   : TLS - Conns=     2, Conn_fails=     0, Conn_shuts=      0, Backoffs     =     0
[16:47:55.070885] STUBBY: 1.1.1.1                                  : Conn opened: TLS - Strict Profile
[16:47:55.104292] STUBBY: 1.1.1.1                                  : Verify passed : TLS
[16:48:04.174794] STUBBY: 1.1.1.1                                  : Conn closed: TLS - Resps=     1, Timeouts  =     0, Curr_auth =Success, Keepalive(ms)=  9000
[16:48:04.174812] STUBBY: 1.1.1.1                                  : Upstream   : TLS - Resps=     3, Timeouts  =     0, Best_auth =Success
[16:48:04.174815] STUBBY: 1.1.1.1                                  : Upstream   : TLS - Conns=     3, Conn_fails=     0, Conn_shuts=      0, Backoffs     =     0
[16:48:06.187998] STUBBY: 1.1.1.1                                  : Conn opened: TLS - Strict Profile

Please try this modification and it should change the behaviour with Cloudflare. You will have to experiment to see what is needed with your other resolvers.

saradickinson avatar Sep 01 '23 16:09 saradickinson

Thanks, it working fine. Let me try with all scenerio

sangs303 avatar Sep 04 '23 06:09 sangs303