When upstream is unavailable, Blocky returns "Not Ready" responses for queries that do not rely on upstream
I have this Blocky setup:
upstreams:
  init:
    strategy: fast
  groups:
    default:
      - 10.64.0.1
  strategy: parallel_best
conditional:
  fallbackUpstream: false
  mapping:
    local.dev: 192.168.1.1
    168.192.in-addr.arpa: 192.168.1.1
10.64.0.1 is a DNS server on the far side of a WireGuard tunnel from my router. If that tunnel is offline (and Blocky thus can't reach upstream), and I query a host in my internal domain local.dev, I get this response from Blocky:
root@GatewayMax:~# dig @ns.local.dev gw.local.dev
; <<>> DiG 9.16.50-Debian <<>> @ns.local.dev gw.local.dev
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 6281
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; EDE: 14 (Not Ready)
;; QUESTION SECTION:
;gw.local.dev. IN A
;; Query time: 10 msec
;; SERVER: 192.168.1.7#53(192.168.1.7)
;; WHEN: Sun Nov 10 10:59:35 AEDT 2024
;; MSG SIZE rcvd: 54
It appears that, because the primary upstream is unavailable, Blocky refuses to answer the query before checking whether it actually needs upstream to respond to it (it doesn't).
Expected behaviour: Blocky recognises that this query should be forwarded to 192.168.1.1 and does so, even though upstream is down.
Possibly relevant: local.dev is a placeholder. I actually use a real domain that has public authoritative nameservers for my internal network, so I can complete ACME DNS challenges. But the A records for my internal hosts aren't on the authoritative nameservers, only on 192.168.1.1.
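To make the expected behaviour concrete, here is a minimal sketch of the resolution order described above: check the conditional mapping first, and only consult upstream readiness if nothing matches. It is an illustration only, not Blocky's actual code (Blocky is written in Go); the names `CONDITIONAL_MAPPING`, `conditional_target`, and `route` are hypothetical, and the longest-suffix matching rule is an assumption about how the mapping is applied.

```python
from typing import Dict, Optional

# Hypothetical mirror of the `conditional.mapping` section of the config above.
CONDITIONAL_MAPPING: Dict[str, str] = {
    "local.dev": "192.168.1.1",
    "168.192.in-addr.arpa": "192.168.1.1",
}


def conditional_target(qname: str, mapping: Dict[str, str]) -> Optional[str]:
    """Return the resolver mapped to the longest matching domain suffix, or None."""
    labels = qname.rstrip(".").lower().split(".")
    # Try the full name first, then progressively shorter suffixes.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in mapping:
            return mapping[suffix]
    return None


def route(qname: str, upstream_ready: bool) -> str:
    """Decide where a query should go, checking the conditional mapping first."""
    target = conditional_target(qname, CONDITIONAL_MAPPING)
    if target is not None:
        # A conditional match does not need the default upstream at all.
        return f"forward to {target}"
    if not upstream_ready:
        return "REFUSED (EDE 14, Not Ready)"
    return "forward to default upstream group"


print(route("gw.local.dev.", upstream_ready=False))
print(route("example.com.", upstream_ready=False))
```

With the tunnel down, the first call is routed to 192.168.1.1 and only the second is refused, which is the ordering this report asks for.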
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.
Still very much an issue, unfortunately.
Keepalive. (Closing bug reports after 90 days is very annoying behaviour.)
Can confirm this happens, and it is annoying: all my homelab services go down when the upstream becomes unreachable, even though my domains are configured for local resolution.
I had an internet outage today and, because of this issue, my clients were not able to resolve local DNS entries. All my local services stopped working for hours until the outage was fixed. Blocky only resumed resolving customDNS entries once the public upstream DNS server was available again.
@0xERR0R Sorry for pinging, not sure if you have seen the issue yet. From my point of view it's a critical issue, since from a client's perspective Blocky stops working without upstream access. Hope it can be fixed.
I tried to reproduce it with the following config:
upstreams:
  init:
    strategy: fast
  groups:
    default:
      - 200.200.200.200
  strategy: parallel_best
conditional:
  fallbackUpstream: false
  mapping:
    local.dev: 192.168.178.1
    168.192.in-addr.arpa: 192.168.178.1
Upstream 200.200.200.200 is not reachable.
`dig @localhost gw.local.dev` returns NXDOMAIN (as expected, and the query is sent to 192.168.178.1). `dig @localhost example.com` ends with a timeout.
Does this error still occur in the latest version of blocky?
I just attempted to reproduce the issue against the v0.26.2 Docker image and wasn't able to do so. Given the age of the initial report, I was probably running v0.24 when I filed it, so it may well have been fixed in one of the intervening versions.
@0xERR0R I just had an unexpected internet outage and took the opportunity to retest this issue. Here's the dig output:
; <<>> DiG 9.10.6 <<>> @192.168.1.7 [redacted.localdomain]
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 233
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; OPT=15: 00 0e ("..")
;; QUESTION SECTION:
;[redacted.localdomain]. IN A
;; Query time: 7 msec
;; SERVER: 192.168.1.7#53(192.168.1.7)
;; WHEN: Thu Aug 14 18:14:33 AEST 2025
;; MSG SIZE rcvd: 57
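A note on reading the output above: the line `; OPT=15: 00 0e` is the raw form of an Extended DNS Error option (EDNS option code 15, RFC 8914), which this older dig apparently does not decode. The first two bytes are a big-endian info code, so `00 0e` is 14, "Not Ready", the same error as in the original report. A minimal sketch of the decoding follows; the `decode_ede` helper and the trimmed `EDE_CODES` table are illustrative, not part of any library.

```python
import struct
from typing import Tuple

# Small subset of the info codes registered by RFC 8914.
EDE_CODES = {
    0: "Other",
    14: "Not Ready",
    15: "Blocked",
    22: "No Reachable Authority",
    23: "Network Error",
}


def decode_ede(option_data: bytes) -> Tuple[int, str, str]:
    """Decode the payload of an EDNS option 15 (Extended DNS Error).

    Returns (info_code, code_name, extra_text).
    """
    if len(option_data) < 2:
        raise ValueError("EDE payload must contain a 2-byte info code")
    # INFO-CODE is a 16-bit unsigned integer in network byte order.
    (info_code,) = struct.unpack("!H", option_data[:2])
    # Anything after the info code is optional UTF-8 EXTRA-TEXT.
    extra_text = option_data[2:].decode("utf-8", errors="replace")
    return info_code, EDE_CODES.get(info_code, "unknown"), extra_text


# Payload bytes as printed by dig: "00 0e"
print(decode_ede(bytes.fromhex("000e")))  # (14, 'Not Ready', '')
```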
This is running blocky v0.26.2 in a Docker container.
My log is full of lines like this, with all the upstream queries timing out:
[2025-08-14 18:13:36] ERROR error on processing request:upstream 'tcp+udp:10.64.0.1': can't resolve request via upstream server tcp+udp:10.64.0.1 (10.64.0.1:53): read udp 172.29.8.2:44390->10.64.0.1:53: i/o timeout client_ip=192.168.1.42 question=A (api.dropboxapi.com.) req_id=6ea5198a-5de6-4ce1-9376-029faf1dedde
However, there is no log line matching the query for [redacted.localdomain]. I don't know what to make of that.
When I have some time, I'll make another attempt to construct a set of lab conditions to reproduce the issue.