SERVFAIL for one domain only

Open jawadcogilent opened this issue 4 years ago • 7 comments

Hi,

I'm using stubby 0.2.6 (along with dnsmasq 2.80). Everything is working fine except for one domain name 'shapd.org.uk'.

Following is the dig output:

; <<>> DiG 9.14.8 <<>> -p 5453 shapd.org.uk
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 8663
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;shapd.org.uk.                  IN      A

;; Query time: 583 msec
;; SERVER: 127.0.0.1#5453(127.0.0.1)
;; WHEN: Thu Mar 12 19:45:36 PKT 2020
;; MSG SIZE  rcvd: 30
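
For reference, the header lines of the dig output above can be read programmatically. The following is only a minimal sketch: the function name `parse_dig_header` is made up for illustration, and the sample text is copied from the output above. It pulls out the status, the flags and the answer count, which shows that the reply is a SERVFAIL with no answers and without the `ra` (recursion available) flag, matching the warning dig prints.

```python
import re

# Header lines copied from the dig output above.
DIG_OUTPUT = """\
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 8663
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
"""


def parse_dig_header(text):
    """Extract status, flags and answer count from dig's header lines.

    Hypothetical helper for illustration; it only understands the two
    header lines shown above, not dig output in general.
    """
    status = re.search(r"status:\s*(\w+)", text)
    flags = re.search(r"flags:\s*([^;]*);", text)
    answers = re.search(r"ANSWER:\s*(\d+)", text)
    return {
        "status": status.group(1) if status else None,
        "flags": flags.group(1).split() if flags else [],
        "answer_count": int(answers.group(1)) if answers else None,
    }


header = parse_dig_header(DIG_OUTPUT)
print(header["status"], header["flags"], header["answer_count"])
```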

I am using Cloudflare as the recursive resolver. There is no issue if I query Cloudflare directly.

Kindly help.

jawadcogilent avatar Mar 12 '20 14:03 jawadcogilent

I am seeing similar behavior when trying to resolve modmail.msft.chat

sylveon avatar Mar 15 '20 02:03 sylveon

I see intermittent SERVFAIL responses via Stubby for both of the domains you mention, with failures about 1 time in 10, when using TLS as the transport and with DNSSEC validation enabled in Stubby. I don't see any failures when using UDP or when DNSSEC validation is disabled, so I think this is a performance-related problem, not a pure resolution failure. Do you both have dnssec: GETDNS_EXTENSION_TRUE set?

When doing local DNSSEC validation in Stubby, many more queries are required to resolve a single domain, e.g. for shapd.org.uk 16 queries are required instead of 1. Stubby doesn't yet have a cache, so if one of those queries fails or times out then the entire name resolution fails. Some of the records in these domains have quite short TTLs, which might exacerbate the problem. At the moment there are a few options:

  1. Increase the default timeout on queries to reduce the frequency of failures, for example use 10s instead of 5s: timeout: 10000
  2. Disable local DNSSEC validation and rely on the fact that Cloudflare does DNSSEC and you are using TLS to secure the connection
  3. Install another nameserver that can act as a local cache, e.g. Unbound; see https://dnsprivacy.org/wiki/display/DP/DNS+Privacy+Clients#DNSPrivacyClients-Unbound/Stubbycombination
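
To illustrate why needing 16 queries instead of 1 makes failures more visible, here is a rough back-of-the-envelope sketch. It assumes each upstream query fails or times out independently with the same probability, which is a simplification and not something measured in this thread; the function names are made up for illustration.

```python
def resolution_failure_probability(per_query_failure, queries):
    """Chance that at least one of `queries` independent queries fails."""
    return 1.0 - (1.0 - per_query_failure) ** queries


def per_query_failure_for(observed_failure, queries):
    """Per-query failure rate that would explain an observed overall rate."""
    return 1.0 - (1.0 - observed_failure) ** (1.0 / queries)


QUERIES_WITH_DNSSEC = 16   # figure quoted above for shapd.org.uk
OBSERVED_FAILURE = 0.10    # "about 1 time in 10"

# Assumption: failures are independent and equally likely for every query.
p = per_query_failure_for(OBSERVED_FAILURE, QUERIES_WITH_DNSSEC)
print(f"implied per-query failure rate: {p:.4%}")
print(f"failure rate with a single query: "
      f"{resolution_failure_probability(p, 1):.4%}")
print(f"failure rate with {QUERIES_WITH_DNSSEC} queries: "
      f"{resolution_failure_probability(p, QUERIES_WITH_DNSSEC):.4%}")
```

Under that assumption, a per-query failure rate well below 1% is already enough to produce roughly one failed resolution in ten, which is consistent with failures disappearing when validation is turned off.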

saradickinson avatar Mar 19 '20 11:03 saradickinson

I am using OpenWRT 19.07.02 on a Linksys WRT1900AC. On that, 10 out of 10 requests fail.

My responses to the options are as follows:

  1. My default timeout was already 10s.
  2. When I turned off DNSSEC validation it works without any issue.
  3. I am already using dnsmasq for caching.

I turned on DNSSEC validation in dnsmasq and everything is working ok now.

Many thanks

jawadcogilent avatar Mar 19 '20 13:03 jawadcogilent

I am indeed using DNSSEC.

sylveon avatar Mar 19 '20 17:03 sylveon

I should mention a few more things:

  • Upgraded to stubby 0.3 and getdns 1.6 recently, don't recall it happening before that.
  • My resolver is 1.1.1.1
  • I am using DNSSEC validation and DoT as well.
  • Stubby is used as a forwarding resolver for Pi-hole, where DNSSEC is also enabled.

sylveon avatar Mar 20 '20 06:03 sylveon

@sylveon What do you see if you do:

getdns_query -sL @1.1.1.1 modmail.msft.chat. A +dnssec_return_status +return_call_reporting

And

getdns_query -sL @1.1.1.1 shapd.org.uk. A +dnssec_return_status +return_call_reporting

wtoorop avatar Mar 20 '20 09:03 wtoorop

:~# getdns_query -sL @1.1.1.1 shapd.org.uk A +dnssec_return_status +return_call_reporting
{
  "answer_type": GETDNS_NAMETYPE_DNS,
  "call_reporting":
  [
    {
      "idle timeout in ms": 0,
      "query_name": <bindata for shapd.org.uk.>,
      "query_to":
      {
        "address_data": <bindata for 1.1.1.1>,
        "address_type": <bindata of "IPv4">
      },
      "query_type": GETDNS_RRTYPE_A,
      "resolution_type": GETDNS_RESOLUTION_STUB,
      "responses_for_this_upstream": 8,
      "responses_on_this_connection": 8,
      "run_time/ms": 230,
      "server_keepalive_received": 0,
      "timeouts_for_this_upstream": 0,
      "timeouts_on_this_connection": 0,
      "tls_auth_status": <bindata of "None">,
      "tls_peer_cert": <bindata of 0x308205c63082054ca003020102021001...>,
      "tls_version": <bindata of "TLSv1.3">,
      "transport": GETDNS_TRANSPORT_TLS
    }
  ],
  "canonical_name": <bindata for shapd.org.uk.>,
  "just_address_answers":
  [
    {
      "address_data": <bindata for 104.18.56.67>,
      "address_type": <bindata of "IPv4">
    },
    {
      "address_data": <bindata for 104.18.57.67>,
      "address_type": <bindata of "IPv4">
    }
  ],
  "replies_full":
  [
     <bindata of 0x85898190000100030000000105736861...>
  ],
  "replies_tree":
  [
    {
      "additional":
      [
        {
          "do": 1,
          "extended_rcode": 0,
          "rdata":
          {
            "options":
            [
              {
                "option_code": 12,
                "option_data": <bindata of 0x00000000000000000000000000000000...>
              }
            ],
            "rdata_raw": <bindata of 0x000c010f000000000000000000000000...>
          },
          "type": GETDNS_RRTYPE_OPT,
          "udp_payload_size": 1452,
          "version": 0,
          "z": 0
        }
      ],
      "answer":
      [
        {
          "class": GETDNS_RRCLASS_IN,
          "name": <bindata for shapd.org.uk.>,
          "rdata":
          {
            "ipv4_address": <bindata for 104.18.56.67>,
            "rdata_raw": <bindata of 0x68123843>
          },
          "ttl": 300,
          "type": GETDNS_RRTYPE_A
        },
        {
          "class": GETDNS_RRCLASS_IN,
          "name": <bindata for shapd.org.uk.>,
          "rdata":
          {
            "ipv4_address": <bindata for 104.18.57.67>,
            "rdata_raw": <bindata of 0x68123943>
          },
          "ttl": 300,
          "type": GETDNS_RRTYPE_A
        },
        {
          "class": GETDNS_RRCLASS_IN,
          "name": <bindata for shapd.org.uk.>,
          "rdata":
          {
            "algorithm": 13,
            "key_tag": 34505,
            "labels": 3,
            "original_ttl": 300,
            "rdata_raw": <bindata of 0x00010d030000012c5e75effb5e7330db...>,
            "signature": <bindata of 0x7905d50fe8449516b6b4f5cec33cea7d...>,
            "signature_expiration": 1584787451,
            "signature_inception": 1584607451,
            "signers_name": <bindata for shapd.org.uk.>,
            "type_covered": GETDNS_RRTYPE_A
          },
          "ttl": 300,
          "type": GETDNS_RRTYPE_RRSIG
        }
      ],
      "answer_type": GETDNS_NAMETYPE_DNS,
      "authority": [],
      "canonical_name": <bindata for shapd.org.uk.>,
      "dnssec_status": GETDNS_DNSSEC_SECURE,
      "header":
      {
        "aa": 0,
        "ad": 0,
        "ancount": 3,
        "arcount": 1,
        "cd": 1,
        "id": 34185,
        "nscount": 0,
        "opcode": GETDNS_OPCODE_QUERY,
        "qdcount": 1,
        "qr": 1,
        "ra": 1,
        "rcode": GETDNS_RCODE_NOERROR,
        "rd": 1,
        "tc": 0,
        "z": 0
      },
      "question":
      {
        "qclass": GETDNS_RRCLASS_IN,
        "qname": <bindata for shapd.org.uk.>,
        "qtype": GETDNS_RRTYPE_A
      }
    }
  ],
  "status": GETDNS_RESPSTATUS_GOOD
}
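
As a small aid for reading the RRSIG in the output above, the sketch below converts the `signature_inception` and `signature_expiration` values into UTC timestamps and checks whether a given moment falls inside that window. The check time is an assumption (the post timestamp, treated as UTC), and the plain integer comparison ignores the serial number arithmetic that DNSSEC uses for these fields; the helper names are made up for illustration.

```python
from datetime import datetime, timezone

# Values copied from the RRSIG record in the output above.
SIGNATURE_INCEPTION = 1584607451
SIGNATURE_EXPIRATION = 1584787451


def to_utc(epoch_seconds):
    """Convert seconds since the Unix epoch to an aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def signature_is_valid_at(when, inception, expiration):
    """Simplified validity check: inception <= when <= expiration."""
    return inception <= when.timestamp() <= expiration


inception = to_utc(SIGNATURE_INCEPTION)
expiration = to_utc(SIGNATURE_EXPIRATION)
validity_hours = (SIGNATURE_EXPIRATION - SIGNATURE_INCEPTION) / 3600

# Assumption: the query ran around the time of the post, taken as UTC.
assumed_query_time = datetime(2020, 3, 20, 10, 3, tzinfo=timezone.utc)

print("inception: ", inception.isoformat())
print("expiration:", expiration.isoformat())
print("window (hours):", validity_hours)
print("valid at assumed query time:",
      signature_is_valid_at(assumed_query_time,
                            SIGNATURE_INCEPTION, SIGNATURE_EXPIRATION))
```

A short signature window like this one, together with the 300 second TTLs in the answer, fits the earlier remark that short-lived records may make an uncached validating setup more sensitive to individual query failures.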

jawadcogilent avatar Mar 20 '20 10:03 jawadcogilent