
Surprisingly high number of lost lookups: about 20% false negatives

Open ghost opened this issue 3 years ago • 3 comments

Hey there,

I noticed quite a lot of lost lookups with massdns. In the example below, it misses roughly 1,900 names out of 10k domains.

I am not sure where these false negatives come from, though. Apologies for not trying to find where the bug is in massdns.

I made a screen recording https://asciinema.org/a/415235 of the reproduction steps below:

$ docker run --rm -it --entrypoint sh blechschmidt/massdns
/massdns # apk --no-cache add git alpine-sdk curl ipython py3-dnspython
/massdns # git pull
/massdns # make
/massdns # cat > /resolvers.txt <<EOF # See https://gist.github.com/seb-elttam/af28008b092eb5bcdfede8565c55147e#file-mtresolver-py-L18
1.1.1.1
1.0.0.1
8.8.8.8
8.8.4.4
9.9.9.10
149.112.112.10
94.140.14.140
94.140.14.141
64.6.64.6
64.6.65.6
77.88.8.8
77.88.8.1
74.82.42.42
EOF

/massdns # curl -s http://s3.amazonaws.com/alexa-static/top-1m.csv.zip | unzip -p - | cut -d, -f2- \
 | head -n 10k  > /domains.txt

/massdns # time ./bin/massdns -s 10000 -q -r /resolvers.txt -o S -w /out.txt /domains.txt; \
 cat /out.txt | grep ' A [0-9]' | cut -d' ' -f1 | sort -u | wc -l
real    0m 7.07s
user    0m 0.09s
sys     0m 0.21s
7534

/massdns # time ./bin/massdns -s 1000 -q -r /resolvers.txt -o S -w /out.txt /domains.txt; \
 cat /out.txt | grep ' A [0-9]' | cut -d' ' -f1 | sort -u | wc -l
real	0m 5.03s
user	0m 0.10s
sys	0m 0.18s
7911

/massdns # git clone https://gist.github.com/seb-elttam/af28008b092eb5bcdfede8565c55147e /gist && cd /gist
/gist # ipython
Python 3.9.5 (default, May 12 2021, 20:44:22) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.23.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from mtresolver import *
   ...: r = resolve_hostnames('/domains.txt', 1000)
   ...: len(r)
06:38:03 DEBUG enter
06:38:21 DEBUG exit
Out[1]: 9847
In [2]: from mtresolver import *
   ...: r = resolve_hostnames('/domains.txt', 10000)
   ...: len(r)
06:39:07 DEBUG enter
06:39:25 DEBUG exit
Out[2]: 9850

ghost avatar May 20 '21 06:05 ghost

I get quite a lot as well. I ended up writing my own script to go through the list of domains multiple times. If after 10 passes a domain still hasn't got an IP, then chances are it's dead. Not very efficient though :(
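For readers who want to try the multi-pass workaround described above, here is a minimal sketch. It is not the commenter's actual script. `resolve_once` is a hypothetical callable (for example, a wrapper that runs massdns on a batch and parses its output) that takes a list of names and returns a dict of the names it managed to resolve.

```python
def resolve_with_passes(names, resolve_once, max_passes=10):
    """Re-query unresolved names for up to max_passes passes.

    resolve_once is a hypothetical callable: it takes a list of names and
    returns a dict mapping each name it resolved to its answers.
    Returns (resolved, still_unresolved).
    """
    resolved = {}
    # Drop duplicate input names while keeping their original order.
    pending = list(dict.fromkeys(names))
    for _ in range(max_passes):
        if not pending:
            break
        resolved.update(resolve_once(pending))
        # Only names that still have no answer go into the next pass.
        pending = [name for name in pending if name not in resolved]
    return resolved, pending
```

Names left in the second return value after the last pass are the ones this approach would treat as dead. Each pass only re-queries the remainder, so later passes get cheaper, but a name that fails for a persistent reason (such as a rate-limited resolver) is still retried every time.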

youradds avatar Jul 28 '21 05:07 youradds

I'm having the same problem, and I see inconsistent results between scans as well on my DigitalOcean VPS.

ko2sec avatar Aug 25 '21 22:08 ko2sec

To debug the issue, I suggest the following:

  1. Clone the latest massdns version. 2b394082ea8b45b850718861185194920604e49d fixes an issue, though it is a minor one and only affects mixed resolver lists. 352187ce86b1ffa4038057a77460f4f7473ec038 changes the default response codes for which to retry queries.
  2. Use the -o Je output option and --error-log /tmp/error.log. This will log all input as well as output failures.
  3. The number of lines inside the MassDNS NDJSON output and the number of lines returned by grep -E '^Illegal|^Duplicate' /tmp/error.log should add up exactly to the number of supplied input domains. If they don't, there is a bug in MassDNS.
  4. Run jq '. | select(.error != null)' on the MassDNS output. This will show all output failures (e.g. due to timeouts or when the last packet received has an unacceptable return code). If you see many TIMEOUT and MAXRETRIES errors, you are hitting network congestion, resolver rate limits, or both.
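The bookkeeping in steps 3 and 4 can also be scripted. The following is a rough sketch, not part of MassDNS. It assumes that every non-empty line of the input file counts as one supplied domain, and that the `error` field of an NDJSON record can be used as a label once converted to a string. The function names are made up for illustration.

```python
import json
import re
from collections import Counter


def check_accounting(input_lines, ndjson_lines, error_log_lines):
    """Step 3: output lines plus rejected inputs should equal the input count."""
    # Assumption: each non-empty input line is one supplied domain.
    inputs = sum(1 for line in input_lines if line.strip())
    outputs = sum(1 for line in ndjson_lines if line.strip())
    # Same filter as: grep -E '^Illegal|^Duplicate' /tmp/error.log
    rejected = sum(
        1 for line in error_log_lines if re.match(r"Illegal|Duplicate", line)
    )
    return {
        "inputs": inputs,
        "outputs": outputs,
        "rejected": rejected,
        "balanced": inputs == outputs + rejected,
    }


def tally_errors(ndjson_lines):
    """Step 4: count records whose error field is set, grouped by its value."""
    errors = Counter()
    for line in ndjson_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("error") is not None:
            # Assumption: str() of the error value is a usable label.
            errors[str(record["error"])] += 1
    return errors
```

Both functions accept any iterable of lines, so open file handles for the domain list, the `-o Je` output, and `/tmp/error.log` can be passed in directly. If `balanced` comes back false, that points to the MassDNS bug case from step 3. If the tally is dominated by TIMEOUT and MAXRETRIES, that points to congestion or rate limiting.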

In addition, for reconnaissance scans of a single domain, I suggest querying the authoritative nameservers directly instead of going through third-party resolvers, like so: ./bin/massdns -r <(./scripts/auth-addrs.sh example.com) --norecurse -o Je --error-log /tmp/error.log /tmp/names.txt

blechschmidt avatar Sep 23 '21 00:09 blechschmidt