OPCODE in dnsdist
- Program: dnsdist
- Issue type: Feature request
Short description
Make the OPCODE more visible in dnsdist, for example by showing it in the output of grepq("2000ms").
Usecase
This use case is partly linked to issue https://github.com/PowerDNS/pdns/issues/10624
We run dnsdist as a resolver for our customers, with PowerDNS Recursor as the backend. When you start using dnsdist, you run showServers() quite often to check that everything looks OK.
The first thing you notice is that Drops keeps increasing, and you wonder what is happening.
showServers()
#   Name   Address             State  Qps     Qlim  Ord  Wt  Queries     Drops     Drate  Lat   TCP     Outstanding  Pools
0   rec01  xxx.160.127.242:53  up     1766.9  0     1    1   804515002   3269144   2.0    18.6  1253.1  75
1   rec02  xxx.160.127.243:53  up     1000.9  0     1    1   805428330   3276349   2.0    51.4  2654.3  79
2   rec03  xxx.208.42.18:53    up     810.9   0     1    1   588556509   2756507   2.0    70.0  1085.7  75
3   rec04  xxx.208.42.19:53    up     1178.9  0     1    1   584137674   2746558   2.0    33.4  1636.4  76
All                                   4754.0                 2782637515  12048558
Then you run grepq("2000ms") and see this (the list is shortened; the full output has about 50-100 entries for google.com):
-1.1  xxx.122.132.90:42761  DoUDP  xxx.160.127.242:53  9574   google.com.  A  T.O  RD  No Error. 0 answers
-1.1  xxx.8.60.168:56943    DoUDP  xxx.160.127.242:53  5063   google.com.  A  T.O  RD  No Error. 0 answers
-1.1  xxx.64.111.10:52370   DoUDP  xxx.208.42.18:53    17746  google.com.  A  T.O  RD  No Error. 0 answers
-1.1  xxx.64.111.10:38445   DoUDP  xxx.208.42.18:53    37625  google.com.  A  T.O  RD  No Error. 0 answers
-1.1  xxx.8.60.168:56943    DoUDP  xxx.208.42.18:53    5063   google.com.  A  T.O  RD  No Error. 0 answers
When you see this for the first time, it is easy to conclude that something is broken. It is not easy to work out that the queries are being dropped by pdns-recursor because OPCODE=2.
Description
I don't know the best solution, but maybe:
- Have separate counters for OPCODE drops.
- Update doc with example how to DROP OPCODE=2 queries in dnsdist.
- Update doc with example to rewrite OPCODE=2 to OPCODE=0 in dnsdist.
- If the solution is not in dnsdist maybe revisit https://github.com/PowerDNS/pdns/issues/10624
Hi!
I understand the issue, but I'm unsure what the solution is.
1. Have separate counters for OPCODE drops.
This seems too specific to me, if we go this way I'm afraid we will end up with so many counters that it's impossible to know what's going on. It might even impact the performance when collecting metrics.
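For anyone who still wants per-opcode numbers without a built-in counter, a rough user-side sketch in the dnsdist Lua configuration could look like the following. It is not an existing dnsdist feature: `droppedByOpcode`, `countAndDrop` and `opcodeDropCounts` are hypothetical names, and the sketch assumes that `dq.opcode` is readable inside a `LuaAction` callback. Only non-Query opcodes are routed through Lua, so regular queries should not pay the extra cost.

```lua
-- Hypothetical per-opcode counter kept in a plain Lua table (not a dnsdist metric).
local droppedByOpcode = {}

-- Hypothetical helper: record the opcode of the query, then drop it.
local function countAndDrop(dq)
  local op = dq.opcode
  droppedByOpcode[op] = (droppedByOpcode[op] or 0) + 1
  return DNSAction.Drop
end

-- Only queries whose opcode is not Query (0) reach the Lua callback.
addAction(NotRule(OpcodeRule(DNSOpcode.Query)), LuaAction(countAndDrop))

-- Hypothetical helper that returns the counters as text, one opcode per line.
function opcodeDropCounts()
  local lines = {}
  for op, n in pairs(droppedByOpcode) do
    table.insert(lines, string.format("opcode %d: %d", op, n))
  end
  return table.concat(lines, "\n")
end
```

The counters live only in the Lua state, so they reset on restart and do not show up in the regular dnsdist statistics.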
2. Update doc with example how to DROP OPCODE=2 queries in dnsdist.
Sure, I would merge a pull request adding an example to https://dnsdist.org/rules-actions.html#OpcodeRule
3. Update doc with example to rewrite OPCODE=2 to OPCODE=0 in dnsdist.
I would really advise against doing something like that, dnsdist tries very hard to not rewrite queries or responses.
4. If the solution is not in dnsdist maybe revisit [Make recursor reply to queries with OPCODE=2 #10624](https://github.com/PowerDNS/pdns/issues/10624)
Well, it would certainly make the metrics in dnsdist look better, but on the other hand it would mean spending resources to generate and send a response in the recursor, and resources to forward the response in dnsdist, all for a response that is not going to be useful to the client. In theory the solution would be finding out who is sending these nonsense queries and kindly asking them to stop doing that, but that's probably not going to happen. So perhaps we could add an option to the recursor, but this needs to be discussed in #10624.
- add an opcode column to grepq output?
Technically we can, but the amount of information there is already huge enough that it often doesn't fit in a terminal line, and my completely unscientific feeling is that it's going to be useful 0.01% of the time.
We drop these in (early) rules, something like:
addAction(NotRule(OpcodeRule(DNSOpcode.QUERY)), DropAction())
Hmm yeah, this looks like a good solution, but when I try to execute it in dnsdist I get the following error:
addAction(NotRule(OpcodeRule(DNSOpcode.QUERY)), DropAction())
Error: [string "return addAction(NotRule(OpcodeRule(DNSOpcode..."]:1: Unable to convert parameter from nil to m
stack traceback:
[C]: in function 'OpcodeRule'
[string "return addAction(NotRule(OpcodeRule(DNSOpcode..."]:1: in main chunk
>
It's DNSOpcode.Query
see https://dnsdist.org/reference/constants.html#dnsopcode
So
addAction(NotRule(OpcodeRule(DNSOpcode.Query)), DropAction())
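In a configuration file, the corrected rule might be placed ahead of the other rules, roughly as sketched below. The commented-out variant is an assumption that is not from the thread: it matches only opcode 2 (STATUS, `DNSOpcode.Status`), for setups where other non-Query opcodes should still reach the backend.

```lua
-- dnsdist configuration sketch (Lua). Rules are evaluated in order,
-- so add this before other rules to drop such queries early.

-- Drop everything that is not a standard query (opcode 0).
addAction(NotRule(OpcodeRule(DNSOpcode.Query)), DropAction())

-- Alternative (assumption, not from the thread): drop only opcode 2 (STATUS).
-- addAction(OpcodeRule(DNSOpcode.Status), DropAction())
```

Queries dropped by a rule never reach a backend, so they should no longer show up as backend timeouts in the Drops column of showServers().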