pdns
dnsdist dynamic blocking: extend to apply just to matched query objects (dynBlockRulesGroup)
- Program: dnsdist
- Issue type: Feature request
Short description
Dynamic blocking via dynBlockRulesGroup is quite powerful but a bit of a blunt weapon, because all traffic from an IP address is filtered even when only certain query elements should be blocked. It would be useful to have an optional ability to take action on just the offending matching queries, instead of on all traffic from an origin IP.
Usecase
We have many clients behind forwarders, or NAT. Abusers will often trigger blocks due to queries for PTR or ANY or TXT or a number of other triggers by interspersing their traffic with larger volumes of non-violating forwarded or NATted query data from other users. We take action based on various volumes - sending REFUSED, or truncating, or dropping traffic. However, that applies to all traffic from the IP address, catching "innocent" traffic as well as abusive traffic in the same action. It would be ideal to optionally keep serving non-abusive traffic from an IP address while penalizing the specific query objects that have triggered our dynamic block.
Ideally, it would also be useful to have rules optionally apply only to traffic that crests over the specified "seconds/rate/ratio/rcode/qtype" trigger criteria, depending on what rule is being used. So this would mean a certain volume of traffic that was underneath the limits would still be processed "normally" while any traffic (randomly selected from a ring representing those objects) that was above the rule threshold would be treated by the rule as specified (dropped, truncated, spoofed, etc.) This would mean that in mixed environments where innocent and non-innocent traffic is combined, at least some innocent traffic would still be processed while damaging traffic would be throttled. Even more appealing is that traffic that does not match the rule at all would proceed entirely un-touched, so normal queries from innocent users from the same IP address as the abuser would not be modified or blocked.
Description
The concept would be to add an additional set of specifications in the various matching classes:
- destination-all: apply the action to all traffic from this IP address/netmask once triggered (current method; default)
- matching-all: only apply the action to traffic from this IP address/netmask that matches the filter criteria, regardless of volume
- matching-overage: apply the action to traffic from this IP address/netmask that matches the filter criteria and is in excess of the volumes or ratios described by the rule, allowing traffic below that threshold to proceed without rule application. Which objects receive treatment is calculated either from the rate and seconds, or from the ratio. All objects matching the rule are counted in the next per-second calculation cycle of rate or ratio, regardless of whether they were acted upon.
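To make the three scopes concrete, here is a minimal Python sketch of the per-query decision once a source has already tripped a rule. It is a model of the proposal only, not dnsdist code: `Scope`, `action_applies`, and the per-second counter argument are hypothetical names, and the overage branch assumes the simplest reading, in which the first `rate` matching queries in each second pass untouched.

```python
from enum import Enum


class Scope(Enum):
    """Hypothetical names for the three proposed scopes."""
    DESTINATION_ALL = "destination-all"
    MATCHING_ALL = "matching-all"
    MATCHING_OVERAGE = "matching-overage"


def action_applies(scope, matches_rule, matched_this_second, rate):
    """Decide whether the block action hits one query from a blocked source.

    matches_rule: True if the query matches the rule's filter (e.g. qtype PTR).
    matched_this_second: number of rule-matching queries seen from this source
        in the current second, including this one.
    rate: the rule's per-second threshold.
    """
    if scope is Scope.DESTINATION_ALL:
        # Current behaviour: everything from the source is acted upon.
        return True
    if not matches_rule:
        # Both proposed scopes leave non-matching traffic alone.
        return False
    if scope is Scope.MATCHING_ALL:
        return True
    # MATCHING_OVERAGE: only matching queries beyond the threshold are hit.
    return matched_this_second > rate
```

With a rate of 20, the 20th matching query in a second passes under matching-overage and the 21st is acted upon, while a non-matching query is acted upon only under destination-all.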
This is a difficult feature, I think, if the attempt is made to apply it to all possible dynamic block methods, since some methods are response-driven, or volumetric in ways that are post-recursive, though that may only depend on the action taken in unusual combinations. It is not impossible to apply this logic to all types of DynBlockRulesGroup objects, but would be challenging. However, there may be some "low-hanging fruit" for certain methods that could benefit from this model first while others are more time-intensive to develop.
The one that stands out is "setQTypeRate" which looks at particular types of QTYPEs incoming from clients. Currently, this is:
setQTypeRate(qtype, rate, seconds, reason, blockingTime[, action[, warningRate]])
What would be ideal would be that the action would apply ONLY to traffic matching the qtype, and not other traffic. This means that we could create a rule like this:
dbr:setQTypeRate(DNSQType.PTR, 20, 3, "Exceeded PTR rate", 120, DNSAction.Refused,10,matching-all)
In this hypothetical example, if the host/network sent 25 PTR queries per second over 3 seconds, then every PTR query after the third second would receive a REFUSED reply for the 120-second blocking time, but all other traffic (A, AAAA, TXT, etc.) to that host/network would remain unmodified and operate normally.
In the example above, "10" is the warning rate, but if there was a desire to never have a warning trigger then the warning rate could be set to something higher than the trigger rate. I suppose that field could also be null? (...DNSAction.Refused,,matching-all)
Another example:
dbr:setQTypeRate(DNSQType.PTR, 20, 3, "Exceeded PTR rate", 120, DNSAction.Refused,10,matching-overage)
In this hypothetical example, if the host/network sent 40 PTR queries per second in 3 seconds, then in the next second there would be 20 queries allowed through from the host without modification, but the 21st query and onwards would receive a "REFUSED" result. The second after that, again 20 would be allowed through, but the 21st-40th would receive "REFUSED" until the timer expired. If the host continues to send large amounts of PTR queries, the counter would continually be reset though some portion (20 queries a second) of PTR queries would be allowed through every second. As with "matching-all" any other traffic such as A/AAAA/TXT traffic would not have any actions or filters applied against it - only the PTR traffic above the rate threshold would be throttled.
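The difference between the scopes during one second of an active block can be sketched with a small Python simulation. This is an illustration under stated assumptions rather than dnsdist behaviour: `simulate_second` and the scope strings are hypothetical, and the overage branch follows the "first 20 pass, the 21st onwards are refused" reading above rather than random selection from a ring.

```python
from collections import Counter


def simulate_second(queries, blocked_qtype, rate, scope):
    """Return (qtype, verdict) pairs for one second of traffic from a source
    whose dynamic block is already active.

    queries: qtype strings in arrival order, e.g. ["PTR", "A", ...].
    scope: "destination-all", "matching-all" or "matching-overage".
    """
    matched = 0  # rule-matching queries seen so far in this second
    verdicts = []
    for qtype in queries:
        matches = qtype == blocked_qtype
        if matches:
            matched += 1
        if scope == "destination-all":
            refused = True
        elif scope == "matching-all":
            refused = matches
        elif scope == "matching-overage":
            refused = matches and matched > rate
        else:
            raise ValueError(f"unknown scope: {scope}")
        verdicts.append((qtype, "REFUSED" if refused else "PASS"))
    return verdicts


# One second of mixed traffic behind a NAT: 40 PTR queries and 100 A queries.
traffic = ["PTR"] * 40 + ["A"] * 100
overage = Counter(simulate_second(traffic, "PTR", 20, "matching-overage"))
# overage[("PTR", "PASS")] == 20, overage[("PTR", "REFUSED")] == 20,
# overage[("A", "PASS")] == 100
```

Running the same traffic with "matching-all" refuses all 40 PTR queries and passes the 100 A queries, while "destination-all" refuses all 140.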
Other obvious applications would be using "matching-overage" to answer a certain volume of queries but drop others with the setQueryRate rule.
CAVEAT: This may be the way dnsdist is supposed to act already, and I may be entirely mis-interpreting the current model. I am only looking at our current versions (1.8.x) and directly observing with packet captures what I see when limits are crossed and actions are applied. Feel free to tell me that there are already methods to do exactly what I want here and that our config is incorrect.
I'm setting the milestone to 1.10 so that I consider it early in the next development cycle, but we might decide to postpone or not implement this feature request, or some parts of it. I like the idea, but I'm not sure how hard it would be to implement in practice.
Update: I've been calling this the "Love the sinner, hate the sin" model, and I'm a bit smug about that metaphor, so feel free to come up with a better one.
Here's a discussion that came up today where this type of rule would be useful to us. Home Assistant users send out bursts of PTR queries, which causes their IP to be blocked because we (Quad9) have PTR limitations. There are some ways to offset this problem, but the core issue would be best solved by blocking PTR queries and not blocking ALL queries from an origin.
https://github.com/home-assistant/core/issues/147152#issuecomment-3151859895
We need this. This is another case of Quad9 blocking a whole IP because one host behind that external IP abused one type of request (in my case TXT and/or PTR). Rate limiting per request type is much needed, as this affected the whole network using that external IP: websites won't load for end users.