Knot Resolver is not SAD DNS resolver

The Internet is flooded with news about a new attack against DNS protocol called Side channel AttackeD DNS, or in short SAD DNS. The attack is described in detail in Cloudflare’s blog and I strongly recommend you to read it to grasp how it works and why it is novel.

In this technical article we assume readers are already familiar with SAD DNS attack and we focus specifically on impact of the new attack against our DNS resolver Knot Resolver. In short – security defenses implemented in Knot Resolver since 2015 prevent SAD DNS from being effective!

To understand why we first describe defenses implemented in Knot Resolver, how it behaves under “classical” cache poisoning attack based on naive/blind message forgery, and how it compares situation under SAD DNS attack.

Before we dive into technical details, let me add word of caution: SAD DNS is a revival of classical DNS cache poisoning, and we already have technology to protect from it – DNSSEC. If your domain is not DNSSEC-secured yet please secure it now. DNSSEC protects not only from cache poisoning but also helps with other classes of attacks like random subdomain attacks. Moreover our authoritative server Knot DNS makes it really easy to secure domains with DNSSEC.

Finally the technical details:

Knot Resolver behavior

Knot Resolver checks if incoming UDP responses from authoritative servers have the same signature (consisting of message ID, query class, query name, query type and also 0x20 CaSe raNDOmIZatioN) as the original query sent to the authoritative server. Mismatch on any field the causes Knot Resolver to ignore incoming UDP responses to re-query over TCP.
Source IP address and port number on the incoming responses are also checked but they do not really add a significant amount of entropy (and thus protection) because number of authoritative server IP addresses is typically limited to a couple, and server port is fixed on value 53.
Each DNS request from client (including attacker-controlled puppets) has hard time limit of 10 seconds. If a request is not resolved before the limit expires it is terminated and no more responses are accepted.
Single resolver instance deduplicates outbound UDP queries so multiple requests for the same (query class, query name, query type) combination do not generate more than one UDP query from a single instance.
Knot Resolver can run in multiple instances which share the same network sockets and cache – but query deduplication is not implemented across instances at the moment. Consequently when a machine running 16 instances receives 16+ queries with the same (query class, query name, query type) combination it can generate up to 16 queries to authoritative servers, each with independent source port and new random message ID. Distribution of queries from clients to instances is handled by operating system.
Knot Resolver employs query name minimization technique by default and uses NS query type.
Knot Resolver refuses to overwrite records in cache with new data not secured by DNSSEC (if the data in cache are not about to expire). This includes negative cache – information about non-existence of a record. This is countermeasure to technique described in the paper figure 7.

Assumptions for analysis

To analyze impact on Knot Resolver we assume a powerful attacker. Our attacker managed to evade all “conditional” layers of protection which can be disabled or avoided. This “downgrade” can sometimes happen automatically because it is needed for interoperability with badly implemented authoritative servers, and we assume the attacker was able to disable all of them:

Attacker can find a way to trigger compatibility fallbacks and thus disable 0x20 case randomization (decreasing entropy) and also to disable query minimization for the domain under attack, thus voiding protection from NS record overwrite. Without query minimization non-existence of certain NS records might not be cached, thus opening path for spoofed answers into cache.
Attacker can trigger cache miss query for target domain unlimited number of times, i.e. has ability to bypass cache.
Attacker can avoid all partial time limits and keep the UDP socket open up to hard limit 10 seconds.
Attacker can pin resolver to a single authoritative server’s IP address, voiding additional entropy from address selection.
Attacker has control over query distribution to individual resolver instances, e.g. by varying source port or source IP address.

To make attacker’s life even easier we assume that other conditions are also very favorable to message forgery attacks:

Attacker can prevent responses from authoritative servers from reaching resolver (e.g. by misusing Response Rate Limits or other tricks mentioned in the SAD DNS paper section 5).
UDP ephemeral port range has default configuration and contains 2^15 ports.
Global ICMP rate limit in operating system is default – constant 1000 packets per second.
Environment under attack has zero noise.

We now have the stage set for an attack and we can estimate average number of packets and time required to mount successful attack against Knot Resolver 5.2.0 using two message forgery variants:

Using classical technique without SAD DNS (“blind” variant)
Using the novel SAD attack

For simplicity we assume our attacker will get lucky after he tries half of all possible combinations.

Blind variant

Attacker does not use SAD DNS attack and thus has no information about source port and message ID. Packets with incorrect source port guesses will be dropped by operating system and might generate ICMP answers, but attacker does not see or analyze these (“blind” variant).
Packets with correct source port but mismatching query/response signature cause immediate fallback to TCP, closing UDP socket and ignoring all packets later arriving to it (see section about Knot Resolver behavior above).
As a consequence of switching to TCP after receiving first suspicious answer the attacker is forced to guess correct source port (15 bits) and message ID (16 bits) at once. This provides attacker with 31 bits of entropy to guess on first try.
Mismatch does not yield additional information thus each guess is independent and has the same probability of success = 1 / 2^31 = 0.000 000 046 %.
Attacker can flood the resolver with cache-bypassing requests and at the same time flood resolver with forged answers. We assume attacker gets lucky after he tried 1/2 of all combinations, i.e. 2^30 = roughly 1 billion (10^9) forged answers.
A sustained attack of 1000 forged packets per second will have 50 % probability of success after roughly 298 hours.
The blind attack variant can be made faster by:
- Increasing packet rate, assuming all assumptions above hold and the attack stays undetected.
- Inducing duplicate queries from each resolver instance to authoritative servers, with upper bound configured by resolver administrator (who configures number of resolver instances). Each query will use a new combination of (source port, message ID), so in practice n-instances under attack increase number of correct guesses for the attacker to n/2^31. Consequently it reduces attack complexity n-times. E.g. attacking resolver comprising of 16 instances on a single machine with 50 % probability requires 298 / 16 = 18 hours.

SAD DNS variant

Attacker uses SAD DNS attack and thus has extra information about source port. Consequently the attacker has to blindly guess only message ID.
Attacker has ability to scan 1000 source ports per second (= upper bound determined by global ICMP rate limit). This provides attacker with “source port discovery rate” 1000/2^15 = 0.03 source ports per second. In other words attacker is on average able to find a single open source port each 32.768 seconds.
As soon as attacker finds open source port he can attempt to guess message ID. Mismatch on message ID will cause immediate fallback to DNS-over-TCP, so a single open source port thus provides attacker with one guess attempt with success probability 1/2^16 = 0.0015 % per 32.768 seconds.
Reaching 50 % probability of success would require 0.5 / (1 / 2^16) = 32 768 attempts.
With an attempt lasting 32.768 seconds on average, 50 % chance of succeeding would require the same 298 hours of sustained attack with packet rate around 1000 packets per second, i.e. exactly the same as with the classical/blind attack variant.
SAD DNS attack variant cannot be made faster by higher packet rate because of global ICMP packet rate limit.
SAD DNS attack variant can be made faster by inducing duplicate queries from each resolver instance to authoritative servers, with upper bound configured by resolver administrator (who configures number of resolver instances). Each query will use new combination of (source port, message ID), so in practice n-instances under attack will increase number of open source ports n-times, making source port discovery n-times faster. Consequently it reduces time required for attack n-times. E.g. attacking resolver comprising of 16 instances with 50 % probability requires 298 / 16 = 18 hours, i.e. again the same time

Conclusion

SAD DNS attack against Knot Resolver 5.2.0 does not provide any benefit over the traditional “blind” message forgery, mainly thanks to protection provided by switching to DNS-over-TCP after receipt of first suspicious response. This defense was not considered in the paper section 8.1 Defenses.

In practice we expect SAD DNS attack against Knot Resolver 5.2.0 to perform even worse than blind message forgery because noise and other factors would make the ICMP scanning less precise, thus leading to more wasted time while searching for active source ports. Also it is unlikely that attacker will be able to bypass all additional protections at the same time, making the SAD attack even less performant.

To conclude we believe defenses currently implemented in Knot Resolver 5.2.0 are sufficient to mitigate SAD DNS attack, and even more defense based on connect()ed UDP sockets envisioned in the paper might lead to performance hit (as it requires one extra system call per packet) while not providing real additional security when compared with blind forgery.

After all, it is much more effective to secure domains using DNSSEC and get protection from whole classes of attacks, instead of spending engineering time on one-off efforts to get protection from one specific attack at a time. If you don’t know where to start have a look at our authoritative server Knot DNS – makes it really easy to secure domains with DNSSEC.

Written with help of Vladimír Čunát from Knot Resolver project.

Autor: Petr Špaček

Blog zaměstnanců CZ.NIC

@ IN SOA domains.dns.enum.mojeid.internet. nic.cz.