While developing the Knot Resolver DNS resolver, CZ.NIC Labs discovered a security flaw that makes it possible to bypass DNSSEC security on F5 load balancers and cause a denial of service. These products are used, for example, in some internet banking applications, including those of Czech banks and public authorities. From the perspective of a user trying to reach an internet banking service, a successful attack exploiting this error would show up as the browser suddenly reporting an “address not found” error and the service becoming unavailable.
The vendor (F5) was informed about the error in August 2018 and has since released a recommended configuration to work around the problem. As operators of DNS resolvers are already encountering the bug in normal operation, we are publishing a detailed description of the error to inform the professional public and raise awareness of the problem.
The error that we discovered when testing the new version of our Knot Resolver was a technical detail hidden in the software of the company F5, which contained a faulty implementation of DNSSEC.
The first part of this article contains the description of the error and its implications, while the second part deals with the way it can be abused, and offers guidance in conclusion.
The main message is this: If you are using the F5 load balancer, contact the vendor as soon as possible and request a fix for the error that has been assigned Bugzilla ID 744937. Until the fix becomes available, use the procedure described in the vendor’s article.
Where is the error?
The whole story can be told in one sentence: “The cause of the error was the failure to comply with the technical standard RFC 5155, section 3.1.8” (the section that defines the Type Bit Maps field of the NSEC3 record), probably as a consequence of an effort to simplify the implementation.
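To make the referenced field more concrete, here is a minimal Python sketch of the Type Bit Maps wire encoding that RFC 5155 section 3.1.8 takes over from RFC 4034 section 4.1.2. The function names are hypothetical and chosen only for this illustration; this is not the code of any particular DNS server.

```python
# Sketch of the Type Bit Maps encoding used by NSEC and NSEC3 records.
# Helper names are hypothetical, for illustration only.

# A few RR type codes from the IANA registry.
RR_TYPES = {"A": 1, "TXT": 16, "AAAA": 28}


def encode_type_bitmap(type_codes):
    """Encode RR type codes as window blocks: window number, length, bitmap."""
    windows = {}
    for code in sorted(set(type_codes)):
        window, low = divmod(code, 256)
        bitmap = windows.setdefault(window, bytearray(32))
        bitmap[low // 8] |= 0x80 >> (low % 8)  # most significant bit first
    out = bytearray()
    for window in sorted(windows):
        bitmap = bytes(windows[window]).rstrip(b"\x00")  # drop trailing zero octets
        out += bytes([window, len(bitmap)]) + bitmap
    return bytes(out)


def decode_type_bitmap(data):
    """Return the set of RR type codes marked as present in the bitmap."""
    types = set()
    pos = 0
    while pos < len(data):
        window, length = data[pos], data[pos + 1]
        bitmap = data[pos + 2 : pos + 2 + length]
        for index, octet in enumerate(bitmap):
            for bit in range(8):
                if octet & (0x80 >> bit):
                    types.add(window * 256 + index * 8 + bit)
        pos += 2 + length
    return types


# Bitmap for a name that has only an A record, as in the example used in this article.
correct_bitmap = encode_type_bitmap([RR_TYPES["A"]])
# Bitmap that claims only TXT exists, which is the kind of wrong answer described here.
faulty_bitmap = encode_type_bitmap([RR_TYPES["TXT"]])
```

A validating resolver trusts this bitmap once the signature checks out, so a bitmap that omits a type that really exists is read as signed evidence that the type is absent.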
To help readers understand the principle of the error and its implications, we will illustrate it with an example in which a user tries to connect to the web server www.example.com, which sits behind the load balancer. To connect, the web browser must first convert the domain name “www.example.com” into an IPv4 or IPv6 address. To do so, the browser sends two queries at once to the DNS resolver:
- www.example.com. AAAA (record for an IPv6 address; let’s say that this address does not exist)
- www.example.com. A (record for an IPv4 address; this address exists in our example)
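For readers who want to see what these two queries look like on the wire, the following Python sketch builds them in the RFC 1035 message format. It is only an illustration: browsers normally delegate this to the operating system's stub resolver, and the helper names and query IDs below are assumptions made for the example.

```python
import struct

QTYPES = {"A": 1, "AAAA": 28}  # RR type codes
CLASS_IN = 1                   # Internet class
FLAG_RD = 0x0100               # "recursion desired" bit in the header flags


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending with a zero octet."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    return b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"


def build_query(name, qtype, query_id):
    """Build a DNS query message with a single question."""
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!6H", query_id, FLAG_RD, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!2H", QTYPES[qtype], CLASS_IN)
    return header + question


# The two queries from the example; the IDs 1 and 2 are arbitrary.
queries = [
    build_query("www.example.com.", qtype, query_id)
    for query_id, qtype in enumerate(("AAAA", "A"), start=1)
]
```

The two messages differ only in their ID and in the two-octet type code at the end of the question.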
The order in which the DNS resolver handles the queries is essentially arbitrary. In the following example, we describe a scenario where the resolver first processes the query for an IPv6 address (the DNS record of the AAAA type). The left column describes the behavior of a correct implementation and the right column shows the error that was present in the F5 load balancers until recently:
Step | Correct implementation | F5 implementation error |
1. | The web browser sends the query www.example.com. AAAA to the DNS resolver (the AAAA record type holds an IPv6 address). | – same – |
2. | The DNS resolver does not have a response in the cache because the user is opening the page for the first time, so the DNS resolver queries the authoritative server hidden behind the load balancer. | – same – |
3. | Because there is only an IPv4 address (type A) in our hypothetical example, and no IPv6 address (type AAAA), the load balancer has to send back a “DNSSEC proof of non-existence”. | – same – |
4. | The load balancer sends back a response containing the correct proof of non-existence for the record www.example.com. AAAA: “MIFDNDT3NFF3OD.example.com. NSEC3 1 0 0 - MIFDNDT3NFF3OE A”. At the end of the proof there is a list of record types that exist on the domain name www.example.com. In our case, these are only type A (IPv4 address) records. Therefore, the proof is correct and the DNS resolver receives the signal: There is no AAAA here, we only have A. | The load balancer sends back a response containing the incorrect proof of non-existence for the record www.example.com. AAAA: “MIFDNDT3NFF3OD.example.com. NSEC3 1 0 0 - MIFDNDT3NFF3OD TXT”. At the very end of the proof there is an incorrect part, which says that the name www.example.com contains only TXT-type records. This also means that a query for type A appears pointless: from the given proof, the DNS resolver receives the wrong information that the A record does not exist. |
5. | The resolver saves the proof of non-existence into the cache. | – same – |
6. | The resolver now handles the query www.example.com. A (IPv4 address) in the same way. | – same – |
7. | From the previous response, the DNS resolver already knows that an A record exists, but does not yet know its value. In other words, the cache holds a proof that the record www.example.com. A exists, but the record itself is not yet in the cache, so the resolver queries the authoritative server. | The DNS resolver has a proof in the cache that there is no record www.example.com. A. |
8. | The resolver accepts the A record from the authoritative server, validates the DNSSEC signature and sends the record to the web browser. | Based on the erroneous information obtained in step 4, the resolver immediately responds to the web browser that the IPv4 address record does not exist. |
9. | The web browser has an IPv4 address and can connect to the website. | The web browser has no IP address and cannot connect to the website. |
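The difference between the two columns can be reproduced with a toy model of the resolver's cache. The Python sketch below is not Knot Resolver's actual code. The class and constant names are hypothetical, signature validation is assumed to have already succeeded, and the cache is keyed by plain name rather than by hashed owner name to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nsec3Proof:
    """A validated NSEC3 record whose hashed owner matches the queried name."""
    owner_hash: str
    types_present: frozenset


class AggressiveCache:
    """Toy model of steps 5 to 8: a cached proof is reused for later queries."""

    def __init__(self):
        self._proofs = {}

    def store(self, name, proof):
        self._proofs[name] = proof

    def lookup(self, name, qtype):
        proof = self._proofs.get(name)
        if proof is None:
            return "ASK_AUTHORITATIVE"  # nothing cached for this name
        if qtype in proof.types_present:
            return "ASK_AUTHORITATIVE"  # type exists, its value must be fetched
        return "NODATA_FROM_CACHE"      # proof says the type does not exist


# Proofs from step 4 of the table: the type list is the only difference that matters.
correct = Nsec3Proof("MIFDNDT3NFF3OD", frozenset({"A"}))
faulty = Nsec3Proof("MIFDNDT3NFF3OD", frozenset({"TXT"}))

good_cache = AggressiveCache()
good_cache.store("www.example.com.", correct)

bad_cache = AggressiveCache()
bad_cache.store("www.example.com.", faulty)

# Step 7: the same A query leads to different decisions.
good_decision = good_cache.lookup("www.example.com.", "A")
bad_decision = bad_cache.lookup("www.example.com.", "A")
```

With the correct proof the resolver goes on to fetch the A record, while the faulty proof makes it answer from the cache that no A record exists, which is the failure described in steps 7 to 9.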
Denial-of-service attack
The error described above can be exploited by an attacker to perform a denial-of-service (DoS) attack in two ways:
- Simply querying a DNS resolver that supports aggressive caching. All the attacker has to do is query for the target name and any non-existent type. A single query thus makes the target domain and its associated services unavailable for all users of that DNS resolver, typically all customers of the ISP in question.
- In cases where the attacker cannot query the resolver or the resolver does not support aggressive caching, the long-known technique of DNS spoofing can be used. In this case, the attack is carried out in two phases, a preparatory one and an offensive one, which are described below:
Preparing for response spoofing
- The attacker sends to a vulnerable load balancer a DNS query for the target domain and a non-existent type, e.g.
  www.example.com. AAAA (or any other non-existent type, e.g. HINFO)
- The vulnerable load balancer responds with a proof of non-existence that, as in the previous example, contains incorrect information but is validly signed by the domain signing key:
  MIFDNDT3NFF3OD53O7TLA1HRFF95JKUK.example.com. NSEC3 1 0 0 - MIFDNDT3NFF3OD53O7TLA1HRFF95JKUL TXT
- The attacker saves the incorrect proof of non-existence for later use.
Attack by response spoofing
- A victim queries for the target domain and an existing type, e.g.
  www.example.com. A
- The attacker passes the previously received response instead of the “genuine” response from the load balancer.
- The spoofed proof of non-existence seemingly meets all the requirements of the DNSSEC standard and proves that type A does not exist, and is therefore stored in the resolver’s cache.
- Based on the spoofed proof of non-existence, the resolver responds to the victim that the requested record does not exist.
- The victim cannot connect to the target website.
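The long first label of the NSEC3 record in the preparatory phase is a hash of the queried name. As a rough illustration, the following Python sketch computes such a hash the way RFC 5155 section 5 specifies for hash algorithm 1 (SHA-1), with the result written in Base32 with the extended hex alphabet. The function names are hypothetical, and the label printed in this article is illustrative, so the sketch makes no claim to reproduce it.

```python
import base64
import hashlib

# Translate the standard Base32 alphabet to the "extended hex" alphabet (RFC 4648).
_B32_TO_B32HEX = bytes.maketrans(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ234567",
    b"0123456789ABCDEFGHIJKLMNOPQRSTUV",
)


def name_to_wire(name):
    """Canonical wire form: lowercase labels, each prefixed by its length."""
    labels = [label for label in name.lower().rstrip(".").split(".") if label]
    wire = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels)
    return wire + b"\x00"


def nsec3_hash(name, salt=b"", iterations=0):
    """Hash a domain name as for NSEC3 hash algorithm 1 (SHA-1)."""
    digest = hashlib.sha1(name_to_wire(name) + salt).digest()
    for _ in range(iterations):  # "iterations" counts the additional rounds
        digest = hashlib.sha1(digest + salt).digest()
    # A SHA-1 digest is 20 octets, which is exactly 32 Base32 characters, so no padding.
    return base64.b32encode(digest).translate(_B32_TO_B32HEX).decode("ascii")


# Parameters as in the records shown in this article: no salt ("-"), 0 iterations.
hashed_label = nsec3_hash("www.example.com.", salt=b"", iterations=0)
```

Because the hash depends only on the name and the zone's public NSEC3 parameters, the record the attacker saved matches the name in every later query for it, whatever the queried type.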
How did we reveal the error?
At CZ.NIC Labs, we develop our own DNS resolver and we want it to be one of the most efficient on the market. That is why last year we added support for so-called “aggressive caching with NSEC3 support” to our software, which is not available in any of the competing software at the moment. This is a new technique described in RFC 8198, which allows a DNS resolver to use all of the secure DNSSEC information and answer some queries directly from the resolver cache (see steps 7 and 8 above, in the “Where is the error?” section). When we deployed this extension, we found that although DNS queries were answered more efficiently, some users complained about inexplicable problems when connecting to the websites of some banks and public authorities, and, what is worse, “the problem disappeared after a while”, which made it very difficult to analyze.
We analyzed the problematic queries in detail and found that the problem was not in our implementation of the standard but on the side of the authoritative server (more precisely, the load balancer), and that it only appears when the queries arrive “in an unfortunate order”. The problem then persists until the records in the DNS resolver cache expire. We reported the bug to the vendor and waited for the fix. Initially, F5 did not promise any dates for fixes, but after we described the potential for abuse, it published, a few months later, a tutorial on how to work around the error by changing the configuration.
Conclusion
We hope to have been able to illustrate how dangerous it might be to use “shortcuts” in the development that go against the standard. Any such non-standard shortcut brings the risk of security vulnerabilities.
In this case, paradoxically, the affected systems were those of institutions that users consider to be the safest and that actively invest in “defense” using solutions from external vendors. Subcontractors, too, must adhere to technical standards.
If you want to learn more about DNSSEC, we invite you to a course at the CZ.NIC Academy.
Hi, can you please point me to the release notes for the feature associated with RFC 8198?
Hi, aggressive caching was implemented in Knot Resolver version 2.0.0 (NSEC only) and NSEC3 support was added in 2.4.0. You can find the release notes in the “New Features” sections of these versions:
https://knot-resolver.readthedocs.io/en/stable/NEWS.html#knot-resolver-2-0-0-2018-01-31
https://knot-resolver.readthedocs.io/en/stable/NEWS.html#knot-resolver-2-4-0-2018-07-03
If you’re interested in more implementation details, you can also refer to issue #108 https://gitlab.nic.cz/knot/knot-resolver/-/issues/108