Are homogenic nameserver names a single point of failure?

The golden rule of security, stability, and resiliency of virtually anything is “don’t put all your eggs into one basket”. This generally applies to the DNS too, and there are recommendations to avoid having all your nameservers in one domain. I would like to show that this is not a silver bullet: it depends on many conditions, and using different domains in your nameserver set might even make things worse rather than better.

Let me say up front that there might be other reasons to spread the domains used in a nameserver set, so nothing in this area is strictly good or bad; it is always a matter of balancing the risks.

Let’s see some real world examples:

Amazon Cloudfront spreads domains used for nameservers quite a lot:


cloudfront.net. 172800 IN NS ns-666.awsdns-19.net.
cloudfront.net. 172800 IN NS ns-418.awsdns-52.com.
cloudfront.net. 172800 IN NS ns-1597.awsdns-07.co.uk.
cloudfront.net. 172800 IN NS ns-1306.awsdns-35.org.

and there might be very good reasons for that, but we are going to look at this purely from the DNS Resolver’s point of view, and at the work that has to be done to resolve a simple domain name hosted on such a service. I am going to use a (fake) domain example.udp53.cz to demonstrate that such a setup might lead to more DNS queries and increased latency.

Imagine a DNS Resolver with an empty cache. A DNS client asks that resolver for the AAAA record for example.udp53.cz.


$ kdig IN AAAA example.udp53.cz.

A very simple question, it seems? So, what is the chain of queries the DNS Resolver must make to resolve the name? (I have deliberately stripped most of the kdig output in the examples to make this post shorter.)

1. Every DNS Resolver is primed with addresses of root nameservers (well, at least one is needed), so that will be used:


$ kdig AAAA example.udp53.cz. @2001:503:ba3e::2:30
;; AUTHORITY SECTION:
cz. 172800 IN NS a.ns.nic.cz.
[...]
;; ADDITIONAL SECTION:
a.ns.nic.cz. 172800 IN A 194.0.12.1
a.ns.nic.cz. 172800 IN AAAA 2001:678:f::1
[...]

2. Good, there are GLUE records in the ADDITIONAL SECTION we can use, so the next step would be to ask one of the .cz nameservers:


$ kdig AAAA example.udp53.cz. @2001:678:f::1
;; AUTHORITY SECTION:
udp53.cz. 3600 IN NS trubka.network.cz.
udp53.cz. 3600 IN NS master.dns.rocks.
;; ADDITIONAL SECTION:
trubka.network.cz. 3600 IN A 81.91.84.116
trubka.network.cz. 3600 IN AAAA 2001:1568:b:145::1
trubka.network.cz. 3600 IN AAAA 2001:1568:b::145
[...]

3. And now the DNS Resolver might pick either of the two nameservers. Let’s pick the one that is worse for latency, master.dns.rocks, and we are back at the root zone:


$ kdig AAAA master.dns.rocks. @2001:503:ba3e::2:30
;; AUTHORITY SECTION:
rocks. 172800 IN NS demand.beta.aridns.net.au.
rocks. 172800 IN NS demand.alpha.aridns.net.au.
rocks. 172800 IN NS demand.delta.aridns.net.au.
rocks. 172800 IN NS demand.gamma.aridns.net.au.
;; ADDITIONAL SECTION:
demand.alpha.aridns.net.au. 172800 IN A 37.209.192.7
[...]

4. Um, wait, an .au zone? What is this madness? We could use these specific GLUE records, but there are cases where GLUE cannot be trusted, so I am going to pretend we need to resolve the names ourselves (for example, because the DNS resolver is very strict and trusts only in-domain GLUE):


$ kdig AAAA demand.alpha.aridns.net.au. @2001:503:ba3e::2:30
;; AUTHORITY SECTION:
au. 172800 IN NS a.au.
[...]
;; ADDITIONAL SECTION:
a.au. 172800 IN A 58.65.254.73
a.au. 172800 IN AAAA 2407:6e00:254:306::73
[...]

5. We still don’t have an address for demand.alpha.aridns.net.au:


$ kdig AAAA demand.alpha.aridns.net.au. @2407:6e00:254:306::73
;; AUTHORITY SECTION:
net.au. 86400 IN NS x.au.
[...]
;; ADDITIONAL SECTION:
x.au. 86400 IN A 37.209.194.5
x.au. 86400 IN AAAA 2001:dcd:2::5
[...]

6. And next step:


$ kdig AAAA demand.alpha.aridns.net.au. @2001:dcd:4::5
;; AUTHORITY SECTION:
aridns.net.au. 14400 IN NS ari.alpha.aridns.net.au.
[...]
;; ADDITIONAL SECTION:
ari.alpha.aridns.net.au. 14400 IN AAAA 2001:dcd:1::2
ari.alpha.aridns.net.au. 14400 IN A 37.209.192.2
[...]

7. And the next one. Finally, we have an IP address for demand.alpha.aridns.net.au!


$ kdig IN AAAA demand.alpha.aridns.net.au. @2001:dcd:1::2
;; ANSWER SECTION:
demand.alpha.aridns.net.au. 172800 IN AAAA 2001:dcd:1::7

8. And we can return to resolving the master.dns.rocks DNS chain:


$ kdig IN AAAA master.dns.rocks. @2001:dcd:1::7
;; AUTHORITY SECTION:
dns.rocks. 86400 IN NS trubka.network.cz.
dns.rocks. 86400 IN NS master.dns.rocks.
;; ADDITIONAL SECTION:
master.dns.rocks. 86400 IN AAAA 2a01:5f0:c001:113:a::10
master.dns.rocks. 86400 IN A 89.187.130.10

9. Well, guess what: the DNS Resolver can again pick one of the two nameservers here, but let’s go easy this time and follow master.dns.rocks, because we received GLUE records for that name together with the delegation. We can finally ask for example.udp53.cz:


$ kdig AAAA example.udp53.cz. @2a01:5f0:c001:113:a::10
;; ANSWER SECTION:
example.udp53.cz. 60 IN CNAME example.udp53.cz.s3-website-us-east-1.amazonaws.com.

10. Lovely! We have just processed 9 DNS queries and responses, only to be redirected back to the .com level. I am going to just list the queries needed to get to the final nameservers. Notice that we would be doomed on an IPv6-only network, as the nameservers for amazonaws.com are Legacy IP(v4) only.


$ kdig AAAA example.udp53.cz.s3-website-us-east-1.amazonaws.com. @2001:503:ba3e::2:30
;; AUTHORITY SECTION:
com. 172800 IN NS a.gtld-servers.net.
[...]


$ kdig AAAA example.udp53.cz.s3-website-us-east-1.amazonaws.com. @2001:503:a83e::2:30
;; AUTHORITY SECTION:
amazonaws.com. 172800 IN NS u1.amazonaws.com.
amazonaws.com. 172800 IN NS u2.amazonaws.com.
amazonaws.com. 172800 IN NS r1.amazonaws.com.
amazonaws.com. 172800 IN NS r2.amazonaws.com.
;; ADDITIONAL SECTION:
u1.amazonaws.com. 172800 IN A 156.154.64.10
u2.amazonaws.com. 172800 IN A 156.154.65.10
r1.amazonaws.com. 172800 IN A 205.251.192.27
r2.amazonaws.com. 172800 IN A 205.251.195.199


$ kdig AAAA example.udp53.cz.s3-website-us-east-1.amazonaws.com. @156.154.64.10
;; AUTHORITY SECTION:
s3-website-us-east-1.amazonaws.com. 1800 IN NS ns-1133.awsdns-13.org.
s3-website-us-east-1.amazonaws.com. 1800 IN NS ns-1919.awsdns-47.co.uk.
s3-website-us-east-1.amazonaws.com. 1800 IN NS ns-490.awsdns-61.com.
s3-website-us-east-1.amazonaws.com. 1800 IN NS ns-661.awsdns-18.net.

11. So, are we done yet? Not even close. As you can see, the resolution restarts once more, and now we can pick from four TLD variants: 1) awsdns-13.org, 2) awsdns-47.co.uk, 3) awsdns-61.com, and 4) awsdns-18.net. Let’s pick .co.uk just to make life with the DNS more fun. How many queries do we need?


$ kdig AAAA ns-1919.awsdns-47.co.uk. @2001:503:ba3e::2:30
;; AUTHORITY SECTION:
uk. 172800 IN NS nsa.nic.uk.
;; ADDITIONAL SECTION:
nsa.nic.uk. 172800 IN A 156.154.100.3
nsa.nic.uk. 172800 IN AAAA 2001:502:ad09::3


$ kdig AAAA ns-1919.awsdns-47.co.uk. @2001:502:ad09::3
;; AUTHORITY SECTION:
awsdns-47.co.uk. 172800 IN NS g-ns-367.awsdns-47.co.uk.
[...]
;; ADDITIONAL SECTION:
g-ns-367.awsdns-47.co.uk. 172800 IN AAAA 2600:9000:5301:6f00::1
g-ns-367.awsdns-47.co.uk. 172800 IN A 205.251.193.111


$ kdig IN AAAA ns-1919.awsdns-47.co.uk. @2600:9000:5301:6f00::1
;; ANSWER SECTION:
ns-1919.awsdns-47.co.uk. 172800 IN AAAA 2600:9000:5307:7f00::1

12. Finally, here comes the final query we have been waiting for. Or not?


$ kdig AAAA example.udp53.cz.s3-website-us-east-1.amazonaws.com. @2600:9000:5307:7f00::1
;; ANSWER SECTION:
example.udp53.cz.s3-website-us-east-1.amazonaws.com. 60 IN CNAME s3-website-us-east-1.amazonaws.com.

13. In the worst case, the DNS Resolver would now pick a nameserver that is not in the cache yet, ns-661.awsdns-18.net, and we are right back in the vicious cycle:


$ kdig AAAA ns-661.awsdns-18.net. @2001:503:ba3e::2:30
;; AUTHORITY SECTION:
net. 172800 IN NS a.gtld-servers.net.
[...]


$ kdig AAAA ns-661.awsdns-18.net. @2001:503:a83e::2:30
;; AUTHORITY SECTION:
awsdns-18.net. 172800 IN NS g-ns-467.awsdns-18.net.
[...]


$ kdig AAAA ns-661.awsdns-18.net. @2600:9000:5301:d300::1
;; ANSWER SECTION:
ns-661.awsdns-18.net. 172800 IN AAAA 2600:9000:5302:9500::1

14. And this is the final step:


$ kdig AAAA s3-website-us-east-1.amazonaws.com. @2600:9000:5302:9500::1
;; AUTHORITY SECTION:
s3-website-us-east-1.amazonaws.com. 900 IN SOA ns-1919.awsdns-47.co.uk. awsdns-hostmaster.amazon.com. 1 7200 900 1209600 86400

Now we have proof that example.udp53.cz can’t be reached over IPv6, but we don’t want to end in a bad mood, so let’s query for the IPv4 address to have a nice souvenir to bring home, in the form of a Legacy IP(v4) address:


$ kdig +norec IN A s3-website-us-east-1.amazonaws.com. @2600:9000:5302:9500::1
;; ANSWER SECTION:
s3-website-us-east-1.amazonaws.com. 5 IN A 52.216.17.18

It took us a full 20 DNS queries to resolve the example.udp53.cz domain name, and even if the DNS Resolver picked the optimal path at every step, we would still end up with 8 queries.
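As a cross-check of the count, here is a small Python tally of the queries in the walkthrough above, grouped by numbered step. The grouping and the flag marking a stage as "spent only on finding a nameserver address" are my own reading of the steps, not something kdig reports.

```python
# (description, number of queries, spent only on finding a nameserver address?)
STAGES = [
    ("root and .cz referrals for example.udp53.cz (steps 1-2)", 2, False),
    ("root referral for master.dns.rocks (step 3)", 1, True),
    ("resolving demand.alpha.aridns.net.au (steps 4-7)", 4, True),
    ("dns.rocks referral with GLUE (step 8)", 1, True),
    ("CNAME answer for example.udp53.cz (step 9)", 1, False),
    ("root, .com and amazonaws.com referrals (step 10)", 3, False),
    ("resolving ns-1919.awsdns-47.co.uk (step 11)", 3, True),
    ("second CNAME answer (step 12)", 1, False),
    ("resolving ns-661.awsdns-18.net (step 13)", 3, True),
    ("final AAAA query, no data (step 14)", 1, False),
]

total = sum(count for _, count, _ in STAGES)
nameserver_lookups = sum(count for _, count, is_ns in STAGES if is_ns)

print(f"total queries: {total}")
print(f"spent on nameserver addresses: {nameserver_lookups}")
print(f"spent on the name itself: {total - nameserver_lookups}")
```

By this reading, 12 of the 20 queries go into locating nameservers rather than following the name itself. The remaining 8 are consistent with the best-case figure quoted above.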

In the real world, DNS Resolvers use some clever tricks to avoid some of this complexity, and most of the records would quickly be cached, so the latency would not be that bad. But there are other quirks when using domains from different TLDs.

Different TLDs mean different registries taking care of each top-level domain, and while most registries are well maintained, every additional one brings another point of failure into the system.

Using different TLDs doesn’t only mean different registries, but also different jurisdictions, so picking a “cool” TLD from a country with a totalitarian regime might not be so “cool” in the end, as the authorities there might be able to take down or hijack your domain name, or otherwise manipulate the responses from your nameserver.

You might be asking — what’s a good setup then?

There are several approaches. The setup with the lowest cold-cache latency is to use in-domain nameservers like this:


$ kdig AAAA www.nic.cz @2001:503:ba3e::2:30
;; AUTHORITY SECTION:
cz. 172800 IN NS a.ns.nic.cz.
cz. 172800 IN NS b.ns.nic.cz.
cz. 172800 IN NS c.ns.nic.cz.
cz. 172800 IN NS d.ns.nic.cz.
;; ADDITIONAL SECTION:
a.ns.nic.cz. 172800 IN A 194.0.12.1
b.ns.nic.cz. 172800 IN A 194.0.13.1
c.ns.nic.cz. 172800 IN A 194.0.14.1
d.ns.nic.cz. 172800 IN A 193.29.206.1
a.ns.nic.cz. 172800 IN AAAA 2001:678:f::1
b.ns.nic.cz. 172800 IN AAAA 2001:678:10::1
c.ns.nic.cz. 172800 IN AAAA 2001:678:11::1
d.ns.nic.cz. 172800 IN AAAA 2001:678:1::1


$ kdig AAAA www.nic.cz @2001:678:f::1
;; QUESTION SECTION:
;; www.nic.cz. IN AAAA
;; ANSWER SECTION:
www.nic.cz. 1800 IN AAAA 2001:1488:0:3::2

See? The answer in just two steps. However, the .CZ registry is a special case because the nameservers for cz and nic.cz are shared. But even without this neat trick it would take us only 3 DNS queries to get to the result. Remember that any indirection will increase the number of DNS queries needed to get the result, and with it the number of places where something can break.
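To make the cost of indirection concrete, here is a minimal Python sketch that counts cold-cache queries over a toy delegation tree. Everything in it is an assumption for illustration:

- the zone names under the reserved test. and example. TLDs are made up;
- each zone has a single nameserver;
- GLUE is trusted only when the nameserver name lies inside the delegated zone (the strict resolver from step 4);
- nothing is remembered between lookups except nameserver addresses.

It is a model, not a description of how any particular resolver behaves.

```python
# Toy model (hypothetical data): zone -> name of its single nameserver.
# The root is implicit; its address is assumed to be primed.
ZONES = {
    "test.": "ns.test.",
    "example.": "ns.example.",
    "blog.test.": "ns.blog.test.",               # in-domain nameserver
    "shop.test.": "ns1.dnshost.example.",        # nameserver in another TLD
    "dnshost.example.": "ns1.dnshost.example.",  # in-domain nameserver
}


def in_domain(ns_name, zone):
    """True if the nameserver name lies inside the delegated zone."""
    return ns_name.endswith("." + zone)


def enclosing_zones(name, zones):
    """Zones containing `name`, ordered from the root downwards."""
    labels = name.rstrip(".").split(".")
    chain = ["."]
    for i in range(len(labels) - 1, -1, -1):
        candidate = ".".join(labels[i:]) + "."
        if candidate in zones:
            chain.append(candidate)
    return chain


def count_queries(name, zones, known=None):
    """Count the queries needed to resolve `name` from a cold cache."""
    known = set() if known is None else known  # nameservers with a known address
    chain = enclosing_zones(name, zones)
    queries = 0
    for i, zone in enumerate(chain):
        queries += 1  # ask a server of `zone` about `name`
        if i + 1 == len(chain):
            break  # this server is authoritative and answers
        child = chain[i + 1]
        ns = zones[child]
        if in_domain(ns, child):
            known.add(ns)  # the referral carries usable GLUE
            if ns == name:
                break  # the GLUE already is the address we wanted
        elif ns not in known:
            # No trusted GLUE: restart from the root for the nameserver name.
            # (A cyclic dependency between zones would recurse forever here.)
            queries += count_queries(ns, zones, known)
            known.add(ns)
    return queries


in_domain_cost = count_queries("www.blog.test.", ZONES)
out_of_domain_cost = count_queries("www.shop.test.", ZONES)
print(in_domain_cost, out_of_domain_cost)
```

With this toy data the in-domain name costs 3 queries and the out-of-domain one costs 5. The two extra queries are the restart from the root needed to find the address of ns1.dnshost.example.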

This is the optimal setup for people who are deep into DNS and care deeply about latency, and in fact, if you check the Alexa Top 10 domains, most of them use in-domain nameservers, because it’s simple and reduces latency.

However, what would be a good setup for normal domain holders? My recommendation would be to split the responsibility between several entities to reduce the risk of one of them being under attack (or just making a mistake, because mistakes, well, happen): pick one or two stable DNS providers that each use nameservers in a single domain, so your nameserver set contains one or two domain names. You also don’t have to worry about latency, because nameservers shared among many domains will be cached very quickly.
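A rough way to check a nameserver set against this recommendation is sketched below in Python. The helper names are hypothetical. Treating "the nameserver name minus its leftmost label" as the hosting domain is a heuristic of mine, and only direct dependencies are counted (not, say, the .au names hiding behind .rocks).

```python
def nameserver_domains(nameservers):
    """Domains hosting the nameservers: each name minus its leftmost label."""
    domains = set()
    for ns in nameservers:
        labels = ns.rstrip(".").lower().split(".")
        domains.add(".".join(labels[1:]))
    return domains


def nameserver_tlds(nameservers):
    """Top-level domains (rightmost labels) the nameserver names depend on."""
    return {ns.rstrip(".").lower().rsplit(".", 1)[-1] for ns in nameservers}


# Nameserver sets quoted earlier in this post.
CLOUDFRONT_NS = [
    "ns-666.awsdns-19.net.",
    "ns-418.awsdns-52.com.",
    "ns-1597.awsdns-07.co.uk.",
    "ns-1306.awsdns-35.org.",
]
UDP53_NS = ["trubka.network.cz.", "master.dns.rocks."]

for label, ns_set in (("cloudfront.net", CLOUDFRONT_NS), ("udp53.cz", UDP53_NS)):
    print(label, len(nameserver_domains(ns_set)), "domains,",
          len(nameserver_tlds(ns_set)), "TLDs")
```

The cloudfront.net set comes out at four domains in four TLDs, and the udp53.cz set at two domains in two TLDs.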
