Follow the DNS

It is no longer “trending”, but at the dawn of the millennium, increasing globalization, together with the rise of modern technology and especially the Internet, gave birth to the term “Follow the Sun”. For the young, or the old and forgetful, here is what it was all about. Online services usually require continuous operation and worldwide accessibility, yet a service may stop working or become inaccessible to some users. Anytime. How do you provide technical support for such a service without forcing employees in one time zone to be awake at night? Spread the workers around the world so that there is always someone for whom it is daytime (the Sun is over their head) and who can provide support for the online service. And if that worker can’t solve the issue, they pass it on to the next one in the direction of the moving Sun, who finishes the job. The fact that the time needed to resolve a request was then measured not in hours but in the number of revolutions the request made around the Earth is not so important.

What made me remember that? This year we experienced something similar while strengthening the DNS infrastructure for the .CZ domain. But let’s start from the beginning. After significant upgrades of the anycast for the .CZ domain in previous years (see for example this article), when we put into operation two DNS nodes, each with a capacity of 100 Gbps to NIX.CZ and with correspondingly powerful DNS stacks, we promised to change our approach from then on. Namely, to focus in the future on strengthening our DNS anycast in underserved locations that combine high DNS traffic with high latency. To be able to select such locations well, we have started to use DNS traffic analyses created thanks to the ADAM project, as well as the results of the master’s thesis of my colleague Lukáš, which addresses this issue.

Simply put, using the real DNS traffic of the entire .CZ zone, we can calculate where it would be advantageous to build another DNS node, so that a large part of the traffic is rerouted there and the time a DNS resolver needs to communicate with our authoritative server, i.e. the above-mentioned latency, is significantly reduced. Lukáš described the method in more detail in his presentation at IT 20. In the spring of 2020, his analyses revealed that it would be appropriate to handle DNS traffic of the .CZ zone originating in South, Southeast and East Asia by connecting a DNS node to the SGIX peering node in Singapore, and that the dismantled node in the western United States should be replaced by an installation in Seattle connected to the local SIX.
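To make the idea more concrete, here is a minimal sketch of how such a selection could work. It is not the method from Lukáš’s thesis; it is a simplification that assumes each region’s resolvers end up at the node with the lowest RTT (real anycast routing follows BGP, not latency). All region names, site names, traffic shares and RTT values are hypothetical.

```python
from typing import Dict, List

def weighted_rtt(traffic: Dict[str, float],
                 rtt_ms: Dict[str, Dict[str, float]],
                 sites: List[str]) -> float:
    """Traffic-weighted mean RTT, assuming every region reaches its lowest-RTT site."""
    total = sum(traffic.values())
    return sum(
        queries * min(rtt_ms[region][site] for site in sites)
        for region, queries in traffic.items()
    ) / total

def best_new_site(traffic: Dict[str, float],
                  rtt_ms: Dict[str, Dict[str, float]],
                  existing: List[str],
                  candidates: List[str]) -> str:
    """Pick the candidate whose addition lowers the weighted RTT the most."""
    return min(candidates,
               key=lambda c: weighted_rtt(traffic, rtt_ms, existing + [c]))

# Hypothetical query volumes per region (arbitrary units).
traffic = {"europe": 900, "se_asia": 80, "us_west": 20}

# Hypothetical RTT in milliseconds from each region to each possible site.
rtt_ms = {
    "europe":  {"prague": 10,  "singapore": 180, "seattle": 150},
    "se_asia": {"prague": 200, "singapore": 20,  "seattle": 160},
    "us_west": {"prague": 160, "singapore": 170, "seattle": 15},
}

choice = best_new_site(traffic, rtt_ms, ["prague"], ["singapore", "seattle"])
print(choice)
```

With these made-up numbers, the site serving the larger distant region wins, even though both candidates cut latency for their own region by a similar amount. That is the point of weighting by real traffic.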

To finish my story about the Sun: in the summer we also dealt with the worsened availability of one of the four CZ DNS anycasts from the Czech Republic by installing a DNS node in Bratislava and connecting it to NIX.SK and NIX.CZ. And so it happened that around one or two in the morning we could start arranging the connection of the servers in Singapore, our normal working hours were “filled with Bratislava”, and in the evening we closed the circle with Seattle. Of course, this is a bit of an exaggeration, but it’s hard not to respond to an e-mail when you know that postponing the answer until tomorrow means an entire day of delay. All the more so because arranging the purchase, delivery, installation and connection of servers remotely is, to put it mildly, complicated.

That would be all for my blogpost; I have explained how we “Followed the Sun and DNS” in the summer and fall. But decent stories are supposed to have a good ending, and since I have one for mine, why not share it. The picture below shows how the availability of our .CZ anycast DNS changed between 14 days in October 2019 and 4 days in November 2020. We realize that this is a relatively inaccurate view, because the periods differ in length, but even these first data show that the weighted RTT (Round-Trip Time, the time required for communication between the recursive and the authoritative server) has been significantly reduced both in SE Asia (and Asia in general) and in Europe. RTT did not change significantly in the US, which was the desired result, because last October, instead of a DNS node in Seattle, we had a node in Redwood City, California, i.e. also on the West Coast. Yes, RTT in Micronesia worsened year-on-year, but given the volume of local DNS traffic for the .CZ domain, this is an acceptable loss. After all, in addition to the reduction in RTT, we also evaluate the costs incurred for all installations.
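For readers who want to see what a weighted RTT might look like in practice, here is a minimal sketch. It assumes that “weighted” means weighted by the number of queries, which is one plausible reading and not necessarily the exact definition used in our analyses. The sample values are hypothetical.

```python
from typing import Iterable, Tuple

def weighted_mean_rtt(samples: Iterable[Tuple[float, int]]) -> float:
    """Mean RTT in ms, where each (rtt_ms, query_count) pair counts per query."""
    samples = list(samples)
    total_queries = sum(count for _, count in samples)
    if total_queries == 0:
        raise ValueError("no queries in sample")
    return sum(rtt * count for rtt, count in samples) / total_queries

# Hypothetical (rtt_ms, query_count) pairs for one region in two periods.
# The second period is shorter, so it simply contains fewer queries.
before = [(20.0, 1000), (200.0, 100)]
after = [(20.0, 300), (30.0, 30)]

print(round(weighted_mean_rtt(before), 1), round(weighted_mean_rtt(after), 1))
```

Because the result is normalized by the number of queries, two periods of different lengths can still be compared, although a shorter period gives a noisier estimate.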

And to illustrate this better, I will list all the DNS nodes that we have added or upgraded this year. In this article, I left out part of that work for a smoother transition to “Following the Sun”. In all cases, these are installations of physical servers, and as can be seen in the table, they are all different. Yes, we really care about diversity in DNS.

| Location | Anycast | IX / IP transit | OS | DNS | BGP | Server | CPU | Comment |
|---|---|---|---|---|---|---|---|---|
| Seattle, WA, USA | A | 40G / 10G | Ubuntu 20 | KNOT (XDP) | FRR | 5xDELL | AMD | replacement for Redwood City, CA, USA |
| Frankfurt, Germany | D | 10G / 1G | Ubuntu 20 | NSD | FRR | 1xHPE | AMD | upgrade of the Interxion location |
| Prague, Czech Republic | C | 10G / n/a | Debian 10 | KNOT | BIRD | 1xDELL | Intel | Local Anycast CESNET |
| Bratislava, Slovakia | C | 10G / n/a | Ubuntu 20 | Bind | BIRD | 1xDELL | Intel | |
| Singapore | B | 10G / 1G | Debian 10 / Ubuntu 20 | NSD / Bind | FRR / FRR | 2xDELL | Intel | |
| Private | A | 10G / 10G | Ubuntu 20 | KNOT (XDP) | BIRD | 1xDELL | Intel | first XDP instance |