Building a DNS resolver in Go from scratch - Part 1
Introduction
DNS is one of those systems that is so ubiquitous it hides in plain sight. In 2018 Google's DNS (8.8.8.8) was handling over a trillion queries per day (more than 11 million per second on average). If distributed systems had Ballon d'Ors, in my opinion the global DNS system would claim it easily (sry Vini).
Essentially DNS is a translation/lookup service: we ask it about a domain name and it replies with the IP address behind that name.
I wanted to build a simple resolver that performs DNS lookups with no dependencies using Go. In this series we'll go through that journey together.
Israel is committing a genocide against the people of Palestine.
Find out how you can help https://techforpalestine.org/learn-more/
Free Palestine 🇵🇸 End the occupation 🇵🇸
If you want to dive straight into the code start with part 2
View the full code
DNS components/terminology
Before we start, let's discuss some terms we'll come across and get a feel for how the DNS resolution process flows.
All the terminology related to DNS is explained in RFC 8499 (definitely give it a read, it's very approachable). I will try to summarize the most important bits we will need in our journey.
Domain name
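A domain name is the human-readable name we're trying to find the IP address of (strictly speaking, a URL contains a domain name plus a scheme and a path). Taking mail.google.com as an example, it consists of a:
TLD, top-level domain (.com, .net, .org, etc.): com in our example.
SLD, second-level domain, usually referred to as the domain: google.
Subdomain, a nesting inside the SLD: mail.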
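To make the breakdown concrete, here is a minimal sketch that splits a name into those three parts. The `splitName` helper and the `nameParts` type are hypothetical names made up for illustration, and the code naively assumes a single-label TLD, so names under multi-label suffixes like co.uk would be misclassified.

```go
package main

import (
	"fmt"
	"strings"
)

// nameParts holds the pieces of a domain name described above.
type nameParts struct {
	TLD        string   // e.g. "com"
	SLD        string   // e.g. "google"
	Subdomains []string // e.g. ["mail"], ordered left to right
}

// splitName is a hypothetical helper that splits a name on dots.
// Assumption: the TLD is always the last single label.
func splitName(name string) (nameParts, error) {
	// Names are case-insensitive and may carry a trailing root dot.
	name = strings.TrimSuffix(strings.ToLower(name), ".")
	labels := strings.Split(name, ".")
	if len(labels) < 2 {
		return nameParts{}, fmt.Errorf("need at least an SLD and a TLD, got %q", name)
	}
	for _, l := range labels {
		if l == "" {
			return nameParts{}, fmt.Errorf("empty label in %q", name)
		}
	}
	n := len(labels)
	return nameParts{
		TLD:        labels[n-1],
		SLD:        labels[n-2],
		Subdomains: labels[:n-2],
	}, nil
}

func main() {
	p, err := splitName("mail.google.com")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("TLD=%s SLD=%s subdomains=%v\n", p.TLD, p.SLD, p.Subdomains)
}
```

Running it on mail.google.com prints `TLD=com SLD=google subdomains=[mail]`.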
Nameservers
A DNS nameserver is a server that has knowledge about a bunch of domain names, or can refer us to another nameserver that knows the answer. The DNS system is designed such that the nameservers form a hierarchy.
This hierarchy, called the DNS namespace, goes from general to specific as we navigate it. A DNS query that's performed with no caching navigates this hierarchy until it finally resolves the name's address.
For example, if we're looking for the IP of stackoverflow.com, and we ask the nameserver responsible for the .com TLD records, this nameserver won’t know the direct answer to our question, but it can refer us to another nameserver that does. So it responds with a referral to the nameserver responsible for stackoverflow, which we then ask to get our answer.
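The referral chain can be sketched with a toy, in-memory model of the hierarchy. Nothing here touches the network: the server names ("root", "com-tld", "stackoverflow-auth") and the IP address are made up for illustration (203.0.113.10 comes from a range reserved for documentation), and `resolve` is a hypothetical helper, not how a real resolver is implemented.

```go
package main

import "fmt"

// response is what a simulated nameserver returns: either a final
// answer or a referral to another nameserver.
type response struct {
	Answer   string // IP address, set when the server knows the answer
	Referral string // next nameserver to ask, set otherwise
}

// servers is a toy stand-in for the DNS hierarchy.
// All names and the address are invented for this example.
var servers = map[string]map[string]response{
	"root":               {"stackoverflow.com": {Referral: "com-tld"}},
	"com-tld":            {"stackoverflow.com": {Referral: "stackoverflow-auth"}},
	"stackoverflow-auth": {"stackoverflow.com": {Answer: "203.0.113.10"}},
}

// resolve starts at the root and follows referrals until a server
// answers. It returns the answer and the list of servers it asked.
func resolve(name string) (ip string, path []string, err error) {
	server := "root"
	for hops := 0; hops < 10; hops++ { // cap the walk to avoid referral loops
		path = append(path, server)
		resp, ok := servers[server][name]
		if !ok {
			return "", path, fmt.Errorf("%s knows nothing about %s", server, name)
		}
		if resp.Answer != "" {
			return resp.Answer, path, nil
		}
		server = resp.Referral
	}
	return "", path, fmt.Errorf("too many referrals for %s", name)
}

func main() {
	ip, path, err := resolve("stackoverflow.com")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("asked:", path)
	fmt.Println("answer:", ip)
}
```

Each hop gets more specific, which is the general-to-specific walk described above.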
The DNS namespace breaks down to:
Root Nameservers.
TLD Nameservers.
Authoritative Nameservers
Root Nameservers
The root nameservers are the starting point for any uncached DNS query. There are 13 of them, each replicated many times around the world and operated by different organizations. The organizations running them and the server locations around the world are published online.
Asking a root DNS nameserver about a domain name returns a response pointing to the TLD nameservers this domain is under.
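As a small aside that is not spelled out above: the 13 root nameservers are named with the letters a through m under root-servers.net. A minimal sketch that generates those hostnames (the `rootServerNames` helper is a made-up name for illustration):

```go
package main

import "fmt"

// rootServerNames returns the hostnames of the 13 root nameservers,
// a.root-servers.net through m.root-servers.net.
func rootServerNames() []string {
	names := make([]string, 0, 13)
	for c := 'a'; c <= 'm'; c++ {
		names = append(names, string(c)+".root-servers.net")
	}
	return names
}

func main() {
	for _, name := range rootServerNames() {
		fmt.Println(name)
	}
}
```

Any one of these can serve as the starting point of a lookup.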
TLD Nameservers
TLD nameservers know about the authoritative nameservers responsible for domain names under a specific TLD. Asking the .com TLD about google.com returns the list of authoritative nameservers responsible for google domains.
TLDs are managed by ICANN (Internet Corporation for Assigned Names and Numbers), a nonprofit organization that delegates the maintenance responsibility to various organizations. At the time of writing, there are 1591 TLDs in the root nameservers' databases.
Authoritative Nameservers
The final stop: authoritative nameservers hold the records for the domains they're responsible for. They are usually run by ISPs, domain registrars (Namecheap, GoDaddy, etc.) or big companies that maintain their own DNS servers (Google, Meta, etc.).
For example, Google’s authoritative nameserver will know the direct answers for google.com, youtube.com, mail.google.com, etc., and if a website’s DNS records are managed through a registrar, then the registrar's authoritative nameserver will know where to find that website.
Resolvers
A resolver is a program or a server that extracts information from nameservers in response to client requests.
Resolution Modes: Recursive & Iterative
Resolvers and DNS servers can operate in one of two modes while serving a client's query, either iteratively or recursively.
Recursively means that the resolver will pursue the client's query on behalf of the client until it finds the answer.
For example, if we ask Google's DNS resolver about example.com, assuming it has no previous caching of the answer, it will pursue the query for us and ask all the necessary nameservers on our behalf. This mode of operation is called recursive.
On the other hand, if we ask the root nameserver directly for example.com, it responds with the list of TLD nameservers, expecting us to then ask them ourselves. This is an iterative resolution, because the root nameserver only sent a referral and did not find the answer for us.
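On the wire, a client signals which mode it would like through the "Recursion Desired" (RD) bit in the DNS header flags, defined in RFC 1035 (this detail comes from the RFC, not from the text above). Here is a minimal sketch that builds the 12-byte header of a query with and without that bit. The `buildHeader` helper and the ID 0x1234 are arbitrary choices for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// rdFlag is the Recursion Desired bit in the 16-bit header flags
// (RFC 1035, section 4.1.1).
const rdFlag uint16 = 1 << 8

// buildHeader returns the 12-byte DNS header for a query carrying
// exactly one question. All other counts stay zero.
func buildHeader(id uint16, recursionDesired bool) []byte {
	var flags uint16
	if recursionDesired {
		flags |= rdFlag
	}
	h := make([]byte, 12)
	binary.BigEndian.PutUint16(h[0:2], id)    // ID, echoed back in the response
	binary.BigEndian.PutUint16(h[2:4], flags) // QR=0 (query), opcode=0, RD as requested
	binary.BigEndian.PutUint16(h[4:6], 1)     // QDCOUNT: one question
	return h
}

func main() {
	fmt.Printf("recursive: % x\n", buildHeader(0x1234, true))
	fmt.Printf("iterative: % x\n", buildHeader(0x1234, false))
}
```

A server is free to ignore the request: root and TLD nameservers answer with referrals regardless, which is why a resolver that starts at the root has to do the walking itself.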
This diagram pretty much sums up the DNS cycle. There might be more than one lookup at any given step, but the outlined flow remains the same.
What we will Build
In the next part, we'll be building a simple resolver that resolves a DNS query recursively starting from any of the root nameservers. We'll go over constructing the DNS request packet, parsing the response coming from the nameservers and acting on the response we get until we get our answer.
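As a small taste of what constructing the request packet involves, here is a sketch of how a domain name is encoded in the question section according to RFC 1035: each label is prefixed with its length, and a zero byte terminates the name. The `encodeName` helper is a made-up name for this example and may differ from what the next part ends up using.

```go
package main

import (
	"fmt"
	"strings"
)

// encodeName converts "example.com" into the DNS wire format:
// a length byte followed by the label bytes, repeated per label,
// and terminated by a zero byte (the root label).
func encodeName(name string) ([]byte, error) {
	name = strings.TrimSuffix(name, ".")
	var out []byte
	if name != "" {
		for _, label := range strings.Split(name, ".") {
			// RFC 1035 limits each label to 63 bytes.
			if len(label) == 0 || len(label) > 63 {
				return nil, fmt.Errorf("invalid label %q in %q", label, name)
			}
			out = append(out, byte(len(label)))
			out = append(out, label...)
		}
	}
	return append(out, 0), nil
}

func main() {
	b, err := encodeName("example.com")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("% x\n", b)
}
```

For example.com this yields 13 bytes: 7, the bytes of "example", 3, the bytes of "com", and a final 0.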
Written by
Mostafa Ahmed
Free Palestine, end the genocide 🇵🇸 https://techforpalestine.org/learn-more/