What happens when you type google.com in your browser and press Enter


Concepts

  • DNS: Stores the mapping of domain names to IP addresses.

  • TCP/IP: The protocol suite that carries data between devices on the internet.

  • Firewall: A security layer that filters incoming and outgoing traffic.

  • HTTPS/SSL: A security and encryption layer for data in transit.

  • Load balancer: Distributes incoming requests across multiple servers.

  • Web server: Delivers content, such as web pages, to users. It is where the website lives.

  • Application server: Runs the application code that generates dynamic content, usually working alongside a web server.

  • Database: Stores data such as user information.


Intro

Have you ever wondered what happens between the moment you type a URL like google.com into your browser's address bar and the moment you receive a web page as a response?

You might assume it happens automatically, but the interesting part is that a sequence of events unfolds within that brief interval.

It is very important to have at least a basic understanding of the steps involved in completing a request.

This is a sample request flow diagram:

Web Infrastructure Diagram

The browser

We all know what a browser is, right? Our journey begins when we enter a URL like google.com into the browser's address bar and press Enter.

Every device on the internet has an IP address, which is like a unique label identifying it. This includes mobile phones, IoT devices, and servers (our main focus). When the browser gets a domain name like google.com, or a URL (which includes a domain name and extra details appended to it), it needs to find the matching IP address to know which server to communicate with. It does this by sending a request to a DNS server.

The DNS

As I mentioned earlier, to communicate with a server, you need its IP address. However, you would agree that it would be difficult for users to remember addresses like 12.34.4.98. Imagine wanting to visit YouTube and having to remember 142.250.178.174 every single time. That is definitely not user-friendly.

That's why something called a domain name was created for convenience. If an IP address is like the real name of a server, then a domain name is like a nickname. A working domain name resolves to one or more actual IP addresses. This way, all the user needs to remember is a nice-looking nickname like google.com or youtube.com instead of 142.250.178.174.

Now, the question is: how does the browser know the actual address of the target server if you only provide a nickname, i.e., a domain name? This is the job of the DNS (Domain Name System), which works like a worldwide directory where the records mapping domain names to their corresponding IP addresses are stored. So all the browser needs to do is ask the DNS, "What is the IP address of the domain name google.com?" Then the DNS responds with something like 216.58.223.228. Now the browser knows where the server is and can communicate with it.
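To make the lookup step concrete, here is a minimal Python sketch that asks the operating system's resolver for the IP addresses behind a name. The function name resolve is made up for this example; socket.getaddrinfo is part of Python's standard library. A browser has its own resolver and cache, so treat this as an illustration of the idea and not as what a browser literally runs.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Return the unique IP addresses the system resolver reports for hostname."""
    # getaddrinfo asks the operating system, which in turn consults
    # its cache, the hosts file, or a DNS server.
    results = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in results:
        ip = sockaddr[0]  # first element of the socket address is the IP string
        if ip not in addresses:
            addresses.append(ip)
    return addresses
```

Calling resolve("google.com") on a machine with internet access returns a list of addresses. The exact values differ by location and over time, so none are shown here.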

Moving to the next step, the browser sends the user's request to the server using either HTTP or HTTPS.

HTTP or HTTPS

Before any data is exchanged, the browser opens a TCP connection to the server's IP address. It then sends the request over that connection using either HTTP or HTTPS.

An HTTP request is a simple, unencrypted communication between the server and the browser. This means that if someone intercepts the request, they can see all the data being sent, which is not secure.

HTTPS, however, is a more secure way to communicate. The data exchanged between the browser and the server is encrypted and can only be read by them. So, if someone intercepts the request, they only see encrypted data that they can't understand or decrypt. An SSL certificate (today technically a TLS certificate, since TLS is the successor to SSL) lets the browser verify the server's identity and set up this encryption.
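The difference between the two can be sketched in a few lines of Python. The request text is the same in both cases; with HTTPS the socket is simply wrapped in TLS before anything is sent. The names build_get_request and fetch are made up for this example, and a real browser sends many more headers, so read this as a simplified sketch.

```python
import socket
import ssl
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def build_get_request(host: str, path: str = "/") -> bytes:
    """Build the raw text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # a blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(url: str, timeout: float = 5.0) -> bytes:
    """Send a GET request over plain TCP (http) or TLS (https)."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    sock = socket.create_connection((host, port), timeout=timeout)
    if parts.scheme == "https":
        # The default context verifies the server's certificate.
        context = ssl.create_default_context()
        sock = context.wrap_socket(sock, server_hostname=host)
    with sock:
        sock.sendall(build_get_request(host, parts.path or "/"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

With a plain http URL, anyone on the network path could read the bytes produced by build_get_request. With an https URL, the same bytes travel inside the encrypted TLS connection.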

Firewall

Any request going to the server passes through the firewall. The firewall can run on the actual server, on the load balancer, or on a separate device in front of them.

The firewall adds an extra layer of security to the server, protecting the data and preventing malicious actions. For example, if a known hacker tries to access the website hosted on the server, their actions can be blocked by blacklisting the hacker's device. This is done by using the firewall to add the hacker's device IP address to the blacklist, effectively rejecting any requests from the hacker.
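The blacklisting idea can be sketched with Python's standard ipaddress module. The blocked addresses below are placeholders taken from the ranges reserved for documentation, and is_allowed is a made-up name. Real firewalls work at the network level with far richer rules, so this only illustrates the check itself.

```python
import ipaddress

# Hypothetical blacklist: one single address and one whole range.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.7/32"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_allowed(client_ip: str) -> bool:
    """Return False if the client's IP falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return not any(address in network for network in BLOCKED_NETWORKS)
```

A request from 203.0.113.7 would be rejected, while one from an address outside both entries would be let through to the next step.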

Load Balancer

When the request successfully passes through the firewall, it either goes directly to the main server or to something called the load balancer.

I'm sure you've visited a site that was unusually slow. This often happens when the server receives more requests than it can handle at that moment. Imagine thousands of users trying to access a website all at once. That could easily overwhelm a single server. In the end, a server is just like your regular PC: when you try to run more applications than it can handle at once, it starts to lag or might even crash completely. These situations can be fixed by adding a load balancer to the system.

Literally, a load balancer balances load across multiple servers. This is done by having multiple instances of the actual server that hosts the website. This way, there are multiple servers available for users to get the website content from. As the number of users grows, we can increase the number of servers providing this service, so it is no longer dependent on a single server. However, here comes the issue. The browser only has the address of one server, but all the duplicate servers have their own IP addresses. So, what happens is that the IP address the browser has is actually the address of the load balancer.

The load balancer acts like a server with the sole purpose of forwarding requests to a group of identical servers that host the website, web page, or API. One common strategy is to direct each new request to the least busy server, which effectively distributes the load across all the servers in the group. Other strategies, such as taking turns (round robin), are also used.
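Here is a small Python sketch of the "least busy server" strategy described above. The class name, its methods, and the server addresses are all made up for illustration. Real load balancers such as NGINX or HAProxy also handle health checks, timeouts, and much more.

```python
class LeastConnectionsBalancer:
    """Pick the backend server that currently has the fewest active requests."""

    def __init__(self, servers):
        # Track how many requests each server is handling right now.
        self.active = {server: 0 for server in servers}

    def acquire(self):
        """Choose the least busy server and count the new request against it."""
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        """Call this when a request finishes so the server's count goes down."""
        self.active[server] -= 1
```

For example, with two servers, the first two calls to acquire() return one server each. If the first server finishes its request and release() is called for it, the next acquire() picks that server again because it is now the least busy.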

The Web Server

The server receives the user’s request, either sent directly from the browser or passed along by a load balancer.

This is where the actual cooking happens. The server has all the ingredients and tools needed to prepare the content you’ll see in your browser. It might use NGINX to serve static files, an application server running a Django (Python web framework) app to handle business logic, a database to store and fetch data, and other essential components.

So basically, this is where all the processing occurs, and the response is prepared before being sent back to the user.
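As a rough idea of what the application part of that "cooking" looks like, here is a tiny WSGI application in plain Python. WSGI is the standard interface that Python frameworks like Django use to talk to a web server. The routes, the USERS dictionary standing in for a database, and the helper call_app are all invented for this example.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical in-memory stand-in for the database.
USERS = {"1": "Ada", "2": "Grace"}


def application(environ, start_response):
    """Minimal WSGI app: look at the request path and build a response."""
    path = environ.get("PATH_INFO", "/")
    if path == "/":
        status, body = "200 OK", "<h1>Home page</h1>"
    elif path.startswith("/users/"):
        user = USERS.get(path[len("/users/"):])  # "database" lookup
        if user is None:
            status, body = "404 Not Found", "No such user"
        else:
            status, body = "200 OK", f"<h1>Hello, {user}</h1>"
    else:
        status, body = "404 Not Found", "Page not found"
    payload = body.encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ]
    start_response(status, headers)
    return [payload]


def call_app(path):
    """Drive the app without a network, the way a web server would call it."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(application(environ, start_response))
    return captured["status"], body.decode("utf-8")
```

To try it in a real browser, the standard library's wsgiref.simple_server.make_server can serve application on a local port. In production, a web server like NGINX would usually sit in front of an application server that runs this kind of code.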

Coming Back

The processed response is sent back to the browser through the same path it took to arrive initially. This could be (web server) → (load balancer) → (firewall) → (browser) or (web server) → (firewall) → (browser), depending on the path the request originally followed.

What the user sees

Now you have a basic understanding of what happens behind the scenes when you try to visit a web page. Congratulations, you are now a techie! 😅

Now you can probably guess what went wrong when you see a connection error in your browser. Take, for instance, the error page below. Just before the refresh button, you can see the error code "ERR_CONNECTION_REFUSED." This means the domain name was resolved and the server's address was reached, but nothing accepted the connection. Most likely there is something wrong on the server side; maybe the server was brought down, it crashed, or a firewall is rejecting the request.

Other common codes include "ERR_NAME_NOT_RESOLVED," which means the DNS lookup failed, most likely because the domain name in your URL is incorrect, and "ERR_INTERNET_DISCONNECTED," which means there is something wrong with your own internet connection.
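The same failure cases show up when you open a connection from code. The sketch below maps Python's standard socket exceptions to labels that roughly correspond to the browser codes above. The function name diagnose and the labels are made up, and the mapping is an approximation, not how a browser actually classifies errors.

```python
import socket


def diagnose(host: str, port: int = 443, timeout: float = 3.0) -> str:
    """Try to open a TCP connection and describe what happened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except socket.gaierror:
        # The name lookup failed (similar to ERR_NAME_NOT_RESOLVED).
        return "name not resolved"
    except ConnectionRefusedError:
        # The host was reached, but nothing is listening on that port
        # (similar to ERR_CONNECTION_REFUSED).
        return "connection refused"
    except OSError:
        # Timeouts, unreachable networks, and other connection problems.
        return "network problem"
```

The order of the except clauses matters, because socket.gaierror and ConnectionRefusedError are both subclasses of OSError and must be checked first.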

Conclusion

This is just an overview of the steps involved between a request and a response on the web. A typical web infrastructure might have more components or even fewer.

Written by

Abdurrahman Adenowo