How Does the Browser Work?

What is a Browser? Why is it important to know about the browser?

When we think of a browser such as Chrome, Brave or Safari, we tend to consider it just another piece of software. This undersells the browser’s real capabilities and the attention it deserves. It would not be wrong to call the browser a near-OS, as it has so many capabilities: controlling the network, enabling interaction, displaying and storing data, managing its own timers and so on. Thus, there is a lot more to the browser than we think, which we will discuss in this article.

The functional architecture of a browser can be divided into the following parts:

  1. Data : Data is stored in the browser in forms such as local storage, cookies and session storage. Nowadays even entire scripts (JavaScript code) can be stored in and executed from the browser’s data. The memory used to execute programs in the browser is also a part of this section.

  2. User Interface : It is the part of the browser which is directly exposed to the end-user for display of web pages and interaction with those pages.

  3. Browser Engine : This is the most interesting and the most critical part of the browser. The Browser Engine is responsible for all of its inner workings. It can be further divided into the following:

    1. Rendering Engine : It is the part responsible for rendering HTML and CSS. It builds the Document Object Model (DOM) and paints the page on the screen.

    2. JavaScript Engine : The JavaScript engine is the part which deals with JavaScript execution. Chromium-based browsers such as Chrome and Brave use the V8 engine, while Firefox uses SpiderMonkey and Safari uses JavaScriptCore. V8 is the same engine that powers Node and Deno; Bun is built on JavaScriptCore.

    3. Network Engine : It deals with everything network-related in a web application, like sending requests, receiving responses, interpreting status codes, and handling protocols and connections such as HTTPS and WebSockets.

    4. Timer Engine : This is the part which deals with the timing of web applications. The setTimeout and setInterval functions that we use in our JavaScript are not part of the JavaScript engine itself; they are web APIs provided by the browser. Other runtime environments like Node and Deno provide their own implementations of the same timer functions.
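The point about timers can be seen in a minimal sketch. It assumes any host that provides setTimeout (a browser console or Node); the names order and runTimerDemo are made up for this illustration. Even with a delay of 0, the callback is handed to the host and runs only after the current script has finished.

```javascript
// Records the order in which things actually run.
const order = [];

function runTimerDemo() {
  order.push("script start");
  // Hand the callback to the host's timer. A delay of 0 still does not run it immediately.
  setTimeout(() => {
    order.push("timer callback");
    console.log("after the timer fired:", order);
  }, 0);
  order.push("script end");
}

runTimerDemo();
// At this point the timer callback has not run yet.
console.log("right after the call:", order);
```

The first log contains only the two synchronous entries; the timer callback is appended later, once the host decides to run it.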

In this article, we will focus on the Rendering Engine of the browser, since we are looking at the browser in the context of web pages.

What is HTML actually?

Most of us think HTML (HyperText Markup Language) is simply about writing <h1> or <p> tags, but it is not that simple. Ever wondered how a simple markup language (whether HTML counts as a programming language is still debated), which has no access to memory and no access to the pixels of our screens, can render entire complex web pages containing animations, gradients, images, videos etc. on our screen?

HTML, which creates the skeleton of web pages, does not render anything by itself. It is parsed by the browser’s rendering engine, which in engines such as Blink and WebKit is written largely in C++. The HTML we write does not stay in the browser as HTML text; the parser converts it into in-memory objects inside the engine, and those objects are then exposed to JavaScript. This can be seen by running a simple line such as document.querySelectorAll("h2") in the console of any Wikipedia page to get all of its <h2> tags.

We can clearly see that we get a NodeList in return. But we never wrote any NodeList, right? We wrote HTML, which the engine converted into its own objects.

It would not be wrong to say that the tags we write in our .html files are not, by themselves, what renders web pages in our browsers. There is a lot more to understand about it, which we will discuss in this article.

How does the Rendering Engine actually work?

The rendering engine, as discussed earlier, is responsible for painting web pages on our browser screens. It is the Rendering Engine’s job to understand the HTML and generate the Document Object Model.

The two ultimate jobs of the Rendering Engine are displaying the content and enabling interaction with the content.

This process occurs in the following steps:

  • We write our .html file, which is considered to be the DOCUMENT that will be rendered by the browser.

  • The initial step is loading the file into the browser. This file could be anywhere on the internet, on some server, in some database or even on the local machine. Irrespective of where the file is located, the browser always treats it as a resource from “another machine”, even if it is present on the local machine. That is one reason why I have called the browser a near-OS.

  • The loaded file arrives as raw bytes, i.e. the 0s and 1s in which our computer stores and understands it.

  • The next step is character decoding, i.e. converting these raw bytes into characters of languages like English, Hindi or Japanese. To do this, encoding standards like UTF-8 (which is backward compatible with ASCII) are used.

  • The next step is tokenisation, i.e. splitting the characters into language-specific tokens. In programming languages these tokens include keywords like if, else, for and while. In HTML the tokens are start and end tags such as <h1>, <p>, <html>, <body>, <head> and <style>, along with their attributes and the text between them.

  • Now, these tokens are converted into OBJECTS in a format like the one below:

      [
          {
              tag: "h1",
              title: "something",
              value: "something"
          },
          {
              tag: "head",
              title: "something",
              value: "something"
          }
          // similarly for all tags like body, html, style etc.
      ]
    
  • These objects on their own, as a flat and unrelated pile, will not work. In the next step, they are arranged with respect to their relations, such as “Siblings”, “Children” and so on. This process can be called MODELLING the OBJECTS created from the DOCUMENT, which gives us the term Document Object Model or DOM.

  • The exact same steps are followed for CSS files: loading raw bytes, decoding to characters, tokenisation, and arranging these tokens with respect to their relations. Ultimately the CSS Object Model, or CSSOM, is created.
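The steps above can be sketched end to end in a few lines of JavaScript. This is only a toy illustration, not how a real engine is implemented: the functions tokenize and buildTree are hypothetical names, the tokeniser handles only plain tags without attributes, and real HTML parsing follows far more detailed rules. TextEncoder and TextDecoder are standard APIs available in browsers and in Node.

```javascript
// Step 1: raw bytes, as they would arrive from disk or the network.
const bytes = new TextEncoder().encode("<body><h1>Hi</h1><p>नमस्ते</p></body>");

// Step 2: decode the bytes into characters using UTF-8.
const html = new TextDecoder("utf-8").decode(bytes);

// Step 3: split the characters into start-tag, end-tag and text tokens.
function tokenize(source) {
  const tokens = [];
  const pattern = /<\/([a-z0-9]+)>|<([a-z0-9]+)>|([^<]+)/gi;
  for (const match of source.matchAll(pattern)) {
    if (match[1]) tokens.push({ type: "endTag", name: match[1].toLowerCase() });
    else if (match[2]) tokens.push({ type: "startTag", name: match[2].toLowerCase() });
    else if (match[3].trim()) tokens.push({ type: "text", value: match[3] });
  }
  return tokens;
}

// Steps 4 and 5: turn tokens into objects and link them as parents and children.
function buildTree(tokens) {
  const root = { tag: "#document", children: [] };
  const stack = [root]; // the currently open elements, innermost last
  for (const token of tokens) {
    const parent = stack[stack.length - 1];
    if (token.type === "startTag") {
      const node = { tag: token.name, children: [] };
      parent.children.push(node);
      stack.push(node);
    } else if (token.type === "endTag") {
      if (stack.length > 1) stack.pop();
    } else {
      parent.children.push({ tag: "#text", value: token.value });
    }
  }
  return root;
}

const tokens = tokenize(html);
const tree = buildTree(tokens);
console.log(JSON.stringify(tree, null, 2));
```

The stack of open elements is what turns a flat list of tokens into relations: whatever is on top of the stack when a new tag arrives becomes its parent, and tags that share a parent are siblings.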

💡
The rendering engine works on the creation of the DOM (Document Object Model). Whenever it encounters a <link rel="stylesheet"> tag, it starts fetching that stylesheet and building the CSSOM (CSS Object Model) while it continues to parse the HTML. Thus, we can consider that the browser engine works on the DOM and the CSSOM in parallel.
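To get a feel for what the CSSOM holds, here is a minimal sketch that turns a small stylesheet into rule objects. The function parseCss is a hypothetical helper written for this article; it assumes flat rules with no comments, at-rules or nesting, which a real CSS parser has to handle.

```javascript
// Turns "selector { property: value; ... }" text into a list of rule objects.
function parseCss(cssText) {
  const rules = [];
  const rulePattern = /([^{}]+)\{([^{}]*)\}/g;
  for (const [, selector, body] of cssText.matchAll(rulePattern)) {
    const declarations = {};
    for (const part of body.split(";")) {
      const index = part.indexOf(":");
      if (index === -1) continue; // skip empty or malformed pieces
      declarations[part.slice(0, index).trim()] = part.slice(index + 1).trim();
    }
    rules.push({ selector: selector.trim(), declarations });
  }
  return rules;
}

const cssom = parseCss("h1 { color: red; font-size: 2rem } p { display: none }");
console.log(JSON.stringify(cssom, null, 2));
```

Each rule keeps its selector and its declarations together, which is the information the engine later needs in order to match styles to DOM elements.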

Now, let’s take a pause and understand where we are. We have created the DOM and the CSSOM. Note that until this point, the DOM and the CSSOM are not aware of each other. This is where the concept of the Render Tree comes in.

Once the DOM and the CSSOM are created, the browser engine combines them into the Render Tree, which contains only the elements that will actually be displayed, together with their styles. Then its mathematical capabilities come into play. As CSS deals with a lot of sizes in pixels, percentages, rems and ems, the engine has to calculate the exact size and position of every element for the current screen; this step is called layout. Then the browser draws the elements on the screen. This process is called painting.
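A rough sketch of these two ideas, combining the DOM with styles and then doing the unit arithmetic, could look like this. The names buildRenderTree, toPixels and context are assumptions made for this example, styles are looked up by tag name only, and percentages are resolved against a container width, which is a simplification of what real layout does.

```javascript
// A tiny DOM and a tag-to-style lookup standing in for the CSSOM.
const dom = {
  tag: "body",
  children: [
    { tag: "h1", children: [] },
    { tag: "p", children: [] },
  ],
};
const styles = {
  body: { width: "50%" },
  h1: { "font-size": "2rem" },
  p: { display: "none" },
};

// Combines DOM and styles, leaving out anything that will not be displayed.
function buildRenderTree(node, styleMap) {
  const style = styleMap[node.tag] || {};
  if (style.display === "none") return null;
  const children = node.children
    .map((child) => buildRenderTree(child, styleMap))
    .filter((child) => child !== null);
  return { tag: node.tag, style, children };
}

// Converts a CSS length into pixels for a given context.
function toPixels(value, ctx) {
  const number = parseFloat(value);
  if (value.endsWith("rem")) return number * ctx.rootFontSize; // check rem before em
  if (value.endsWith("em")) return number * ctx.parentFontSize;
  if (value.endsWith("%")) return (number / 100) * ctx.containerWidth;
  if (value.endsWith("px")) return number;
  throw new Error("Unsupported unit in " + value);
}

const context = { rootFontSize: 16, parentFontSize: 20, containerWidth: 800 };
const renderTree = buildRenderTree(dom, styles);
console.log(JSON.stringify(renderTree, null, 2));
console.log(toPixels(renderTree.children[0].style["font-size"], context));
```

The <p> element is still part of the DOM, but it never reaches the render tree because its display is none, so nothing is calculated or painted for it.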

What is the role of the JavaScript engine in the rendering process?

The JavaScript engine plays an important role in the painting process, as the code in a <script> tag has the power to manipulate the entire DOM (as well as the CSSOM).

  • During the rendering process, the Browser Engine works to create the DOM. Whenever it encounters a <script> tag without the async or defer attribute, it halts the creation of the DOM and first executes the JavaScript code. DOM creation is halted because the JavaScript may entirely change the structure of the DOM.

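  • Sometimes the script is loaded asynchronously by adding the async attribute to the <script> tag. Such a script is downloaded without halting the parser and is executed as soon as it is available. Separately, in frameworks like React.js and Next.js there is a process called hydration, where the client-side JavaScript takes over the server-rendered HTML to make it interactive. In both cases, the JavaScript has the capability to modify the DOM.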
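The halting behaviour can be imitated with a toy parser. This is a simulation under simple assumptions, not browser code: parseWithScripts is a hypothetical function, the DOM is just a flat array of tag names, and a script is represented by a function that receives the DOM built so far.

```javascript
// Walks the tokens in order; a script token pauses "parsing" until it has run.
function parseWithScripts(tokens) {
  const dom = [];
  const log = [];
  for (const token of tokens) {
    if (token.type === "script") {
      log.push(`parser paused; script sees ${dom.length} element(s)`);
      token.run(dom); // the script may change the DOM built so far
      log.push("parser resumed");
    } else {
      dom.push(token.tag);
      log.push(`added <${token.tag}>`);
    }
  }
  return { dom, log };
}

const result = parseWithScripts([
  { type: "element", tag: "h1" },
  { type: "script", run: (dom) => dom.push("div") },
  { type: "element", tag: "p" },
]);
console.log(result.log.join("\n"));
console.log(result.dom);
```

The script can only see and change what has been parsed before it, which is why the <p> that comes after the script is added only once the script has finished.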

💡
The interaction of JavaScript and the CSSOM is less obvious than its interaction with the DOM. According to experts in the field like Hitesh Choudhary, JavaScript execution is halted until the CSSOM is ready. This matches how browsers behave in practice: a script that comes after a stylesheet waits for that stylesheet to be loaded and parsed, because the script may ask for style information.

Conclusion

Understanding the architecture and the working of the browser is essential for a developer. It helps us to develop better applications. In this article, we have discussed the architecture of the browser, the different engines working internally in the browser, how the entire “painting” process of a web page works and why the browser can be considered a near-OS.

Watch Hitesh Choudhary sir’s video to understand this concept better.

Written by

Agnibha Chakraborty