In my opinion, understanding the rendering process is important for web developers, as it can help identify performance issues and optimize the display of web pages. In this article, I will explain the steps browsers take and share performance tips to keep your application running smoothly.

One term that is quite common in this area is "Critical Rendering Path".

The critical rendering path is the sequence of steps the browser performs to turn HTML, CSS, and JavaScript into the first pixels rendered on the screen.

The rendering path involves the following steps:

Construct the Document Object Model (DOM) from the HTML.
Construct the CSS Object Model (CSSOM) from the CSS.
Run any JavaScript that alters the DOM or CSSOM.
Construct the render tree from the DOM and CSSOM.
Perform style and layout operations on the page to work out which elements fit where.
Paint the pixels of the elements in memory.
Composite the pixels if any of them overlap.
Physically draw all the resulting pixels to the screen.

1) Parse the HTML and build the DOM:

For example, suppose you open the URL of a website called example.com. The browser first sends a request to the server and downloads an HTML file. Like all data on the internet, the file arrives as packets of bytes.

When the browser receives the bytes of the HTML file, its parser first decodes them into characters and then groups the characters into tokens. Next, the tokens are converted into nodes, which are objects with specific properties. Finally, the nodes are linked together to form the DOM tree. The sequence is: bytes, characters, tokens, nodes, DOM.
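As a rough illustration of the tokens, nodes, and tree stages, here is a toy sketch in JavaScript. The function names tokenize and buildTree are made up for this example, and it only understands plain opening tags, closing tags, and text. A real browser follows the far more involved parsing algorithm of the HTML Standard.

```javascript
// Toy model only: no attributes, void elements, entities, or error recovery.

// Stage 1: characters -> tokens.
function tokenize(html) {
  const tokens = [];
  const pattern = /<\/([a-z0-9]+)>|<([a-z0-9]+)>|([^<]+)/gi;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    if (match[1]) {
      tokens.push({ type: 'endTag', name: match[1].toLowerCase() });
    } else if (match[2]) {
      tokens.push({ type: 'startTag', name: match[2].toLowerCase() });
    } else if (match[3].trim()) {
      tokens.push({ type: 'text', value: match[3].trim() });
    }
  }
  return tokens;
}

// Stage 2: tokens -> nodes, linked into a tree with a stack of open elements.
function buildTree(tokens) {
  const root = { name: '#document', children: [] };
  const openElements = [root];
  for (const token of tokens) {
    const parent = openElements[openElements.length - 1];
    if (token.type === 'startTag') {
      const node = { name: token.name, children: [] };
      parent.children.push(node);
      openElements.push(node);
    } else if (token.type === 'endTag') {
      if (openElements.length > 1) openElements.pop();
    } else {
      parent.children.push({ name: '#text', value: token.value });
    }
  }
  return root;
}

const sampleHtml = '<html><body><p>Hello <span>world</span></p></body></html>';
const sampleTokens = tokenize(sampleHtml);
const sampleTree = buildTree(sampleTokens);
console.log(JSON.stringify(sampleTree, null, 2));
```

The stack of open elements is what turns a flat token stream into nesting: a start tag pushes a new parent, and an end tag pops back to the previous one.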

While parsing the HTML content, if the parser encounters external resources like CSS and JavaScript files, then the browser needs to wait for these critical resources to download before it can complete the initial render. These resources include:

CSS in the <head> element (unless it has a media attribute that does not match the current device)
JavaScript in the <head> element (without async or defer attribute)

When the parser encounters a CSS file, it continues working on the DOM construction while downloading the CSS file in the background. However, the browser can only display the page once the CSS resources are downloaded and the CSS object model (CSSOM) tree is ready. This means the browser blocks content from displaying or rendering until the CSSOM is prepared. In other words, CSS is render-blocking.

The behavior of JavaScript files is a bit different. When the parser encounters a JavaScript file, it stops building the DOM until the script has been downloaded and executed. This is because the script might call document.write() (or other methods that depend on the DOM) and change how the following markup should be parsed. This means JavaScript is parser-blocking.

Therefore, it is commonly recommended to place <script> tags not inside the <head> tag but at the end of the document, just before the closing </body> tag, or to load them with the defer or async attribute.

So, now we learned two things:

CSS is render-blocking, and JavaScript is parser-blocking.

However, a CSS file linked with a media attribute whose value does not match the current device or viewport (for example, media="print") does not block rendering. The browser still downloads it, but at a lower priority and without holding up the first render.
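The rule about the media attribute can be sketched as a small check. This is a simplified model, not how any browser is implemented: isRenderBlocking and the link objects are hypothetical, and matchesMedia stands in for the browser's own media-query evaluation (in a page you could pass a function that wraps window.matchMedia).

```javascript
// Decide whether a stylesheet link holds up the first render (simplified).
function isRenderBlocking(link, matchesMedia) {
  if (link.rel !== 'stylesheet') return false;            // e.g. preload links
  if (!link.media || link.media === 'all') return true;   // applies everywhere
  return matchesMedia(link.media);                        // only if it applies now
}

// Stand-in for the browser: pretend we are on a screen, not printing.
const onScreen = (query) => query === 'screen';

const sheets = [
  { href: 'style.css', rel: 'stylesheet' },
  { href: 'print.css', rel: 'stylesheet', media: 'print' },
  { href: 'screen.css', rel: 'stylesheet', media: 'screen' },
];

const blockingSheets = sheets
  .filter((link) => isRenderBlocking(link, onScreen))
  .map((link) => link.href);
console.log(blockingSheets);
```

In this model only style.css and screen.css delay the first render; print.css is fetched without blocking it.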

Similarly, JavaScript files linked with the defer or async attribute do not block the parser while they download.

`async` and `defer` Attributes:

The defer attribute downloads the JavaScript in parallel to parsing, and the execution of the file will be delayed until the parsing of the document is complete. If multiple files have the defer attribute, they will be executed in the order that they were discovered in the HTML.

<script src="script.js" defer></script>

The async attribute downloads the JavaScript while the HTML is being parsed, and the file executes as soon as it is downloaded. This can happen during or after the parsing process, so the order of execution for async scripts is not guaranteed.

<script src="script.js" async></script>

Async scripts can be risky: the download happens in parallel, but execution runs on the main thread and pauses the parser, so a heavy script can still delay parsing and rendering. Use async for lightweight, independent scripts that depend neither on the DOM being complete nor on other scripts.
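To make the ordering rules concrete, here is a small simulation of when blocking, async, and defer scripts would run. It is a sketch under simplifying assumptions: the function executionOrder, the script names, and all timings are invented, execution time is treated as zero, and the pause an async script causes in the parser is ignored.

```javascript
// scripts: listed in document order. `at` is how much parse work (in ms)
// precedes the script tag, `downloadMs` is the download time.
function executionOrder(scripts, parseMs) {
  const events = [];   // { src, time } entries in the order they are scheduled
  const deferred = []; // defer scripts, kept in document order
  let clock = 0;       // wall-clock time
  let parsed = 0;      // parse work completed so far

  for (const s of scripts) {
    clock += s.at - parsed; // parse up to this script tag
    parsed = s.at;
    if (s.mode === 'async') {
      events.push({ src: s.src, time: clock + s.downloadMs }); // runs when downloaded
    } else if (s.mode === 'defer') {
      deferred.push({ src: s.src, readyAt: clock + s.downloadMs });
    } else {
      clock += s.downloadMs; // parser waits for the download
      events.push({ src: s.src, time: clock });
    }
  }
  clock += parseMs - parsed; // finish parsing the rest of the document

  // Deferred scripts run after parsing, in document order.
  let t = clock;
  for (const d of deferred) {
    t = Math.max(t, d.readyAt);
    events.push({ src: d.src, time: t });
  }
  return events.sort((a, b) => a.time - b.time).map((e) => e.src);
}

const order = executionOrder(
  [
    { src: 'a-defer.js', at: 10, mode: 'defer', downloadMs: 50 },
    { src: 'b-async.js', at: 20, mode: 'async', downloadMs: 5 },
    { src: 'c-blocking.js', at: 30, mode: 'blocking', downloadMs: 40 },
    { src: 'd-defer.js', at: 40, mode: 'defer', downloadMs: 1 },
  ],
  100
);
console.log(order);
```

With these made-up numbers the async script runs first because its download finishes first, the blocking script runs where the parser meets it, and the two deferred scripts run last in document order even though d-defer.js finished downloading long before a-defer.js would have mattered.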

`rel="preload"` Attribute:

Browsers can continue parsing while downloading external resources in the background. For example, you don't want the browser to wait for all the CSS before it starts downloading images; the two can happen in parallel. This means only downloading the images, not rendering them. When it is time to render, the browser will already have the images in its cache.

<link rel="preload" href="image.png" as="image" />

To mark a resource as important and make it more likely to be downloaded early, you can use a link tag with rel="preload". Note that preloading a stylesheet only fetches it; a regular rel="stylesheet" link is still needed to apply it.

<link href="style.css" rel="preload" as="style" />

There are other values for the rel attribute, such as preconnect, dns-prefetch, prefetch, and prerender. You can explore each of them on your own.

`fetchpriority` Attribute:

When a browser parses a web page and starts discovering and downloading resources like images, scripts, or CSS, it assigns them a fetch priority to download them in the best order. A resource's priority typically depends on its type and location in the document.

For example, images within the viewport might have a High priority, while early-loaded, render-blocking CSS using <link> tags in the <head> might have a Very High priority. Browsers are generally good at assigning priorities that work well, but they may not be optimal in every case.

The most important things to load for a good Largest Contentful Paint (LCP) score are the resources that impact the initial view. We can use the fetchpriority attribute to tell the browser to load these resources with a higher priority.

high: The resource is critical for the page load.
low: The resource is not critical for the page load.
auto: The browser decides the priority.

The HTML standard defines the attribute for <img>, <link>, and <script>. Support on <iframe> is not part of the standard, so treat it as best effort there.

<img src="image.png" fetchpriority="high" />
<link rel="stylesheet" href="style.css" fetchpriority="high" />
<script src="script.js" fetchpriority="low"></script>
<iframe src="iframe.html" fetchpriority="low"></iframe>
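The same hints can also be set from script. The sketch below combines rel="preload" with a high fetch priority for an image that is important to the first view. The helper name preloadImportantImage and the file name are hypothetical, and the document object is passed in as a parameter so the function can be tried outside a browser. Browsers that do not support fetchPriority simply ignore the property.

```javascript
// Adds <link rel="preload" as="image" fetchpriority="high"> to the head.
function preloadImportantImage(doc, href) {
  const link = doc.createElement('link');
  link.rel = 'preload';
  link.as = 'image';
  link.href = href;
  link.fetchPriority = 'high'; // DOM property behind the fetchpriority attribute
  doc.head.appendChild(link);
  return link;
}

// In a page, call it as early as possible, for example from an inline script.
if (typeof document !== 'undefined') {
  preloadImportantImage(document, 'image.png');
}
```

Use this sparingly: if everything is marked as high priority, the hint stops helping the browser choose.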

2) Parse the CSS and build the CSSOM:

When the parser encounters a CSS file while constructing the DOM, the browser requests the file and receives it as bytes. Just as with the HTML document, it converts the bytes into characters, then into tokens, and then into nodes. Finally, a tree structure known as the CSS Object Model, or CSSOM, is created.

[Figure: a flowchart with five labeled boxes in sequence: Bytes, Characters, Tokens, Nodes, and CSSOM. Arrows indicate the progression from one step to the next.]

The CSS Object Model (CSSOM) is a map of all CSS selectors and their properties, organized in a tree structure. This tree includes a root node and shows relationships like siblings, descendants, and children.

[Figure: a flowchart of the cascade of CSS styles in HTML elements. The "body" element sets a font size of 16px. This propagates to "p", "span", and "img" elements with additional specific styles. One "span" within "p" has bold font-weight, another "span" beside "p" has red color, and the "img" has float set to right. A nested "span" within the first "span" adds display set to none.]

Since there are many similarities between the CSSOM and the DOM, you might think the CSSOM can be used incrementally like the DOM. However, it cannot, because of the cascade: a rule that appears later, or one with higher specificity, can override a rule the browser has already seen. Rendering with a partial CSSOM could show the wrong styles, so the browser waits until all render-blocking style sheets have been loaded and parsed.

This is why CSS blocks rendering; until all CSS is parsed and the CSSOM is built, the browser can't determine where and how to position each element on the screen.
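The effect of late-arriving rules can be shown with a toy cascade. This is a sketch, not a real CSS engine: computeStyle and the rule objects are hypothetical, only single type, class, and id selectors are understood, and specificity is reduced to the rough weights 1, 10, and 100.

```javascript
// Rough specificity weight for a single simple selector.
function specificity(selector) {
  if (selector.startsWith('#')) return 100;
  if (selector.startsWith('.')) return 10;
  return 1;
}

function matches(element, selector) {
  if (selector.startsWith('#')) return element.id === selector.slice(1);
  if (selector.startsWith('.')) return element.classes.includes(selector.slice(1));
  return element.tag === selector;
}

// Rules are given in source order. Higher specificity wins; on a tie the later rule wins.
function computeStyle(element, rules) {
  const winners = {}; // property -> { value, weight }
  for (const rule of rules) {
    if (!matches(element, rule.selector)) continue;
    const weight = specificity(rule.selector);
    for (const [property, value] of Object.entries(rule.declarations)) {
      const current = winners[property];
      if (!current || weight >= current.weight) {
        winners[property] = { value, weight };
      }
    }
  }
  const style = {};
  for (const [property, winner] of Object.entries(winners)) style[property] = winner.value;
  return style;
}

const paragraph = { tag: 'p', id: '', classes: ['note'] };
const firstSheet = [
  { selector: 'p', declarations: { color: 'black', 'font-size': '16px' } },
  { selector: '.note', declarations: { color: 'red' } },
];
const secondSheet = [
  { selector: 'p', declarations: { 'font-size': '20px' } },
  { selector: '.note', declarations: { color: 'green' } },
];

const partialStyle = computeStyle(paragraph, firstSheet);
const fullStyle = computeStyle(paragraph, [...firstSheet, ...secondSheet]);
console.log(partialStyle, fullStyle);
```

If the browser painted after only the first sheet, the paragraph would briefly be red at 16px and then switch to green at 20px once the second sheet arrived. Waiting for the full CSSOM avoids that flash.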

3) Executing the Linked JavaScript Files:

As we discussed, the execution of our JavaScript file depends on where we place it and which attribute we use with it. But how does the browser parse, compile, and execute our script?

Different browsers use different engines to handle these tasks. Here is an article by Addy Osmani, a member of the Google Chrome team, that explains how fast browsers can parse, compile, and execute JavaScript.

Also, if you want to know about the internals of the JavaScript engine, you can visit the fantastic series of articles on dev.to.

After the browser has finished parsing the HTML, constructing the DOM tree, and executing all synchronous and deferred scripts, the DOMContentLoaded event is fired. It does not wait for images or async scripts.

For any scripts that need to access the DOM, such as manipulating it or listening for user interaction events, it is a good practice to wait for this event before running the scripts.

document.addEventListener('DOMContentLoaded', (event) => {
    // You can now safely access the DOM
});

Once everything else, such as stylesheets, images, and async scripts, has finished loading, the load event is fired on the window object.

window.addEventListener('load', (event) => {
    // The page has now fully loaded
});
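One practical detail: a listener for DOMContentLoaded that is added after the event has already fired will never run. A common workaround, sketched here with a hypothetical helper named onDomReady, is to check document.readyState first. The document is passed in as a parameter so the helper can also be tried outside a browser.

```javascript
// Runs the callback once the document has been parsed, even if the
// DOMContentLoaded event has already fired by the time this is called.
function onDomReady(doc, callback) {
  if (doc.readyState === 'loading') {
    // Still parsing: wait for the event.
    doc.addEventListener('DOMContentLoaded', callback, { once: true });
  } else {
    // 'interactive' or 'complete': parsing is done, run right away.
    callback();
  }
}

if (typeof document !== 'undefined') {
  onDomReady(document, () => {
    // You can now safely access the DOM
  });
}
```

This is mostly useful for scripts loaded with async or injected dynamically, because they cannot know whether they run before or after the event.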

4) Construct the render tree using DOM and CSSOM

Once the DOM and CSSOM trees exist, the browser combines them into the render tree. The render tree contains only the nodes that will actually appear on the screen: it leaves out non-visual elements such as <meta>, <script>, and <link>, as well as any element styled with display: none. For each visible node, the browser finds the matching CSSOM rules and applies them.
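A minimal sketch of that filtering step, assuming each DOM node already carries its resolved style as a plain object. The node shape and the function buildRenderTree are invented for illustration, and real engines track far more than this.

```javascript
// Elements that never produce boxes on screen.
const NON_VISUAL_TAGS = new Set(['head', 'meta', 'script', 'link']);

// Returns a render-tree node, or null if the DOM node is not rendered.
function buildRenderTree(node) {
  if (NON_VISUAL_TAGS.has(node.tag)) return null;
  const style = node.style || {};
  if (style.display === 'none') return null; // drops the whole subtree
  const children = (node.children || []).map(buildRenderTree).filter(Boolean);
  return { tag: node.tag, style, children };
}

const domTree = {
  tag: 'html',
  children: [
    { tag: 'head', children: [{ tag: 'meta' }, { tag: 'link' }] },
    {
      tag: 'body',
      children: [
        { tag: 'p', style: { color: 'red' } },
        { tag: 'span', style: { display: 'none' }, children: [{ tag: 'b' }] },
        { tag: 'script' },
      ],
    },
  ],
};

const renderTree = buildRenderTree(domTree);
console.log(JSON.stringify(renderTree, null, 2));
```

Only html, body, and p survive. The hidden span takes its child b out of the render tree with it, which is why display: none elements cost nothing in layout and paint.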

5) Calculate the Layout of the Page

Once the render tree is complete, the browser knows what to render but not where to render it. It therefore needs to calculate the layout of the page, in other words the position and size of every node in the render tree. The rendering engine traverses the render tree from the root node to the leaf nodes, calculating the coordinates where each node should be displayed.

Reflow happens when the size or position of elements has to be recalculated after the initial layout.

The first time sizes and positions are calculated, it's called "layout."
Any calculations after that are called "reflows."

For example, if a webpage has an image without specified dimensions, the initial layout happens before the image loads. Once the image's actual size is known, a reflow adjusts the layout. Specifying the width and height of the image lets the browser reserve the space up front.
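The layout pass and the image reflow can be modeled with a very small block layout, in which boxes take the full available width and stack vertically. This is only a sketch: the function layout and the node shape are hypothetical, and real layout also handles inline content, floats, flexbox, grid, and much more.

```javascript
// Walks the tree from the root and assigns x, y, width, and height to each box.
function layout(node, x, y, width) {
  const box = { name: node.name, x, y, width, height: 0, children: [] };
  let cursorY = y;
  for (const child of node.children || []) {
    const childBox = layout(child, x, cursorY, width);
    box.children.push(childBox);
    cursorY += childBox.height; // the next sibling starts below this one
  }
  // A leaf uses its own height; a container is as tall as its stacked children.
  box.height = box.children.length ? cursorY - y : node.height || 0;
  return box;
}

const image = { name: 'img', height: 0 }; // dimensions unknown before loading
const page = {
  name: 'body',
  children: [{ name: 'h1', height: 40 }, image, { name: 'p', height: 60 }],
};

const initialLayout = layout(page, 0, 0, 800);

image.height = 300; // the image has loaded and its real size is now known
const reflowedLayout = layout(page, 0, 0, 800);

console.log(initialLayout.children[2].y, reflowedLayout.children[2].y);
```

In the initial layout the paragraph starts at y = 40, and after the reflow it starts at y = 340. That jump is the kind of shift that declaring image dimensions in advance avoids.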

6) Paint the pixels of the elements in memory

After the layout calculation is complete, the next step is to fill in the pixels of each element with the appropriate colors, producing an image of the page to be displayed on the screen. This involves drawing text, colors, images, borders, shadows, and every other visual aspect of the elements.

This needs to happen really fast so that scrolling and animations look smooth: on a typical 60 Hz display, the browser has roughly 16.7 ms (1000 ms / 60) to produce each frame.

Repaint occurs when a change affects how an element looks but not its size or position. Examples are changes to visibility, background color, or outline.

To ensure repainting can be done even faster than the initial paint, the drawing to the screen is generally broken down into several layers. If this occurs, then compositing is necessary.

Certain elements automatically create layers, like <video> and <canvas>.
Some CSS properties also create layers, such as opacity, 3D transforms, and will-change.
When an element creates a layer, its descendant elements are usually included unless they need their own layer.

Caution with Layers:

While layers can improve performance, they use a lot of memory.
It's not good to overuse layers when trying to optimize web performance.

This painting process happens in memory, not directly on the screen. It's like creating a digital canvas for each element. The browser may create multiple layers in memory for different parts of the page, especially for elements that might need frequent updates or animations.
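As a rough guide to which later stages a style change re-runs, the sketch below maps a few CSS properties to the earliest stage they usually affect. The table and the function stagesTriggeredBy are illustrative assumptions: the exact behavior differs between browser engines and depends on whether the element has its own layer.

```javascript
const STAGES = ['layout', 'paint', 'composite'];

// Earliest pipeline stage a change to the property typically invalidates.
const EARLIEST_STAGE = new Map([
  ['width', 'layout'],
  ['height', 'layout'],
  ['margin', 'layout'],
  ['top', 'layout'],
  ['color', 'paint'],
  ['background-color', 'paint'],
  ['outline', 'paint'],
  ['visibility', 'paint'],
  ['transform', 'composite'],
  ['opacity', 'composite'],
]);

// Returns the stages that have to run again after the property changes.
function stagesTriggeredBy(property) {
  const first = EARLIEST_STAGE.get(property);
  if (!first) return [...STAGES]; // unknown property: assume the full pipeline
  return STAGES.slice(STAGES.indexOf(first));
}

console.log(stagesTriggeredBy('width'));
console.log(stagesTriggeredBy('background-color'));
console.log(stagesTriggeredBy('transform'));
```

This is why animations are often built on transform and opacity: when the element sits on its own layer, the browser can usually skip layout and paint and only composite.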

7) Composite the pixels if any of them overlap

As we discussed earlier, the painting of render tree elements is done on multiple layers. Compositing arranges these layers in the correct order and blends them to create the final image you see.

Compositing determines how these layers stack on top of each other and how they blend where they overlap. The correct stacking order is crucial for both the functionality and visual design of the webpage.

For example, a simple social media post with these layers:

Background color
Post text
User profile picture
"Like" button

Compositing ensures:

The background appears behind everything
The text is visible on the background
The profile picture doesn't cover important text
The "Like" button stays on top, always clickable

Without compositing, elements might overlap incorrectly:

The profile picture could hide the text
The "Like" button might be underneath other elements, making it unclickable

Compositing arranges these layers correctly, so each element is visible and functional as intended. It's like properly arranging transparent sheets in a stack, each with a part of the image, to create the complete, correct picture.
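The stacking and blending can be sketched for a single pixel. This example assumes each layer contributes one RGBA color at that pixel and uses the standard "source over" blend; the layer names, the z values, and the function compositePixel are made up for illustration. Real compositors work on whole textures, usually on the GPU.

```javascript
// Blend one RGBA color over an opaque RGB backdrop ("source over").
function over(top, bottom) {
  const mix = (t, b) => Math.round(t * top.a + b * (1 - top.a));
  return { r: mix(top.r, bottom.r), g: mix(top.g, bottom.g), b: mix(top.b, bottom.b) };
}

// Stack layers from lowest to highest z and blend them in that order.
function compositePixel(layers, backdrop) {
  return [...layers]
    .sort((p, q) => p.z - q.z)
    .reduce((below, layer) => over(layer.color, below), backdrop);
}

const white = { r: 255, g: 255, b: 255 };
const postLayers = [
  { name: 'like button', z: 2, color: { r: 255, g: 0, b: 0, a: 0.2 } },
  { name: 'background', z: 0, color: { r: 0, g: 0, b: 255, a: 1 } },
];

const pixel = compositePixel(postLayers, white);
console.log(pixel);
```

The order in the array does not matter, only the z values do. If the button were given a z below the background, the opaque background would cover it completely, which is the "hidden element" failure described above.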

8) Physically draw all the resulting pixels to the screen

At this point, the browser has a complete bitmap of what the page should look like, but it's still just in the computer's memory.

The final step is to take this bitmap and send it to the graphics hardware (like your computer's GPU) to physically illuminate the pixels on your screen.

Remember, understanding the rendering process is crucial for creating great web pages.