The 'Diffing' Algorithm Explained

Tito AdeoyeTito Adeoye
9 min read

Ever been in an interview and heard the question, 'What are the algorithms React is built on?' only to feel your heart rate spike and question your life choices? Well, don’t worry! Today, we’re diving into one of them: the Diffing Algorithm.

What is the Diffing Algorithm?

Before we talk about the Diffing Algorithm, let's start with the foundation: the Virtual DOM. We know the DOM, the Document Object Model, is a tree-like structure with nodes that denote HTML elements and how they are related to each other. You can access and update the DOM and its elements using Javascript. For instance, you can change the color of this article's title to red by accessing it via the DOM, (the document object), and updating it's color.

document.getElementById('title').style.color = 'red';

You can imagine the Virtual DOM as a lightweight copy of the actual DOM. It acts as a blueprint that React uses to make updates more efficient. Instead of changing the DOM directly which can be a resource-intensive and slow process, React updates the Virtual DOM first and 'reconciliates' this update with the DOM. It performs this reconciliation process using the Diffing Algorithm. The reconciliation process is basically syncing your virtual DOM copy with the real DOM.

This algorithm is used by React to compare this updated version with the previous one, identify only the changes that need to be made, and then update the real DOM in the most efficient way possible.

React assumes that:

  • if two elements are of different types, they will produce different trees (DOM structures). It will then completely replace the old DOM subtree with the new one

  • if two elements are of the same type, they will produces similar trees. It will then compare their children.

How does React even determine if they are the same or different? By comparing both of their root elements. If they are of different types, it takes a drastic approach and completely 'tears down' the old DOM tree and rebuilds a new one from scratch using the Virtual DOM. This is because React assumes that different root elements mean the structure and purpose of the components have changed significantly, making it more efficient to start afresh rather than attempt to reconcile the differences.

If you've seen a visual representation of the DOM structure, you would understand why this makes sense. The DOM not only allows access to its child nodes (elements) for updates, but it also provides a clear view of the hierarchy of the tree-the relationships between nodes, from parent to child. This structure starts from the top, the root element. If that changes, it fundamentally alters all the relationships underneath it. In this context, a complete replacement of the subtree is the most logical and efficient approach.

It unmounts the old subtree via componentWillUnmount(), effectively removing all the elements associated with the old root from the real DOM.

After discarding the old subtree (as well as any state associated with it), React constructs a new DOM subtree from the updated Virtual DOM, and creates new elements corresponding to the new root and its children. These new elements are then inserted into the real DOM, replacing the old subtree entirely.

An example!

Let's say we have a component that initially renders a main root/parent element that is replaced with a section.

// before change
<div id="root">
  <header>
    <h1>Title</h1>
  </header>
  <main>
    <p>Content</p>
  </main>
  <footer>
    <p>Footer content</p>
  </footer>
</div>

// after change
<div id="root">
  <header>
    <h1>Title</h1>
  </header>
  <section>
    <!-- new content or structure here -->
  </section>
  <footer>
    <p>Footer content</p>
  </footer>
</div>

React will detect that the root elements have changed. As a result:

  • It will completely remove the <main> and its children from the DOM (i.e. the subtree).

  • And then create a new subtree with the <section> element as the root, based on the updated virtual DOM, and insert it into the DOM.

Same DOM elements

When the root elements are the same, however, React compares their attributes (props). If they differ, the DOM is updated accordingly. Here, React modifies only the className on this DOM node.

<div className="before" />

<div className="after" />

React also recursively compares the elements' children.

If you had an unordered list of names, and want to add another, React will match the first two items, <li>David</li> <li>Solomon</li> , of both trees, and then insert the <li>Damon</li> node.

// before change
<ul>
  <li>David</li>
  <li>Solomon</li>
</ul>

// after change
<ul>
  <li>David</li>
  <li>Solomon</li>
  <li>Damon</li>
</ul>

This only works if insertion occurs at the end of the list. If the new addition is at the top, matching both trees would not happen. React would, instead, mutate every child, not knowing both subtrees have similar child elements. This can cause worse performance.

A way to ensure efficient reconciliation is by using keys.

You ever been mapping over an array of data to render the items in the UI and you get this error in your console: 'Each child in a list should have a unique “key” prop'?

React uses this special key prop to identify elements, matching them between the DOM and the virtual DOM trees, and efficiently manage updates and reordering of lists. If the key remains the same between renders, React assumes the element has not changed. This allows React to reorder, insert, or delete elements efficiently without mutating and re-rendering the entire list.

// before change
<ul>
  <li key='0'>David</li>
  <li key='1'>Solomon</li>
</ul>

// after change
<ul>
  <li key='0'>David</li>
  <li key='1'>Solomon</li>
  <li key='2'>Damon</li>
</ul>

Tacking on a key to every item is not that hard. You can hash a part of your item, use it's own id property or, use the map method's index parameter (as long as the list is not going to be reordered). The key takeaway, pun intended😜, is to make sure your keys are "stable, predictable, and unique".

So next time you get this error and roll your eyes or ignore it, now you know why you shouldn't😁

Browsers without the Virtual DOM

Without a virtual DOM or reconciliation mechanism like React’s, updating the DOM directly can be inefficient and costly in terms of performance. Direct manipulation can lead to performance issues, especially with large or complex UIs. Every time you change the DOM, the browser may need to recalculate styles, reflow the layout, and repaint parts of the page. This can lead to slow performance and jank, especially with frequent updates.

React's targeted updates minimizes the amount of work the browser has to do, making the DOM manipulation more efficient and the user experience smoother. It also batches multiple updates into a single render cycle, reducing the number of re-renders and DOM manipulations. It does this (or primarily used to) by grouping state changes and updates that occur within event handlers or lifecycle methods, applying them in a single pass.

Reconciliation in React 18

React 18, also referred to as Concurrent React, introduces Concurrent Rendering, which fundamentally changes how reconciliation is handled. The new rendering engine allowed React to work on multiple tasks at the same time, interrupting and resuming rendering work, making the UI more responsive even during heavy updates by prioritizing urgent updates (like user input) over non-urgent ones (like background data fetching).

Concurrency is one of the key concepts of React's Fiber's architecture. You can learn more about this here. But the gist of it is that it enables React to break down rendering work into small units, called fibers, and spread this work over multiple frames, making it possible for React to pause work, prioritize tasks, and continue rendering later if necessary. The introduction of Fiber was a major evolution for React, enabling more advanced features and laying the groundwork for future improvements, such as the concurrent rendering features introduced in later versions like React 18.

You are probably wondering how this affects reconciliation.

Well, as well as introducing Concurrent Rendering, React 18 also introduced an enhanced reconciliation algorithm that takes advantage of it. This new reconciliation algorithm in React 18 divides the reconciliation work into smaller units (fibers) and prioritizes them based on their importance, allowing React to pause the reconciliation process if something more urgent comes up, like a user typing, and then resumes it later. This reduces the likelihood of blocking the main thread, leading to smoother user interactions.

React can also partially render components and then continue rendering other components in subsequent frames. This allows React to spread the work across multiple frames, preventing long render cycles that could cause the UI to freeze.

React 18 also introduced automatic batching of updates, meaning that multiple state updates, even across different components and in asynchronous operations like network requests and timeouts, are batched together in a single render pass.

I mentioned batched updates before in event handlers, but in React 18, batching is automatic and no longer restricted to updates that occur within the same event handler.

Before React 18, if you had multiple setState calls within a click event handler, React would batch these updates into a single render to optimize performance.

// before React 18
function handleClick() {
    setCount(c => c + 1);
    setCount(c => c + 1);
    setCount(c => c + 1);
}

However, if updates happened outside of event handlers, such as in asynchronous callbacks, they were not automatically batched. This meant that each update triggers a separate re-render, potentially leading to performance inefficiencies.

// before React 18; Without automatic batching for async updates:
function handleClick() {
  setTimeout(() => {
    setCount((c) => c + 1); // triggers a render
    setCount((c) => c + 1); // ttriggers another render
  }, 1000);
}

In this example, each setCount call would trigger a separate render, as React didn't batch updates in asynchronous code.

In React 18, however, automatic batching is expanded to cover not just synchronous events but also asynchronous code. This means that updates triggered in promises, setTimeout callbacks, and other asynchronous functions are now automatically batched, reducing the number of re-renders and improving performance.

So, the block of code in the setTimeout function triggers just one render.

// With automatic batching in React 18:
function handleClick() {
  setTimeout(() => {
    setCount((c) => c + 1); // updates are batched
    setCount((c) => c + 1); // still a single render
  }, 1000);
}

By batching updates, React reduces the number of reconciliation cycles, which, in turn, reduces unnecessary re-renders and makes the UI more efficient.

React 18 also introduces the startTransition API which allows developers to mark certain updates as non-urgent (transitions), allowing these updates to be handled with lower priority, rendering them in the background without blocking more urgent updates. This affects reconciliation by allowing it to be deferred. Updates marked as transitions are handled in the background, allowing React to prioritize more critical updates first.

Finally, React 18 expands the Suspense API to support asynchronous operations like data fetching. This allows components to wait for data to be ready before rendering, improving the overall user experience. With Suspense, React can delay rendering components until the necessary data is available. This prevents unnecessary renders and reconciliation work, reducing the load on the browser and improving performance.

Conclusion

React’s approach to performance and efficiency focuses on minimizing direct DOM manipulation, reducing reflows and repaints, and batching updates. By leveraging the virtual DOM and reconciliation, React helps ensure a smoother and more responsive user experience. Reconciliation in React 18 is more powerful and flexible than in previous versions, thanks to concurrent rendering and the introduction of new APIs like automatic batching, transitions, and expanded Suspense capabilities. These enhancements allow React to prioritize and defer work as needed and manage updates more efficiently, leading to a smoother, more responsive user experience.

Happy Coding!🎉

0
Subscribe to my newsletter

Read articles from Tito Adeoye directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Tito Adeoye
Tito Adeoye