Tree Data Structure

ANSHUANSHU
12 min read

Introduction to Trees in C++

When working with data structures in C++, most beginners start with arrays, stacks, or queues—structures where data is stored linearly. But not all problems fit neatly into a straight line. That’s where trees, a powerful non-linear data structure, come in.

A tree organizes data in a hierarchical manner, much like a family tree or a file directory on your computer. Each piece of data is stored in a node, and these nodes are connected in a way that represents parent-child relationships.

In this article, we’ll dive deep into what trees are, why they’re important, and how to work with them in C++. We’ll explore various types of trees, their internal structure, how to implement them using both arrays and pointers, and how to traverse them using different techniques like preorder, inorder, and postorder.

Whether you're preparing for technical interviews or building efficient software systems, understanding trees is a must. Let’s get started!


Tree is a Non-Linear Data Structure

A tree is a widely used non-linear data structure that simulates a hierarchical structure. Unlike arrays, stacks, or queues—which are linear and store elements in a sequential manner—trees branch out in multiple directions.

Each element in a tree is called a node, and the structure begins with a special node called the root. Nodes can have zero or more child nodes, and there's a one-way connection (called an edge) between parent and child nodes.

Key Characteristics:

  • Non-linear: Nodes are not connected in a single line.

  • Hierarchical: Parent → Child relationships form a hierarchy.

  • No cycles: A tree does not contain loops or cycles.

Real-World Examples:

  • File systems (folders inside folders)

  • Organization charts (CEO → Managers → Employees)

  • HTML DOM Tree


Tree Consists of Root, Leaves, and Intermediate Nodes

In a tree data structure, every element is called a node, and each node plays a specific role based on its position:

Root Node

  • The topmost node in a tree.

  • It’s the starting point of the tree and has no parent.

  • Every tree has only one root.

Leaf Nodes

  • Nodes that do not have any children.

  • They are the endpoints or terminal nodes of the tree.

  • In a binary tree, a node with both left and right as NULL is a leaf.

Intermediate (Internal) Nodes

  • Nodes that are neither root nor leaves.

  • These nodes have at least one child and serve as a bridge between root and leaf nodes.

Example:

        1        ← Root
       / \
      2   3      ← Intermediate Nodes
     / \   \
    4   5   6    ← Leaf Nodes

Understanding these roles is crucial because many tree operations (like traversals, height calculation, or insertion) depend on identifying these types of nodes correctly.


Types of Trees in Data Structures

Different types of trees are used based on the problem requirements. Here are the most common ones:

1. General Tree

  • A node can have any number of children.

  • No strict rules.

  • Example: File system directory tree.

          A
        / | \
       B  C  D
          |
          E

2. Binary Tree

  • Each node has at most two children: left and right.

  • It's the most widely used type in programming and interview problems.

       1
      / \
     2   3

3. Full Binary Tree

  • Every node has either 0 or 2 children.

  • No node has only one child.

       1
      / \
     2   3
    / \ / \
   4  5 6  7

4. Complete Binary Tree

  • All levels are completely filled, except possibly the last.

  • In the last level, nodes are as left as possible.

  • Used in Heap implementation.

       1
      / \
     2   3
    / \  /
   4  5 6

5. Skewed Binary Tree

A special type of binary tree where nodes are connected in a single direction:

  • Left-Skewed: All nodes have only left children.

  • Right-Skewed: All nodes have only right children.

Left-Skewed:          Right-Skewed:
     1                     1
    /                       \
   2                         2
  /                           \
 3                             3

Binary Tree

A Binary Tree is a hierarchical data structure in which each node has at most two children, typically referred to as the left child and the right child.

This structure makes binary trees efficient for tasks like searching, sorting, and hierarchical data representation.

Characteristics:

  • Each node can have 0, 1, or 2 children.

  • The topmost node is called the root.

  • Nodes with no children are called leaf nodes.

  • Nodes with at least one child are called internal nodes.

Example:

       1
      / \
     2   3
    / \
   4   5

In this example:

  • 1 is the root node.

  • 2 and 3 are internal nodes.

  • 4 and 5 are leaf nodes.

Applications:

  • Expression trees (used in compilers)

  • Hierarchical storage (e.g., databases, file systems)

  • Binary Search Trees, Heaps, and AVL Trees are all based on binary trees.


Full Binary Tree

A Full Binary Tree is a special type of binary tree in which every node has either 0 or 2 children. In other words, no node in a full binary tree has only one child.

Characteristics:

  • Each internal node must have exactly two children.

  • Leaf nodes have no children.

  • It is a strict form of binary tree.

Example:

        1
       / \
      2   3
     / \ / \
    4  5 6  7

In this example:

  • Nodes 1, 2, and 3 have exactly two children.

  • Nodes 4, 5, 6, and 7 have no children (they are leaves).

  • So this is a valid full binary tree.

Note:

  • Not all binary trees are full.

  • A full binary tree does not require the tree to be balanced or complete—it just enforces the rule of 0 or 2 children per node.


Complete Binary Tree

A Complete Binary Tree is a binary tree in which all levels are completely filled, except possibly the last level, and the last level has all nodes as far left as possible.

This structure ensures that the tree is as compact as possible from left to right.

Characteristics:

  • Every level, except possibly the last, is fully filled.

  • The last level has nodes filled from left to right with no gaps.

Example:

        1
       / \
      2   3
     / \  /
    4  5 6

In this example:

  • Levels 0 and 1 are fully filled.

  • The last level has nodes 4, 5, and 6 filled from left to right.

  • No node appears on the right of a missing child, so it is a valid complete binary tree.

Use Cases:

  • Heap data structures (especially binary heaps) rely on the complete binary tree property.

  • Easy to represent in arrays due to their tightly packed structure.


Skewed Binary Tree

A Skewed Binary Tree is a special type of binary tree where each node has only one child. This creates a structure that resembles a linked list rather than a balanced tree.

There are two types of skewed trees:

  • Left Skewed Tree: Every node has only a left child.

  • Right Skewed Tree: Every node has only a right child.

Characteristics:

  • Nodes form a single chain, either to the left or right.

  • The height of the tree is equal to the number of nodes (no branching).

  • Inefficient for search operations as it behaves like a linked list.

Example:

Left Skewed Tree:

    1
   /
  2
 /
3

Right Skewed Tree:

1
 \
  2
   \
    3

Note:

  • Skewed trees can occur when data is inserted in sorted order into a binary search tree without balancing.

  • They degrade the performance of operations like search, insertion, and deletion.


Binary Tree Representation

Binary trees can be represented in two common ways in C++:


1. Array Representation

  • Used mostly for Complete Binary Trees or Heaps.

  • Nodes are stored in an array where the parent-child relationship is determined by indices.

Index relationship:

  • For a node at index i:

    • Left child is at 2*i + 1

    • Right child is at 2*i + 2

    • Parent is at (i - 1) / 2

Example:

If we have a binary tree:

        1
       / \
      2   3
     / \
    4   5

It can be stored as an array: [1, 2, 3, 4, 5]


2. Linked List (Pointer) Representation

  • Used for general binary trees, not necessarily complete.

  • Each node is represented by a structure/class with data and pointers to left and right children.

Example in C++:

struct TreeNode {
    int data;
    TreeNode* left;
    TreeNode* right;

    TreeNode(int val) : data(val), left(nullptr), right(nullptr) {}
};
  • This method is dynamic and allows efficient insertion and deletion.

  • Nodes are linked by pointers rather than by fixed array indices.


Which to use?

  • Use array representation for complete or nearly complete trees where you want efficient memory usage.

  • Use linked list representation for general binary trees or when the shape of the tree is irregular.


Drawbacks of Array Representation for Binary Trees

While using arrays to represent binary trees (especially complete binary trees) is straightforward and memory-efficient in some cases, it has several limitations:

  1. Wasted Space for Sparse Trees

    • If the tree is not complete or balanced, many array positions remain unused, leading to wasted memory.

    • For example, if a tree is skewed, the array size must be as large as the height of the tree, causing large gaps.

  2. Fixed Size

    • Arrays require a predefined size or resizing, which can be costly or complicated.

    • If the tree grows beyond the allocated array size, you need to create a larger array and copy elements, which is inefficient.

  3. Limited Flexibility

    • Array representation is only efficient for complete or nearly complete trees.

    • For irregular or sparse trees, pointer-based linked structures are more flexible and natural.

  4. Complex Insertions and Deletions

    • Adding or removing nodes in arbitrary positions is harder with arrays, especially if you want to maintain the complete binary tree property.

Because of these drawbacks, pointer-based linked representations are preferred for general binary trees that are not guaranteed to be complete.


Tree Traversal

Tree traversal refers to the process of visiting each node in a tree exactly once in a systematic way. Traversals are essential for searching, printing, or manipulating tree data.

There are three common types of depth-first traversals used for binary trees:


1. Inorder Traversal (Left, Root, Right)

  • Visit the left subtree first.

  • Then visit the root node.

  • Finally, visit the right subtree.

Use case:
In binary search trees (BST), inorder traversal visits nodes in sorted order.

Example:
For the tree:

       1
      / \
     2   3
    / \
   4   5

Inorder traversal: 4, 2, 5, 1, 3


2. Preorder Traversal (Root, Left, Right)

  • Visit the root node first.

  • Then traverse the left subtree.

  • Finally, traverse the right subtree.

Use case:
Used to create a copy of the tree or to get prefix expression in expression trees.

Example:
Preorder traversal: 1, 2, 4, 5, 3


3. Postorder Traversal (Left, Right, Root)

  • Visit the left subtree first.

  • Then visit the right subtree.

  • Finally, visit the root node.

Use case:
Useful for deleting trees or evaluating postfix expressions.

Example:
Postorder traversal: 4, 5, 2, 3, 1


Implementation (Recursive)

Here’s a simple C++ example for inorder traversal:

void inorder(TreeNode* root) {
    if (root == nullptr) return;
    inorder(root->left);
    cout << root->data << " ";
    inorder(root->right);
}

Similar recursive functions can be written for preorder and postorder traversals.


Explanation of Combined Tree Traversals in One Loop

Core Idea:

Instead of doing separate traversals for preorder, inorder, and postorder, we use a stack to simulate the recursive process and keep track of which traversal step we are on for each node.


Stack Structure

We use a stack that stores pairs: (node, state)

  • node: The current tree node we are visiting.

  • state: An integer (1, 2, or 3) that tracks how many times we have "visited" this node during the traversal process.


States and What They Mean

  • state = 1:
    First time we encounter the node.
    Action:

    • Visit the node for preorder traversal (since preorder is Root → Left → Right, and we visit Root first).

    • Push the left child if it exists (so we visit left subtree next).

    • Increase the state to 2 to mark that preorder is done.

  • state = 2:
    After finishing the left subtree.
    Action:

    • Visit the node for inorder traversal (which happens after left subtree).

    • Push the right child if it exists (so we visit right subtree next).

    • Increase the state to 3.

  • state = 3:
    After finishing the right subtree.
    Action:

    • Visit the node for postorder traversal (which happens after left and right subtrees).

    • Pop the node off the stack since it’s fully processed.


Walkthrough on a Small Tree Example

Consider this tree:

      1
     / \
    2   3

Initial stack: [(1,1)]

  1. Top is (1,1):

    • Preorder: Add 1 to preorder list.

    • Push left child (2,1).

    • Update (1,1) to (1,2).
      Stack: [(1,2), (2,1)]

  2. Top is (2,1):

    • Preorder: Add 2 to preorder list.

    • No left child, so no push.

    • Update (2,1) to (2,2).
      Stack: [(1,2), (2,2)]

  3. Top is (2,2):

    • Inorder: Add 2 to inorder list.

    • No right child, so no push.

    • Update (2,2) to (2,3).
      Stack: [(1,2), (2,3)]

  4. Top is (2,3):

    • Postorder: Add 2 to postorder list.

    • Pop (2,3).
      Stack: [(1,2)]

  5. Top is (1,2):

    • Inorder: Add 1 to inorder list.

    • Push right child (3,1).

    • Update (1,2) to (1,3).
      Stack: [(1,3), (3,1)]

  6. Top is (3,1):

    • Preorder: Add 3 to preorder list.

    • No left child, no push.

    • Update (3,1) to (3,2).
      Stack: [(1,3), (3,2)]

  7. Top is (3,2):

    • Inorder: Add 3 to inorder list.

    • No right child, no push.

    • Update (3,2) to (3,3).
      Stack: [(1,3), (3,3)]

  8. Top is (3,3):

    • Postorder: Add 3 to postorder list.

    • Pop (3,3).
      Stack: [(1,3)]

  9. Top is (1,3):

    • Postorder: Add 1 to postorder list.

    • Pop (1,3).
      Stack: []


Summary

  • We process each node three times, once for each traversal order, controlled by the state.

  • Using the stack and state lets us avoid recursion and get all three traversals efficiently in one pass.

  • This method is memory efficient and easy to adapt to different traversal orders.


Conclusion

Trees are a fundamental non-linear data structure that provide an efficient way to represent hierarchical data. Understanding different types of trees—such as binary trees, full binary trees, complete binary trees, and skewed trees—along with their characteristics is essential for mastering data structures.

In C++, trees can be represented using arrays or linked nodes, each with its own advantages and drawbacks. Tree traversal techniques like inorder, preorder, and postorder allow us to process and manipulate tree data effectively. Moreover, combining all three traversals into a single iterative algorithm showcases how understanding the underlying mechanics can lead to more optimized solutions.

Mastering trees and their traversals not only strengthens your grasp on data structures but also lays the foundation for solving complex problems in algorithms, databases, and real-world applications. Keep practicing, and explore advanced tree structures like binary search trees, AVL trees, and heaps to deepen your knowledge.

1
Subscribe to my newsletter

Read articles from ANSHU directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

ANSHU
ANSHU