Priority Queue (Heap) Explained: Applications and Coding Challenges

Nguyen Quoc Bao
8 min read

“We use operating systems every day—on our laptops, phones, and even embedded devices. But with limited CPU and memory, how does a computer efficiently manage multiple tasks at once? The OS must decide which process runs first.”

Imagine you’re gaming online, streaming cat videos, and fixing bugs at the same time. Not all tasks are equally important—your video call needs real-time processing, while the file download can run in the background. This is where the priority queue (heap) comes in! It helps the OS schedule tasks efficiently by ensuring that higher-priority processes execute first while lower-priority tasks wait for their turn.

But priority queues aren’t just useful in OS scheduling—they’re everywhere! From Dijkstra’s shortest path algorithm to data compression (Huffman coding) and even real-time systems, priority queues play a crucial role in optimizing performance.

In this article, we’ll break down how priority queues work, explore their real-world applications, and solve some simple yet interesting problems using them. Let’s dive in! 🚀

1/ Definition

Imagine you’re at a hospital emergency room. Patients aren’t treated on a first-come, first-served basis—instead, those with critical conditions are prioritized over those with minor injuries. This is exactly how a priority queue works!

A priority queue is a data structure where each element has a priority, and the element with the highest (or lowest) priority is processed first. It’s commonly implemented using a heap. For example:

• In CPU scheduling, high-priority processes run before lower-priority ones.

• In Dijkstra’s shortest path algorithm, the closest node is always processed first.

• In Huffman coding, the least frequent characters get merged first.

Now, let’s dive deeper into how priority queues work!

import heapq

# Step 1: Create an empty heap (priority queue)
# (By default, heapq implements a min-heap, meaning the smallest value is always at the root)
heap = []

# Step 2: Push elements into the heap (The first element of the tuple is the priority)
# Smaller numbers have higher priority in a min-heap.

# Push (3, "Task A") -> Heap: [(3, "Task A")]
heapq.heappush(heap, (3, "Task A"))
#       (3, "Task A")

# Push (1, "Task B") -> Heap: [(1, "Task B"), (3, "Task A")]
heapq.heappush(heap, (1, "Task B"))
#       (1, "Task B")
#      /
# (3, "Task A")

# Push (4, "Task C") -> Heap: [(1, "Task B"), (3, "Task A"), (4, "Task C")]
heapq.heappush(heap, (4, "Task C"))
#       (1, "Task B")
#      /            \
# (3, "Task A")   (4, "Task C")

# Push (2, "Task D") -> Heap: [(1, "Task B"), (2, "Task D"), (4, "Task C"), (3, "Task A")]
heapq.heappush(heap, (2, "Task D"))
#       (1, "Task B")
#      /            \
#  (2, "Task D")   (4, "Task C")
#     /
# (3, "Task A")

# Now the heap is built and ready for processing the highest priority task (lowest number).

In a min-heap, the smallest value is always stored at the top (the root), and each parent node holds a value no larger than its children's. This structure allows efficient retrieval of the minimum value. However, a heap is not a fully sorted list; rather, it is a tree in which only the heap property is maintained: each parent is smaller than (or equal to) its children.

When a new node is pushed into the heap, it is initially placed at the end of the tree. The heap then adjusts itself to maintain the heap property: the new node is compared with its parent and swapped if it is smaller. This compare-and-swap process continues up the tree until the heap property is restored, ensuring that the smallest value stays at the top.
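To make the push step concrete, here is a minimal, illustrative sift-up in plain Python. This is only a sketch of the idea, not heapq's actual implementation; in the array representation of the tree, the parent of index i lives at index (i - 1) // 2.

def sift_up_push(heap, item):
    # Append the item at the end of the tree, then bubble it up
    # until its parent is no larger (illustrative sketch only).
    heap.append(item)
    i = len(heap) - 1
    while i > 0:
        parent = (i - 1) // 2          # parent index in the array representation
        if heap[i] < heap[parent]:     # new node is smaller than its parent: swap
            heap[i], heap[parent] = heap[parent], heap[i]
            i = parent                 # keep climbing
        else:
            break                      # heap property restored

heap = []
for item in [(3, "Task A"), (1, "Task B"), (4, "Task C"), (2, "Task D")]:
    sift_up_push(heap, item)
print(heap)  # [(1, 'Task B'), (2, 'Task D'), (4, 'Task C'), (3, 'Task A')]

Popping works in the opposite direction, as the next example shows.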

import heapq

heap = [(1, "Task B"), (2, "Task D"), (4, "Task C"), (3, "Task A")]
heapq.heapify(heap)

# Pop the first element (Task B with priority 1)
priority, task = heapq.heappop(heap)
print(f"Processing {task} with priority {priority}")
#       (2, "Task D")
#      /            \
#  (3, "Task A")   (4, "Task C")

# Pop the next element (Task D with priority 2)
priority, task = heapq.heappop(heap)
print(f"Processing {task} with priority {priority}")
#       (3, "Task A")
#      /
#  (4, "Task C")

# Pop the next element (Task A with priority 3)
priority, task = heapq.heappop(heap)
print(f"Processing {task} with priority {priority}")
#       (4, "Task C")

# Pop the last element (Task C with priority 4)
priority, task = heapq.heappop(heap)
print(f"Processing {task} with priority {priority}")
# (Heap is now empty)

When popping an element from a min-heap, the smallest element (root node) is always removed first. After removal, the heap must maintain the heap property, so the last element in the heap takes the root’s place temporarily.

Then, the heap reorganizes itself: the new root is repeatedly swapped with the smaller of its two children until the heap property is restored.

This ensures that after every pop, the next smallest element becomes the new root, keeping the heap structure intact.
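Here is a minimal sketch of that pop step, again only to illustrate the mechanics rather than mirror heapq's internals; the children of index i live at indices 2*i + 1 and 2*i + 2.

def sift_down_pop(heap):
    # Remove and return the root, then restore the heap property (illustrative sketch only).
    if len(heap) == 1:
        return heap.pop()
    root = heap[0]
    heap[0] = heap.pop()               # the last element temporarily takes the root's place
    i, n = 0, len(heap)
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        smallest = i
        if left < n and heap[left] < heap[smallest]:
            smallest = left
        if right < n and heap[right] < heap[smallest]:
            smallest = right
        if smallest == i:              # both children are larger: heap property restored
            return root
        heap[i], heap[smallest] = heap[smallest], heap[i]  # swap with the smaller child
        i = smallest

heap = [(1, "Task B"), (2, "Task D"), (4, "Task C"), (3, "Task A")]
while heap:
    priority, task = sift_down_pop(heap)
    print(f"Processing {task} with priority {priority}")  # B, D, A, C, in that order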

2/ Coding Problems

  1. Maintain a stream of numbers and efficiently return the kth largest element at any time.

Example:

KthLargest(3, [4,5,8,2]);

add(3); // Returns 4

add(5); // Returns 5

The problem requires maintaining a dynamic stream of numbers and efficiently retrieving the kth largest element at any time. Instead of re-sorting the list in descending order each time a new element is added, we can use a heap. Below we try two approaches: a max-heap over all elements and, more efficiently, a min-heap of size k.

First Intuition: The heap’s root always holds the smallest (or largest) element. We leverage this property to solve the problem.

First Approach: Max-Heap

Second Intuition: A max-heap keeps the largest element at the root, then the second largest, then the third, and so on. What if we pop the heap k - 1 times to reach the result?

  • Heapify the array, then pop the heap k - 1 times until reaching the kth element.

  • Push the popped elements back into the max-heap after retrieving the kth element.

  • Time Complexity

    Heap push: O(log n)

    Initial heap construction: O(n)

    Each add(val) operation: O(k log n), i.e. O(n log n) in the worst case

import heapq

class KthLargest:
    def __init__(self, k, nums):
        self.k = k
        # Simulate a max-heap with a min-heap of negated values:
        # the root of this min-heap corresponds to the largest original value.
        self.max_heap = [-num for num in nums]
        heapq.heapify(self.max_heap)

    def add(self, val):
        heapq.heappush(self.max_heap, -val)

        # Pop k - 1 elements so that the kth largest value sits at the root
        temp = []
        for _ in range(self.k - 1):
            temp.append(heapq.heappop(self.max_heap))

        # The root is now the kth smallest negated value,
        # i.e. the kth largest original value; negate it back
        kth_largest = -heapq.heappop(self.max_heap)
        temp.append(-kth_largest)

        # Push every popped element back so the heap stays intact for the next call
        for num in temp:
            heapq.heappush(self.max_heap, num)

        return kth_largest

kthLargest = KthLargest(3, [4, 5, 8, 2])
print(kthLargest.add(3)) # returns 4
print(kthLargest.add(5)) # returns 5

Second Approach: Min-Heap

Third Intuition: In a min-heap, the largest elements sit toward the bottom, then the second largest, and so on. What if we keep the heap at size k, so that the kth largest element sits right at the top?

• Store only the k largest elements seen so far.

• Pop the smallest element whenever the length of the min-heap exceeds k

  • Time Complexity

    Heap Push: O(log k)

Each add(val) operation: O(log k), so processing n elements costs O(n log k)

import heapq

class KthLargestElement:
    def __init__(self, k, nums):
        self.k = k
        self.min_heap = []
        for num in nums:
            self.add(num)

    def add(self, num):
        heapq.heappush(self.min_heap, num)
        # Keep only the k largest elements seen so far:
        # once the heap grows past k, its smallest element is discarded,
        # so the kth largest element always sits at the root.
        if len(self.min_heap) > self.k:
            heapq.heappop(self.min_heap)
        return self.min_heap[0]

kthLargest = KthLargestElement(3, [4, 5, 8, 2])
print(kthLargest.add(3)) # returns 4
print(kthLargest.add(5)) # returns 5
  2. Given a list of points on a 2D plane, return the k closest points to the origin (0,0).

Example: Input: [[1,3],[-2,2]], k=1, Output: [[-2,2]]

First Intuition: In a min-heap, the smallest element is at the top. What if we push every point with its distance into a min-heap, then extract the k points with the smallest distances?

The problem asks for the k closest points to the origin (0,0) given a list of points on a 2D plane. The goal is to return the k points with the smallest distances efficiently, using either the Euclidean or the Manhattan distance between a point and the origin (the implementations below use the Manhattan distance).

First Approach: Min-Heap

  • Calculate the Manhattan Distance between a point and the origin

  • Push the (distance, point) tuple to the min-heap

  • Extract the first k elements

  • Time Complexity: O(n log n)

import heapq

def calc_distance(x, y):
    return abs(x) + abs(y)  # Manhattan distance

def solve(points, k):
    distance_heap = []
    for point in points:
        d = calc_distance(point[0], point[1])
        heapq.heappush(distance_heap, (d, point))

    return [heapq.heappop(distance_heap)[1] for _ in range(k)]

points = [[1,4],[1,-1],[2,-2]]
k = 2
print(solve(points,k)) # Output: [[1, -1], [2, -2]]

Second Approach: Max-Heap

Second Intuition: We only need the k smallest distances, but in the first implementation pushing a new element into the heap costs O(log n). How can we reduce that to O(log k)? Can we keep the heap at length k?

  • Calculate the Manhattan distance between a point and the origin and negate it for max-heap behaviour.

  • Push the (distance, point) tuple to the max-heap

  • If the length of the max-heap exceeds k, pop the farthest point

  • Time Complexity: O(n log k)

import heapq

def calc_distance(x, y):
    return abs(x) + abs(y)  # Manhattan distance

def solve(points, k):
    max_heap = []

    for point in points:
        d = -calc_distance(point[0], point[1])  
        heapq.heappush(max_heap, (d, point))

        if len(max_heap) > k:
            heapq.heappop(max_heap)  # Remove farthest point

    return [point for (_, point) in max_heap]


points = [[1, 4], [1, -1], [2, -2]]
k = 2
print(solve(points, k))  # Output: [[2, -2], [1, -1]]

3/ Conclusion

Heaps are a powerful and versatile data structure, offering efficient solutions for priority queue operations, sorting, and graph algorithms like Dijkstra’s shortest path. The key takeaway is the O(log n) cost of heap insertions and deletions, while heapifying an entire array takes only O(n) thanks to the bottom-up construction.
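As a quick illustration of that graph use case, here is a minimal Dijkstra sketch built on heapq; the small weighted graph is made up purely for the example.

import heapq

def dijkstra(graph, source):
    # Shortest distances from source; the heap always yields the closest unsettled node first.
    dist = {source: 0}
    heap = [(0, source)]                        # (distance so far, node)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):    # stale entry: a shorter path was already found
            continue
        for neighbor, weight in graph.get(node, []):
            new_d = d + weight
            if new_d < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_d
                heapq.heappush(heap, (new_d, neighbor))
    return dist

# Hypothetical weighted graph, just to show the shape of the input
graph = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 2), ("D", 5)],
    "C": [("D", 1)],
}
print(dijkstra(graph, "A"))  # {'A': 0, 'B': 1, 'C': 3, 'D': 4}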

By using Python’s heapq module, we can efficiently work with min-heaps (and simulate max-heaps by negating values) to solve problems like finding the kth largest or smallest element in O(n log k) time. Mastering heaps not only improves problem-solving skills but also lays the foundation for advanced algorithms that rely on efficient priority management.

Keep practicing with real-world applications like task scheduling, event processing, and memory management, and soon, heaps will become second nature in your coding arsenal! 🚀
