All About Sorting Algorithms - Part 1 - Insertion Sort
In this article, we will discuss one of the main sorting algorithms that every software engineer should know: insertion sort. I will explain the algorithm based on what I have learned. I love to explain things in the simplest way (hopefully)!
I hope you will learn something from the article. Let me know what you think!
Before we begin, I want to first discuss the difference between Stable and Unstable sorting algorithms. This information is relevant to sorting algorithms.
For example, we have key-value pairs of elements in an array -
[(5, "a"), (2, "b"), (5, "c")]
Assuming we are sorting the elements by key. There are two possibilities of how they will be sorted -
# Possibility 1
[(2, "b"), (5, "a"), (5, "c")]
# Possibility 2
[(2, "b"), (5, "c"), (5, "a")]
They are both correct.
But, notice in the original array that key 5 and key 5 are a tie, and the pair (5, "a")
comes before the pair (5, "c")
, this is the original order of elements. In the first possibility, the original order is preserved because (5, "a")
comes before the pair (5, "c")
. In the second possibility, the order after sorting is (5, "c"), (5, "a")
, the original order is not preserved.
When there is a tie of values, the preservation of the original order decides the algorithm's stability. If the sorting algorithm guarantees the preservation of the original order of values, it is a stable sorting algorithm. Otherwise, it is unstable. It is generally a good practice to keep the algorithm stable.
Now let's talk about insertion sort!
Fun fact: it is not used often because it is less efficient than other sorting algorithm options. But we still ought to know.
Insertion sort is all about comparing a value with another value, swapping them if they are not sorted in ascending order.
For example, we have an array of values, and the array is named arr
(duh) -
[ 4 5 2 1 8 ]
Because we need to compare values, we can have two-pointers. Let's put them next to each other at the start of the array.
[ 4 5 2 1 8 ]
j i
Compare arr[j]
and arr[i]
, 4 <= 5, no swap is needed because they are already sorted. Now we move up the pointers j
and i
.
[ 4 5 2 1 8 ]
j i
Compare arr[j]
and arr[i]
, 5 > 2, we need to swap the values.
[ 4 2 5 1 8 ]
j i
But is one comparison sufficient here? No. Because notice how 4 is greater than 2. The portion before i
is not completely sorted. To make sure the visited portion is sorted, we need to move down j
while j
is still inbound, and compare arr[j]
to arr[j + 1]
.
[ 4 2 5 1 8 ]
j i
arr[j + 1]
= 2, 4 > 2, swap 4 and 2
[ 2 4 5 1 8 ]
j i
Now, we checked every value in the portion we iterated through. The portion we iterated through is sorted. We have no other value we need to check because j
is now out of bounds, so we continue sorting the remaining portion of the array.
[ 2 4 5 1 8 ]
j i
Compare arr[j]
and arr[i]
, 5 > 1, swap 5 and 1
[ 2 4 1 5 8 ]
j i
But is one comparison sufficient here? No. Because notice how 2 is greater than 1. The portion before i
is not sorted. To make sure the visited portion is sorted, we need to move down j
while j
is still inbound, and compare arr[j]
to arr[j + 1]
.
[ 2 4 1 5 8 ]
j i
arr[j + 1]
= 1, 4 > 1, swap 4 and 1, keep moving j
backward
[ 2 1 4 5 8 ]
j i
Compare arr[j]
to arr[j + 1]
. 2 > 1, swap 2 and 1, and keep moving j backward in case there is more value we need to check.
[ 1 2 4 5 8 ]
j i
Now, we checked every value in the portion we iterated through. The portion we iterated through is sorted. We have no other value we need to check because j
is now out of bounds, so we continue sorting the remaining portion of the array.
[ 1 2 4 5 8 ]
j i
5 <= 8, no swap is needed.
We have reached the end of the array. Our array is sorted now!
Here is the code for this algorithm -
# iterate through the array with i pointer from index 1
# to the end of the array
for i in range(1, len(arr)):
# set j at previous position next to i pointer for comparision
j = i - 1
# when swap is needed because arr[j] is greater
# than arr[j + 1], and j is in bound
while j >= 0 and arr[j] > arr[j + 1]:
# save value at j + 1 to tmp variable
tmp = arr[j + 1]
# assign value at j + 1 position to value at j
arr[j + 1] = arr[j]
# assign value at j position to tmp
arr[j] = tmp
# keep moving j pointer backward to ensure the portion
# is sorted
j -= 1
# we sorted the array in-place, so just return arr
return arr
Is it a stable sorting algorithm?
Yes. Because assuming we have a tie in the array [1 3 3 2]
, we will perform a swap when comparing the second 3 and 2. After this swap, the array becomes [1 3 2 3]
. Now we compare the first 3 and 2. We will perform a swap, and the array becomes [1 2 3 3]
. Notice how the order of the 3s is preserved.
What is the time complexity?
It is the best-case scenario when the array is already sorted. We will iterate through the array once without having to swap anything. The time complexity, in this case, is O(n)
.
It is the worst-case scenario when the array is in reverse order. For every single value we iterate through, the while loop will execute a maximum number of times. The time complexity, in this case, is O(n^2)
.
What is the space complexity?
It is O(1)
. We sorted the array in-place, we did not use extra space.
Thank you for reading!
Subscribe to my newsletter
Read articles from Lingyun Dai directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
Lingyun Dai
Lingyun Dai
It is fun to learn and collaborate with talented people.