How to analyze algorithm performance


Introduction
Given a problem, there may be several algorithms that solve it. However, they may differ in terms of performance: some algorithms may need more memory, others may require more time, others more instructions. It then becomes important to choose wisely the algorithm that best fits the scenario you are developing for.
How can you classify the performance of an algorithm? Mainly in 2 ways:
how much time it requires to complete (time complexity);
how much memory space it requires to complete (space complexity).
The less time an algorithm requires, the faster it is. Similarly, the less memory space it needs, the lighter it is.
But how can we model an algorithm to evaluate its complexity? The RAM model comes to our help.
The RAM model
The RAM (Random Access Machine) model rests on 2 assumptions:
each simple operation (+, *, /, -, if/else, variable assignment) takes exactly 1 step, no matter which CPU we use or which optimizations may be in place;
loops (for, while) are not simple operations: they instead group multiple single-step operations.
Given these 2 assumptions, an algorithm can be translated into a mathematical function expressing the number of steps it needs to complete.
Let’s take for example the following algorithm to find the biggest element in an array of size n:
def max_element(arr):
    # init phase
    n = len(arr)              # var assignment -> 1 step
    max_val = arr[0]          # array access + var assignment -> 2 steps
    # loop phase
    for i in range(1, n):     # loop -> var increase + var comparison: 2 steps
        if arr[i] > max_val:  # if -> var comparison: 1 step
            max_val = arr[i]  # array access + var assignment -> 2 steps
    return max_val
By using the RAM model, we can say that we need:
3 steps for the initialization phase;
5 steps within the loop phase, repeated n-1 times, which means 5n-5 steps.
We could then say that the algorithm requires (5n-5)+3 = 5n-2 steps to complete. However, is it always the case? What if the if statement does not trigger at every iteration?
Indeed, the RAM model gives different results for different inputs. The algorithm performs 5n-2 steps only in the worst-case scenario, where every iteration enters the if statement (e.g. with an input array like [1,2,3,4,5]). But in the best-case scenario, where the input array is [5,4,3,2,1], the RAM complexity reduces to 3n because the body of the if statement:
    max_val = arr[i]  # array access + var assignment -> 2 steps
is never executed (the comparison itself still runs at every iteration). And what if the input array is [3,4,5,1,2]? And what if it is [1,4,2,3,5]?
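To make the dependence on the input concrete, here is a small illustrative sketch (not part of the algorithm itself) that re-implements max_element with a step counter following the RAM costs annotated above:
def max_element_steps(arr):
    steps = 3                  # init phase: 1 step + 2 steps
    max_val = arr[0]
    for i in range(1, len(arr)):
        steps += 3             # loop bookkeeping (2 steps) + if comparison (1 step)
        if arr[i] > max_val:
            steps += 2         # array access + var assignment
            max_val = arr[i]
    return max_val, steps

print(max_element_steps([1, 2, 3, 4, 5]))  # worst case: 5n-2 = 23 steps
print(max_element_steps([5, 4, 3, 2, 1]))  # best case:  3n   = 15 steps
print(max_element_steps([3, 4, 5, 1, 2]))  # somewhere in between: 19 steps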
The RAM model provides an effective way to map an algorithm to a mathematical expression as a function of the input size. However, it loses its efficacy by requiring too many details to be specified, details which are usually not necessary to evaluate the performance of an algorithm. To smooth these rough edges, we need to take a step further and talk about the Big-O notation model.
The Big-O Notation model
The Big-O notation model is a mathematical tool used to bound the asymptotic behaviour of a function g(n) within well-known, established dominance classes.
The goal of the notation is to describe the asymptotic behaviour of a function g(n) through another function f(n) (with n representing the input size) acting as its lower bound or upper bound (or both):
g(n) = O(f(n)): there exists some constant c such that c·f(n) ≥ g(n) for all sufficiently large n (i.e. f(n) is an upper bound of g(n));
g(n) = Ω(f(n)): there exists some constant d such that g(n) ≥ d·f(n) for all sufficiently large n (i.e. f(n) is a lower bound of g(n));
g(n) = Θ(f(n)): there exist some constants c and d such that c·f(n) ≥ g(n) ≥ d·f(n) for all sufficiently large n (i.e. f(n) is both an upper and a lower bound of g(n)).
The O(f(n)) definition is the one we are interested in here, and it is usually called the order of the function (hence the name Big-O).
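As a quick illustration (a numeric check, not a proof), the worst-case step count of max_element, g(n) = 5n-2, satisfies the O(n) definition with the witness constant c = 5, since 5·n ≥ 5n-2 for every n ≥ 1:
def g(n): return 5 * n - 2    # worst-case RAM steps of max_element
def f(n): return n            # candidate upper-bound function

c = 5                         # witness constant
assert all(c * f(n) >= g(n) for n in range(1, 10_000))
print("c*f(n) >= g(n) holds for all tested n")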
Dominance classes
The Big-O notation groups functions into classes according to their most immediate upper-bound function:
O(1) (aka Constant Functions);
O(log(n)) (aka Logarithmic Functions);
O(n) (aka Linear Functions);
O(n·log(n)) (aka Superlinear Functions);
O(n²) (aka Quadratic Functions);
O(n³) (aka Cubic Functions);
O(c^n) (aka Exponential Functions);
O(n!) (aka Factorial Functions).
The relationship between these classes is the following:
O(n!) » O(c^n) » O(n³) » O(n²) » O(n·log(n)) » O(n) » O(log(n)) » O(1)
Each class dominates all the classes to its right, meaning that, looking at the asymptotic behaviour, the contribution of a dominated class is negligible compared with the most dominant one (i.e. it can be ignored).
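To get a feel for how quickly these classes diverge, here is a small illustrative script (the exact values do not matter, only their relative order):
import math

growth = {
    "O(1)":       lambda n: 1,
    "O(log n)":   lambda n: math.log2(n),
    "O(n)":       lambda n: n,
    "O(n log n)": lambda n: n * math.log2(n),
    "O(n^2)":     lambda n: n ** 2,
    "O(n^3)":     lambda n: n ** 3,
    "O(2^n)":     lambda n: 2 ** n,
    "O(n!)":      lambda n: math.factorial(n),
}

for n in (10, 20, 30):
    print(f"n = {n}")
    for name, fn in growth.items():
        print(f"  {name:<11} {fn(n):.2e}")  # each class dwarfs the ones before it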
Ignoring the weaker dominance classes is a key concept that allows us to classify any function into one of these classes. Let's take the function g(n) = 3n³ + n² + 9 as an example and compute its order. The function has 3 main contributions:
3n³, whose order is O(n³);
n², whose order is O(n²);
9, whose order is O(1).
Since O(n³) » O(n²) » O(1), we can say that g(n) = O(n³). Therefore, g(n) is a Cubic function.
Use the Big-O notation with algorithms
The Big-O notation can be applied to the mathematical expressions provided by the RAM model to get a rough estimation of their asymptotic behaviour, which also translates into an estimation of the algorithm's time complexity as the input size increases.
Going back to the max_element algorithm, we can say that:
the worst-case complexity is O(n), since the function g(n) = 5n-2 = O(n);
the best-case complexity is also O(n), since the function g(n) = 3n = O(n);
since both the worst-case and the best-case complexities are O(n), the average-case complexity is O(n) as well.
We can therefore classify the max_element algorithm as having O(n) time complexity, meaning that the time required by the algorithm to complete grows linearly with the input size.
Note: an algorithm does not always share the same complexity across all cases. For this reason, when analyzing algorithms, the worst-case complexity is usually the one taken into consideration.
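A rough empirical check of this linear growth (timings are machine-dependent and only indicative; this assumes the max_element function defined above): doubling the input size should roughly double the running time:
import random
import time

for n in (1_000_000, 2_000_000, 4_000_000):
    arr = [random.random() for _ in range(n)]
    start = time.perf_counter()
    max_element(arr)                        # the O(n) algorithm defined above
    elapsed = time.perf_counter() - start
    print(f"n = {n:>9,}: {elapsed * 1000:6.1f} ms")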
Logarithmic algorithms
A special class of algorithms is composed of all the algorithms whose time complexity grows logarithmically with the input size. The reason is simple: such algorithms are among the quickest ones, especially when dealing with very large data pools.
Let's make a very simple calculation. Assuming a 2GHz CPU on which each instruction requires exactly 1 clock cycle, a single instruction takes ~0.5ns. What happens if we increase the input size?

| n | O(log(n)) | O(n) |
| --- | --- | --- |
| 10² | ~3ns | ~50ns |
| 10⁹ | ~15ns | ~0.5s |
| 10¹⁵ | ~25ns | ~6 days |
It becomes evident how such algorithms become crucial when high performance is required. The binary search algorithm, sketched below, is a classic example of an O(log(n)) algorithm.
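Here is a minimal sketch of binary search over a sorted array: each iteration halves the search interval, so at most ~log₂(n) iterations are needed:
def binary_search(arr, target):
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2    # pick the middle of the current interval
        if arr[mid] == target:
            return mid          # found: return its index
        elif arr[mid] < target:
            lo = mid + 1        # discard the left half
        else:
            hi = mid - 1        # discard the right half
    return -1                   # target is not in the array

print(binary_search([1, 3, 5, 7, 9, 11], 7))  # -> 3
print(binary_search([1, 3, 5, 7, 9, 11], 4))  # -> -1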
Example: Merge sort
Let's evaluate the complexity of merge sort, one of the most widely used sorting algorithms:
def merge_sort(array, start, end):
    # phase 1: splitting O(log(n))
    n = end - start + 1
    if n > 1:
        mid = start + (n // 2)
        merge_sort(array, start, mid - 1)
        merge_sort(array, mid, end)
        # phase 2: merging O(n)
        merge(array, start, mid, end)

def merge(array, start, mid, end):
    left = array[start:mid]
    right = array[mid:end+1]
    l = r = 0
    a = start
    while l < len(left) or r < len(right):
        if l == len(left):
            array[a] = right[r]
            r += 1
        elif r == len(right):
            array[a] = left[l]
            l += 1
        elif left[l] < right[r]:
            array[a] = left[l]
            l += 1
        else:
            array[a] = right[r]
            r += 1
        a += 1
    return array
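Before analyzing it, a quick usage example (the array is sorted in place):
data = [38, 27, 43, 3, 9, 82, 10]
merge_sort(data, 0, len(data) - 1)
print(data)  # -> [3, 9, 10, 27, 38, 43, 82]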
This is a recursive algorithm which has 2 phases (Divide and Conquer):
the divide phase recursively halves the input array until each subsection has only one element;
the conquer phase merges the left and right subsections into a sorted array.
The divide phase has O(log(n)) complexity, but why?
def merge_sort(array, start, end):
    n = end - start + 1                    # 3 steps (subtract + add + assign)
    if n > 1:                              # 1 step (cmp)
        mid = start + (n // 2)             # 3 steps (divide + add + assign)
        merge_sort(array, start, mid - 1)  # log(n) recursions
        merge_sort(array, mid, end)        # log(n) recursions
The first 3 lines have a constant cost of 7 steps, no matter how big the array is. However, by increasing the size of the array we incur more recursions: more specifically, we keep recursing until the halving produces an array of size 1, i.e. we recurse exactly log(n) times, creating a binary tree from the input array.
Overall the first phase performs 2·7·log(n) steps (the factor 2 comes from the fact that merge_sort is called twice at each recursion, once for the left side and once for the right side). In Big-O notation, the function 14·log(n) belongs to the O(log(n)) class.
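As a quick sanity check (an illustrative sketch, not part of the algorithm), the recursion depth can be computed by repeatedly halving n:
def recursion_depth(n):
    d = 0
    while n > 1:     # stop when the subarray has a single element
        n //= 2      # each level halves the input size
        d += 1
    return d

for n in (8, 1024, 1_000_000):
    print(n, recursion_depth(n))  # 8 -> 3, 1024 -> 10, 1_000_000 -> 19 (~log2(n))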
The conquer phase has instead O(n) complexity:
def merge(array, start, mid, end):
    left = array[start:mid]                 # n steps (array copy, worst case)
    right = array[mid:end+1]                # n steps (array copy, worst case)
    l = r = 0                               # 1 step (assign)
    a = start                               # 1 step (assign)
    while l < len(left) or r < len(right):  # n loops (worst case)
        if l == len(left):                  # 1 step (cmp)
            array[a] = right[r]             # 2 steps (access + assign)
            r += 1                          # 1 step (add)
        elif r == len(right):               # 1 step (cmp)
            array[a] = left[l]              # 2 steps (access + assign)
            l += 1                          # 1 step (add)
        elif left[l] < right[r]:            # 3 steps (access + access + cmp)
            array[a] = left[l]              # 2 steps (access + assign)
            l += 1                          # 1 step (add)
        else:
            array[a] = right[r]             # 2 steps (access + assign)
            r += 1                          # 1 step (add)
        a += 1                              # 1 step (add)
    return array
The function performs 2n + 2 + 9n = 11n + 2 steps in the worst-case scenario, which in Big-O notation belongs to the O(n) class.
Since each merge_sort call results in a call to the merge function, we can conclude that this algorithm has O(n·log(n)) complexity:
O(log(n)) × O(n) = O(n·log(n))
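Another standard way to confirm this result (a textbook technique, not part of the analysis above) is the recurrence relation: sorting n elements costs two recursive sorts of n/2 elements plus a linear merge. Unrolling it gives the same bound:
T(n) = 2·T(n/2) + c·n
     = 4·T(n/4) + 2·c·n
     = ...
     = n·T(1) + c·n·log₂(n) = O(n·log(n))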
Conclusion
Classifying algorithms by their performance is crucial for software engineers to pick the best algorithm for a given scenario. This post explained how algorithms can be classified through the combination of the RAM model and the Big-O notation.
While the RAM model provides a way to translate any algorithm into a mathematical expression, the Big-O notation model provides a classification that quickly and effectively gives engineers an idea of how performant an algorithm is.