Bubble Sort

Rick HanleyRick Hanley
6 min read

There’s a beauty to this little algorithm. I learnt a ton from it. It helped me grasp the nature of nested loops and how to be careful about accessing a loop within its bounds.I still screw up with loops all the time but this process gave me some good de-bugging insight. It’s also good because I picked up a few efficiency tricks for loops and it got me thinking about how to save time in within loops. I really need to stop typing the word loops. For a small subject that doesn’t have a wow factor, it’s actually a great learning experience. And if you’re really new to this and find this hard if not impossible - so did I not that long ago. Have a go, then put it down and come back to it if it’s not sinking in.

Bubble sort works by looping over the array and it is set up to compare adjacent numbers within the array. It checks which one is larger and swaps the numbers’ positions if required. If you could watch an array element’s progress as it is sorted you would see the larger numbers “bubble” to the right of the array, where all the higher number live. Smaller numbers make their way to the left. No numbers move more than 1 array index at a time.

The key thing to note about this is that to fully sort the array, you need to start from 0 each time the inner loop begins afresh. This is because whilst large numbers end up in the correct position after length - 1 iterations of the inner loop, small numbers don’t, but we’ll get into that.

Here’s how I’ve drilled myself to recreate the bubble sort in C:

#include <stdio.h>
#include <stdbool.h>

int main(void){

    int arr[] = {8,2,6,1,4,0};
    bool swaps = false;
    int length = sizeof(arr) / sizeof(int);
    int temp = 0;

    for(int i = 0 ;i < length; i++){
        swaps = false;
        for(int j = 0; j < length - i - 1; j++){
            if(arr[j + 1] < arr[j]){
                temp = arr[j];
                arr[j] = arr[j + 1];
                arr[j + 1] = temp;
                swaps = true;
            }
        }
        if (swaps == false){
            break;
        }
    }
    for(int i = 0; i < length; i++){
        printf("%d\n", arr[i]);
    }
    return 0;
}

Our set-up includes the following:

#include <stdio.h>
#include <stdbool.h>

int main(void){

    int arr[] = {8,2,6,1,4,0};
    int swaps = false;
    int length = sizeof(arr) / sizeof(int);
    int temp = 0;
  • Our array to play with, initialised with a few numbers.

  • A swaps variable set to false. Very useful for saving time (more on that later)

  • Being C, we need to work out the length of our array manually.

  • A temp variable so we can perform swaps.

From there we’re straight into the meat of it:

for(int i = 0 ;i < length; i++){
        swaps = false;
        for(int j = 0; j < length - i - 1; j++){
            if(arr[j + 1] < arr[j]){
                temp = arr[j];
                arr[j] = arr[j + 1];
                arr[j + 1] = temp;
                swaps = true;
            }
        }

Our nested loops are set up with the customary i and j counter variables. The outer loop needs to run through the array from start to finish length times.

The inner loop needs to run through the array length - 1 - i times.

There’s actually a lot going on in that statement and here’s what it means.

Firstly, the - 1 is needed because we use j + 1 when accessing the array inside the loop. By running the loop - 1 times we prevent j checking out of bounds indexes.

The - i part is another cool time saving measure as it prevents the inner loop checking numbers that have already been sorted. For each increase of i we know that that many elements of the array have been sorted. So therefore we don’t need to run the inner loop length - 1 times we just need to run it length - 1 - i times. If our array is length 10 and i is currently 2 it means positions 9 and 8 (zero indexing remember!) are sorted. So 10 - 2 is 8 (I’m so good at maths). Next we subtract 1 to prevent out of bounds access ( j + 1 looks ahead 1 index remember!). This all adds up to the inner loop on this specific iteration running 7 times. No more no less.

The if statement checks which array index contains the larger number. It them swaps them if required using a temp variable to facilitate.

If the if statement evaluates to true and the swapping happens, then the variable swaps is also set to true. This is also a nice time saving measure because later on we check the status of swaps. If swaps == false it,

if (swaps == false){
            break;
        }

unsurprisingly means no swaps happened and as soon as that is the case, it means the array is fully sorted. This might not immediately be obvious as to why this is the case. If you do struggle as I did, maybe write out the program for a small array and increment the loop counters and variables, writing everything on paper as you go. With notes on how everything is changing you may notice how the original unsorted array can become sorted earlier than you might expect and you can see why you’d want to break there.

Finally, take note of exactly where this if / break is, within the nested loops as if it’s in the wrong place (easily done) the code won’t work as you want.

If I’m having a hard time on some coding task I often crank out bubble sort to make me feel like I actually do know something. Often the code doesn’t work, and then I spend 10 minutes debugging, cursing, and re-learning something I’d forgotten. Then I question my life choices for a bit and get back to it. It’s a process.

Big 0

Bubble sort is not efficient for large data sets. It does masses of compares and masses of swaps. It’s like your dog that just barks at stuff. It’s just the way it is. We’ve got a few timesaving measures built in here but it’s not great for anything other than small datasets where implementing a fancy sort is overkill.

We are firmly in N² territory. If you have a 10 element array bubble sort will run the inner loop 9 times, then 8 times then 7 times, then… you get the idea. for a 10 element array there’s potential to make 45 checks and 45 swaps. And if you had a 10 element array and didn’t have the the little if / break clause in there it absolutely would make your computer sit there wasting electrons and time on checks that aren’t needed.

Bubble sort is less efficient than selection sort because it only ever moves values one index position at a time and in all likelihood moves the same number again on the next iteration in its effort to get it to the proper position in the array. Selection sort does lots of comparisons but when it moves a value, it moves with purpose and it’s done in one go. It’s a blue belt to bubble sort’s white with a stripe. But it’s still cool because a) it works and b) like most things on a computer when they work, it kind of feels like magic.

0
Subscribe to my newsletter

Read articles from Rick Hanley directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Rick Hanley
Rick Hanley