Binary Search - perfect_0

Dhruv ChouhanDhruv Chouhan
7 min read

Binary search is one of the most important algorithms of the world.

This guide is for complete beginners and will help you understand and implement the binary search algorithm in code. It will explain the intuition behind binary search and teach you how to implement it in the correct way, since it is also one of those algorithms which is often prone to mistakes while implementing it !

Example 1

Suppose someone asked you the square root of 69. You don't have a calculator. You'd probably start figuring it out in this manner :-

We take a random guess 5.

  1. The square of 5 is 25, so the ans is bigger then 5. No need to check for 1, 2, 3, 4 since their squares would obviously be smaller than 5.

  2. The square of 10 is 100, so the answer is smaller than 100. No need to check for numbers larger than 10.

  3. The square of 8 is 64 and the square of 9 is 81, so the answer is between these two numbers.

The actual answer is 8.3.... but you came pretty close to the original answer, just by guessing and eliminating the portion which could never be the answer.

This is the fundamental approach behind using binary search to find solutions to problems, you guess an answer, and reduce the searching space of the problem.

Notes and Observation

  1. There were certain properties you knew (you might have not noticed) before you applied this technique.

  2. You knew that the squares of numbers always increase as the numbers increase.

  3. The square function is a MONOTONOUS function. It always increases as the input increases. (for non-negative numbers).

  4. We were able to select and eliminate stuff since the function was MONOTONOUS.

Apply binary search only and only if the function we choose is MONOTONOUS. ( Increasing or Decreasing ).

Now let's look at a common question: Find the smallest number's index greater than or equal to a target in a sorted array.

Sorted array is a monotonic space, since the values are increasing only.

Example 2

int a[] = {1, 1, 2, 2, 3, 3, 3, 3, 5, 5, 5, 6, 10, 10, 11};

Suppose we need to find 6. We randomly choose the middle index. Here it's value is 3 which is smaller than 6 (target value) so there is no need to search to the left of it.

We reduced the search space.

We can assign a T or F value according to our chosen condition on the search space.

Let a function be as follows :

  • choose an index.

  • If the number at the index < target -> return false

  • If the number at the index >= target -> return true

Now consider this array and the problem statement:

Find the smallest number greater or equal to target=3 in the sorted array and return its index.

Here we can perform our search easily. When our function yields False, we move to the right. If it yields true, we mark the current position as a possible answer and move to the left to search for a better answer.

Implementation

int arr[] = {1, 1, 2, 2, 2, 3, 3, 4, 5, 6, 6, 7, 8, 8, 9};
int arraySize = 15;
int target = 3;

// The starting and ending positions of the search space
// Both index limits (low and high) are inclusive, that means
// The answer can exist in the inclusive interval [low, high] 
int low = 0;
int high = arraySize-1;

// We will choose the middle element for checking
int mid;

// This will store the final answer.
// I'll explain why I have assigned ans=arraySize below.
int ans = arraySize;

// This is our function for mapping the search space into True or False
bool isGreater(int index, int target) {
    if(arr[index] >= target) return true;
    else return false;
}

// Keep searching while the search space is valid.
// Search space goes invalid when low > high
while(low <= high)
{
    // Calculate the middle element
    mid = (low + high)/2;
    if(isGreater(mid, target)) {
        // Mark the possible answer.
        ans = mid;
        // Change search space to [low, mid-1]
        // So that we can search for a better answer.
        high = mid - 1;
    } else {
        // Change search space to [mid+1, high]
        low = mid + 1;
    }
}

if(ans == arraySize) {
    // target is not found.
    std::cout << "target not found" << std::endl;
} else {
    // target found at index ans.
    std::cout << "target at index " << ans << std::endl;
}

Why have we assigned the ans variable initially to arraySize ?

Suppose all the elements in the array are smaller, so our mid pointer will keep moving to the right of the array because it will always encounter the following code.

else {
    low = mid + 1;
}

Now we need a way to find out if the answer has been updated or not. So we assign it the value of arraySize and then later we can check after the algorithm that if
ans == arraySize then it has not been updated and the answer does not exist for this problem. The initial value ofansis just a value which will tell us that the answer did not exist in the array, and the low pointer has reached the end.

Review Steps

  1. Find a function which maps each element to T or F monotonic space. (TTTFFF) or (FFFFTT) type.

  2. Initialize your low and high pointers, they will indicate the search space.

  3. while(low<=high) calculate the middle, and check the value at middle if it is true or false. Move your low and high pointers accordingly.

  4. Keep updating the ans variable.

  5. At the end of the search ans variable will hold the value of answer, or if answer does not exist, it will hold the value assigned to it at its initialization.

Time Complexity calculation

In the worst case, the while loop will move until low > high occurs. Each time you are dividing the search space into half. At each search space reduction involving mid pointer calculation and reassigning low and high pointers, you are taking O(1) time. Suppose the search space is N in size. Try to look at it this way :

  • You first did a O(1) operation for N sized space.

  • Then for a N/2 sized space. (Second iteration)

  • Then for a N/4 sized space (Third iteration)

  • For a N / ( 2^(k-1) ) sized space (Kth iteration)

  • ... until the search space became just a single element, so N / (2 ^ (k-1)) = 1

  • N = 2 ^ (k-1)

  • log N = k - 1 (Base 2 logarithm)

  • k = log N + 1

  • k steps taken in total for the process, thus

  • Time Complexity is O(k) = O(log N)

What to look for in a problem ?

For a problem which might be solved by binary search, look for the following points

  1. Minimizing

  2. Maximizing

  3. Maximum of Minimum

  4. Minimum of Maximum

  5. Sorted Array

  6. When the constraints of the problem are huge N >= 1e9

  7. If you can find a monotonic space or a function which will give T and F in a monotonous way. (TTTTFFF) or (FFFFTTTT)

Point 7 is a necessary condition to apply binary search.

NOTE: Sometimes the calculation mid = (low+high)/2 might overflow in c++ int data type. You can use mid = (low + (high-low) / 2) or use long long int datatype. But be careful it might also overflow in long long int if the search space is around 1e18.

Practice Questions

This was the basic implementation and idea behind binary search. There are couple of easy questions provided for practice. All the best.

In case of any error / correction, kindly ping us.

By Dhruv Chouhan, Smit italiya and Aarjav Jain.

53
Subscribe to my newsletter

Read articles from Dhruv Chouhan directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Dhruv Chouhan
Dhruv Chouhan

A homo sapien tasting the fruit of computer science.