Divide-and-Conquer:

The divide and conquer algorithm is one of the design techniques in the algorithms. This technique divides the original problem into several subproblems that are similar to the original problem in nature but smaller in size. These subproblems are solved recursively(calling the same function to deal with closely related subproblems). Then the solution of these subproblems is combined to create the solution of the original subproblem. The steps can be described as

Divide the problem into a number of problems that are similar instances of the same problem.
Conquer the subproblem by solving them recursively, if the subproblem is small enough, solve them in a straightforward manner.
Combine the solution to the subproblems into the solution for the original problem.

Recurrences:

Recurrences go hand-in-hand with the divide-and-conquer. Recurrence is an equation or inequality that describes the function in terms of its value on smaller inputs. Once the algorithm is written using the divide-and-conquer strategy, we write the recurrence relation of that algorithm and then solve it to get its time and space complexity. Thus the main purpose of writing the recurrence relation is to analyze the algorithm and find out its space and time complexity. There are three main ways to solve the recurrence relations. Substitution method, Recursion tree method, and masters theorem. We won't go into detail about how these methods work as it is not the focus of the topic.

Maximum Subarray Problem:

Let's suppose we are given an opportunity to invest in a corporation whose stock prices vary with time. We are allowed to buy one unit of stock only at one time and sell it at a later date, we can buy and sell after the day has been closed for trading. To compensate for the restrictions, we are allowed to see the price of the stock in the future. Our goal is to maximize profit. The stock value for the future is shown in the form of a graph and table below:

Screenshot (128).png

It's obvious that we want to buy low and sell high to get the maximum profit. Unfortunately, we will not be able to buy high and sell low, because when we buy the stock at some day, we can sell it in future, we can't go in past, for example, lowest price in the table occurs after day 7 and highest price occurs before day 7, which is day 1, which isn't possible.

Another strategy could be to find the highest and lowest prices in the table, then take the lowest price and work on the right from the lowest price to find the highest later price and take the difference and store it somewhere. Next, take the highest price and work on the left to find the lowest prior price on the left, and take the difference and store it somewhere. Take the biggest difference which will be the maximum profit. In our example, lowest_price = 63, highest_price = 113. The highest later price from the right of lowest_price = 106. First_difference= 43, The lowest prior price from the left of highest_price = 100. Second_Difference = 13. Maximum_difference = 43, which is indeed the highest profit, but this strategy doesn't always work, and here is the counter-example.

Screenshot (129).png

highest_price = 11, lowest_price = 6, lowest_prior_price = 10, first_difference = 1, highest_later_price= Null, because there is nothing on the right side of 6, thus maximum_profit = 1, which isn't the highest. If we look at the table, the highest profit is 3, if we buy at the end of day 2 and sell it at the end of day 3. Thus this counter-example is proof that the strategy doesn't always work.

Brute Force

The brute force of this problem is trying every possible pair of the buy and sell dates. If I assume that 'n' days have been given, then choosing 2 days from n days will have ${n \choose r}$ combinations, which would be O(n^2). If I try to implement it using for loop, I will have to use two for loops, the first for loop will generate all the indexes, and the second for loop, which will be inside the first for loop will generate indexes that are on the right of first for loop index. After getting the pair we have to just calculate the difference, which will take a constant amount of time. The total time complexity of this algorithm will be O(n^2). You can access the code for this brute force approach implemented in python by clicking the following link.

%[https://github.com/jibran-mohammad/Algorithms-and-Datastructures]

Transformation

Let's look at the input slightly differently. We want to find out the sequence of days over which the net change from the first to the last day is maximum. Instead of looking at daily price, let's look at the daily change in the price, where the change in day 'i' is the difference between prices after the day 'i-1' and day 'i' end. Thus we want to find the non-empty, contiguous subarray of array 'A' whose value has the largest sum, we can call this contiguous subarray a maximum subarray, for example in the following subarray, the maximum subarray is A[8-11], with sum 43. Thus we would buy a stock just before day 8 and sell it after day 11, earning a profit of $43 per share.

Screenshot (130).png

At first this transformation is not helpful because we still have to generate all the indexes and assuming when we generate an index, the calculation of sum would take constant time, which isn't always true because if I take indexes 1 and n, calculating the sum would take O(n) time, but we can use another trick to calculate the sum in constant time. Even if we calculate the sum within constant time, still generating the indexes would take O(n^2) time, which isn't better than the previous solution

Solution using divide and conquer

Suppose we want to find the maximum subarray of array A[low - high]. Divide and conquer suggests that we divide the array into two subarrays of almost equal length. We find the midpoint, say, 'mid' of the array, and consider the subarray A[low-mid] and A[mid+1 - high]. Any contiguous subarray of array A[low-high] must lie in exactly one of the following places

Entirely in the subarray A[low-mid] such that low <= i <= j < mid
Entirely in the subarray A[mid+1 - high] such that mid < i <= j <= high
Crossing the midpoint such that low <= i <= mid < j < high

Maximum subarray of A[low-high] will be the maxim subarray among these three subarrays. The two subarrays are the smaller instances of the original problem which we can solve recursively. The other subarray would become part of the merge solution of divide and conquer, and finding the maximum subarray that crosses the midpoint is the only step we need to figure out.

The subarray crossing the midpoint is itself made of two subarray A[i-mid] and A[mid+1 - j], where low<=i<=mid and mid < j <= high. Thus we need to find the maximum subarray of form A[i --- mid] and A[mid+1 --- j] and then combine them. We can find both the subarrays and combine them in time which is linear to the original array size.

To find the maximum subarray of form A[i --- mid], we will create a variable sum holding the sum of the entries in A[i --- mid]. Initially, we'll assign it to zero. We'll also create another variable named left-sum which will hold the greatest sum so far. Initially, it will be '-infinity. Another variable called left will be created which will hold the index of the array representing the subarray from left to mid with the greatest sum. We'll run a for loop which will start from mid and will keep decrementing itself till low, in every iteration, we are going to add the current item to the variable sum and then we'll check whether the left-sum is greater than the sum if it is then we're going to assign sum to left sum and assign the index of for loop to left and we'll continue till for loop is over. Finally, I would have A[left - mid] maximum subarray with the sum in variable left-sum.

To find the maximum subarray in A[mid + 1 -- high], we'll similarly create sum, right-sum, and right variable and assign them to 0, '-infinity' initially. Then for loop will start with mid+1 and work its way up to the highest. At every iteration, it will add the current item to the sum variable and then check whether the right sum is greater than the sum, if it is, then it will update 'right-sum' to sum and the right variable to the index of the current element. Finally, I would have A[mid + 1 -- right] maximum subarray with the sum in variable right-sum.

Now the maximum subarray crossing the midpoint would be A[left - right] with sum equal to left-sum + right-sum

Total number of iterations in this computation would be O(n), assuming there are 'n' elements in the original subarray. Thus the Merge solution step of divide and conquer would take O(n) time

We can divide the array into two subarray's simply by adding left and right index and dividing it by 2 and assigning it to mid, and then calling the procedure recursively by passing low, mid with one recursive call and again calling the procedure with another recursive call by passing mid +1, high.

Analysis

We have two recursive calls having half the size of the original problem and one call to a procedure that calculates the subarray crossing midpoint which takes O(n) time. Thus the recurrence relation for this procedure is given below:

Screenshot (132).png

Applying the master's theorem to this recurrence relation, will yield the time complexity of this algorithm to be O(n log n). Thus Divide and Conquer have yielded an algorithm that is asymptotically better than brute force. If you want to have a look at the implementation of this algorithm in python, click on the link below.

%[https://github.com/jibran-mohammad/Algorithms-and-Datastructures/tree/master/Divide%20and%20Conquer]

There is a better way to solve the maximum subarray problem which would only take O(n) time, but here our purpose was to demonstrate, how the divide and conquer paradigm works.

Divide and Conquer(Maximum Subarray Problem)

Table of contents