Matrix Median
Table of contents
Problem
Given a row wise sorted matrix of size R*C where R and C are always odd, find the median of the matrix. (link)
Example 1:
Input:
R = 3, C = 3
M = [[1, 3, 5],
[2, 6, 9],
[3, 6, 9]]
Output: 5
Explanation: Sorting matrix elements gives
us {1,2,3,3,5,6,6,9,9}. Hence, 5 is median.
Example 2:
Input:
R = 3, C = 1
M = [[1], [2], [3]]
Output: 2
Explanation: Sorting matrix elements gives
us {1,2,3}. Hence, 2 is median.
Your Task: You don't need to read input or print anything. Your task is to complete the function median() which takes the integers R and C along with the 2D matrix as input parameters and returns the median of the matrix.
Solution
Brute Force Approach
Store all the matrix elements into a 1D array.
Sort the 1D array.
Return the index (m*n)/2 value. (Median)
Optimal Approach - Binary Search
This approach is akin to applying binary search on potential answers. First, we find out the answer range. Our answer will lie in between the minimum and maximum elements present in the matrix. Since each row of the matrix is sorted we can get the minimum in first column and maximum in last column. Now, we apply binary search on the answer range [min, max]. What we should do now where will our answer lies?
Since median is the mid element of the all the elements when we put them in sorted order, and number of rows (m) and number of columns (n) are odd, then m*n is also odd. So, to the left and right of the median, there will be exactly (m*n)/2.
So, if we calculate the number of elements for which median is greater, it is (m*n)/2. So, the frequency of median is greater than (m*n)/2 (Including the median). So, we calculate the frequency of each potential answer and compare it with the required frequency, which is (m*n)/2.
Initially, low = min and high = max. We move the low and high based on the following conditions
low = mid + 1
Move low to right when frequency(mid) is less than or equal to requried frequency.
We move low to right because we want a frequency which is greater than the required frequency.
high = mid - 1:
Move high to the left when frequency(mid) is greater than required.
We move high to left because we want a frequency which is near to (m*n)/2 but just greater than required.
To determine the frequency of a potential answer, we perform the following steps:
We iterate through each row and identify the upper bound of the potential answer.
Since each row is sorted, the number of elements greater than the potential answer corresponds to the index of the upper bound.
Does our binary search algorithm return a value that doesn’t exist in the matrix?
No, it doesn’t.
Consider this scenario: we have a 1D array derived from a matrix with elements like 1 1 1 3 3 3 9 9 9. Here’s the frequency of each number:
1: frequency - 3
2: frequency - 3
3: frequency - 6
4: frequency - 6
5: frequency - 6
6: frequency - 6
7: frequency - 6
8: frequency - 6
9: frequency - 9
Now, let’s perform a binary search. We start with low = 1
and high = 9
. The mid
value is 5, which has a frequency of 6. So, we move high
to mid - 1 = 4
.
Next, with low = 1
and high = 4
, the mid
value is 4, which also has a frequency of 6. So, we move high
to mid - 1 = 3
.
Then, with low = 1
and high = 3
, the mid
value is 2, which has a frequency of 3. So, we move low
to mid + 1 = 3
.
Now, with low = 3
and high = 3
, the mid
value is 3, which has a frequency of 6. So, we move high
to mid - 1 = 2
.
At this point, low = 3
remains constant and the loop breaks.
The key takeaway here is that low
always points to an element that exists in the list. It will always point to the first element in the range that has the same frequency as all the other elements in that range. In our case, the numbers 3, 4, 5, 6, 7 all have a frequency of 6, but low
ends up pointing to the first one (3), which is present in the list. Therefore, low
cannot point to a number that is not present in the list.
Low | High | Mid | Frequency of Mid | Action |
1 | 9 | 5 | 6 | Move high to mid - 1 = 4 |
1 | 4 | 2 | 3 | Move low to mid + 1 = 3 |
3 | 4 | 3 | 6 | Move high to mid - 1 = 2 |
3 | 2 | - | - | Loop breaks, low remains at 3 |
In this process, low
always points to an element that exists in the list. It will always point to the first element in the range that has the same frequency as all the other elements in that range. Therefore, low
cannot point to a number that is not present in the list.
class Solution {
int median(int matrix[][], int R, int C) {
int m = matrix.length;
int n = matrix[0].length;
int low = getMinOfCol(matrix, 0);
int high = getMaxOfCol(matrix, n-1);
int req = (m*n)/2;
while(low<=high){
int mid = (low+high)/2;
int count = getFreqOfElementsLessThan(matrix,mid);
if(count<=req){
low = mid+1;
}
else{
high = mid-1;
}
}
return low;
}
private int getFreqOfElementsLessThan(int[][] matrix, int mid){
int m = matrix.length;
int count = 0;
for(int i=0; i<m; i++){
count+=upperBound(matrix[i], mid);
}
return count;
}
private int upperBound(int[] row, int x){
int low = 0;
int high = row.length-1;
while(low<=high){
int mid = (low+high)/2;
if(row[mid]<=x){
low = mid+1;
}
else{
high = mid-1;
}
}
return low;
}
private int getMinOfCol(int[][] matrix, int col){
int min = -1;
for(int i=0; i<matrix.length; i++){
min = Math.min(matrix[i][col], min);
}
return min;
}
private int getMaxOfCol(int[][] matrix, int col){
int max = -1;
for(int i=0; i<matrix.length; i++){
max = Math.max(matrix[i][col], max);
}
return max;
}
}
Subscribe to my newsletter
Read articles from Chetan Datta directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
Chetan Datta
Chetan Datta
I'm someone deeply engrossed in the world of software developement, and I find joy in sharing my thoughts and insights on various topics. You can explore my exclusive content here, where I meticulously document all things tech-related that spark my curiosity. Stay connected for my latest discoveries and observations.