How to Remove Duplicates from a Sorted Array
Removing duplicates from a sorted array is a common problem in coding interviews and programming challenges, and it can be approached in several ways. In this article, we'll discuss two approaches: a brute-force solution using a set and an optimal in-place solution using the two-pointers technique.
Solution 1: Brute Force Approach (using Set)
This method uses a set to store unique elements, ensuring that duplicates are removed.
Implementation:
// Solution-1: Brute Force Approach (using Set)
// Time Complexity: O(n log n)
// Space Complexity: O(n)
int removeDuplicates(vector<int> &arr, int n)
{
    // Insert every element; the set keeps only unique values
    set<int> uniqueElements;
    for (int i = 0; i < n; i++)
    {
        uniqueElements.insert(arr[i]);
    }
    int k = uniqueElements.size();
    // Copy unique elements from the set back into the input array
    int i = 0;
    for (auto x : uniqueElements)
    {
        arr[i] = x;
        i++;
    }
    // Return the number of unique elements
    return k;
}
Logic:
Use a Set: Insert all elements of the array into a set to remove duplicates.
Get Unique Count: The size of the set gives the number of unique elements.
Copy Back to Array: Copy elements from the set back to the array.
Time Complexity: O(n log n)
- Explanation: Inserting each element into the set takes O(log n) time, resulting in O(n log n) for n elements.
Space Complexity: O(n)
- Explanation: The set stores all unique elements, potentially up to n elements.
Example:
Input:
arr = [1, 1, 2, 2, 3, 4, 4], n = 7
Output:
k = 4, arr = [1, 2, 3, 4, ...]
Explanation: The array contains 4 unique elements.
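The set-based solution can be checked against this example with a small driver. The sketch below reproduces the same logic under the hypothetical name removeDuplicatesSet (renamed only so the snippet is self-contained and doesn't clash with the two-pointer version):

```cpp
#include <set>
#include <vector>
using namespace std;

// Same logic as the listing above; renamed removeDuplicatesSet
// so this snippet stands alone.
int removeDuplicatesSet(vector<int> &arr, int n)
{
    // The set keeps one copy of each value, in sorted order
    set<int> uniqueElements;
    for (int i = 0; i < n; i++)
        uniqueElements.insert(arr[i]);

    int k = uniqueElements.size();

    // Copy the sorted unique values back into the array prefix
    int i = 0;
    for (auto x : uniqueElements)
        arr[i++] = x;

    return k;
}
```

Calling it on arr = [1, 1, 2, 2, 3, 4, 4] returns k = 4 and leaves [1, 2, 3, 4] in the first four positions; the remaining positions are unspecified leftovers.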
Solution 2: Optimal Approach (Two-Pointers Technique)
This method uses two pointers to overwrite duplicates in place, providing an efficient solution.
Implementation:
// Solution-2: Optimal Approach (Two-Pointers Technique)
// Time Complexity: O(n)
// Space Complexity: O(1)
int removeDuplicates(vector<int> &arr, int n)
{
    if (n == 0)
        return 0; // No elements, so no unique elements

    int i = 0; // Pointer to the position of the last unique element placed
    for (int j = 1; j < n; j++)
    {
        // A new value appears: place it right after the unique prefix
        if (arr[i] != arr[j])
        {
            i++;
            arr[i] = arr[j];
        }
    }
    // Return the number of unique elements
    return i + 1;
}
Logic:
Initialize Pointers: Use i to track the position of the last unique element and j to traverse the array.
Compare Elements: If arr[i] is not equal to arr[j], increment i and update arr[i] with arr[j].
Return Count: The value i + 1 gives the number of unique elements.
Time Complexity: O(n)
- Explanation: The array is traversed once.
Space Complexity: O(1)
- Explanation: No additional space is used apart from variables.
Example:
Input:
arr = [1, 1, 2, 2, 3, 4, 4], n = 7
Output:
k = 4, arr = [1, 2, 3, 4, ...]
Explanation: The array contains 4 unique elements.
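Running the same example through the two-pointer version confirms it matches the set-based result. The sketch below uses the hypothetical name removeDuplicatesTwoPointers so the snippet is self-contained, and includes an explicit guard for an empty input:

```cpp
#include <vector>
using namespace std;

// Two-pointer version, with an explicit empty-array guard.
int removeDuplicatesTwoPointers(vector<int> &arr, int n)
{
    if (n == 0)
        return 0;

    int i = 0; // Last position of the unique prefix
    for (int j = 1; j < n; j++)
    {
        // Because the array is sorted, any duplicate of arr[i]
        // appears immediately; only a different value advances i.
        if (arr[i] != arr[j])
        {
            i++;
            arr[i] = arr[j];
        }
    }
    return i + 1;
}
```

Unlike the set approach, no extra container is allocated: the unique prefix is built by overwriting the array as j scans forward.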
Comparison
Brute Force Method:
Pros: Simple and straightforward.
Cons: Inefficient for large arrays due to O(n log n) time complexity and additional space usage.
Use Case: Useful when simplicity is more important than efficiency.
Optimal Method:
Pros: Efficient with O(n) time complexity and O(1) space complexity.
Cons: Requires in-place modification of the array.
Use Case: Ideal for large arrays and when in-place operations are feasible.
Edge Cases
Empty Array: Returns 0 as there are no elements. Note that the two-pointer version needs an explicit n == 0 check, since returning i + 1 unconditionally would report 1 for an empty array.
Single Element Array: Returns 1 as a single element is trivially unique.
All Identical Elements: Returns 1 as all elements are the same.
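These edge cases can be verified directly. The sketch below reproduces the two-pointer function under the hypothetical name removeDuplicatesTwoPointers, with the empty-array guard in place:

```cpp
#include <vector>
using namespace std;

// Two-pointer deduplication with the empty-array guard,
// reproduced here so the edge cases can be tested in isolation.
int removeDuplicatesTwoPointers(vector<int> &arr, int n)
{
    if (n == 0)
        return 0; // Empty array: no unique elements

    int i = 0;
    for (int j = 1; j < n; j++)
    {
        if (arr[i] != arr[j])
        {
            i++;
            arr[i] = arr[j];
        }
    }
    return i + 1; // Single element or all-identical input yields 1
}
```

An empty input returns 0, a one-element input returns 1, and an all-identical input such as [7, 7, 7, 7] also returns 1.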
Additional Notes
Efficiency: The optimal approach is significantly more efficient for large datasets.
Simplicity: Despite its efficiency, the optimal approach is simple to implement.
Practicality: The optimal method is generally preferred due to its linear time complexity and constant space complexity.
Conclusion
Removing duplicates from a sorted array can be done efficiently using an in-place approach with two pointers. While the brute force method provides a straightforward solution, the optimal approach is both efficient and easy to implement, making it suitable for large datasets.