Leetcode 347. Top K Frequent Elements
I want to document my thought process on this Leetcode question, https://leetcode.com/problems/top-k-frequent-elements/.
First I'll try get an idea of what the function signature should be like while listening to the question. Then I'll start ask clarifying questions and come up with examples. After clarifying, I'll brainstorm some ideas to solve the problem. I want to try throwing different data structures at the problem for now, some may just be for the sake of considering them.
Function signature:
- int[] topKFrequent(int[] nums, int k)
Clarifying Questions / Assumptions:
Input: null/empty? k always valid? Multi answers?
Output: How to handle null/empty/Invalid k? Should sort?
Assume: k is always valid
Examples:
nums = [1, 1, 2, 2, 3], k = 2
nums = [1], k = 10 // invalid
Brainstorm/Observations:
If k is always valid and k == nums.length, return the whole array.
k <= nums.length
What would I do if I had to solve this in real life with paper and pen?
(n = nums.length)
Count frequency + Sort
Count frequency and store it as int[][] numFreq.
Sort using an altered version of quicksort.
Get k items from sorted array as result.Example
nums = [5,6,6,6,7,7,7,8,8]
numFreq = { [5,1], [6,3], [7,3], [8,2] }aftersort = { [5,1], [8,2], [6,3], [7,3] }
time O(nlogn) ~ O(n^2) // for quicksort
space O(n) // for counting frequencies
Count frequency + Max heap
Count frequency and store num and frequency pair as int[][] numFreq or HashMap<>.
Add all items to a max heap with custom comparator.
e.g. - PriorityQueue pq = new PriorityQueue<>((a, b) -> freqMap.get(b) - freqMap.get(a));
Poll k items off heap for result.time O(nlogn) // for building a max heap
space O(n) // for counting frequencies and heap
Count frequency + Min heap (maintain size at k)
Use min heap instead of max heap.
Add items to heap and poll whenever heap.size() > k.time O(nlogk) // for building a min heap capped at size k
space O(n) // for counting frequencies
Count frequency + Bucketsort
Count frequency first.
Create buckets with size "n + 1", because max bucket size will be capped at n, and the addtional 1 of for frequency 0.
Sort the numbers into buckets by frequency,
then go through the buckets till we get k items.nums = [5,6,6,6,7,7,7,8,8]
numFreq = { [5,1], [6,3], [7,3], [8,2] }minFreq = 1
maxFreq = 3buckets = {
0: 5 // frequency = 1
1: 8 // frequency = 2
2: 6, 7 // frequency = 3
}Max bucket size will be capped at n
time O(n) // for counting frequency, sort into buckets, building result
space O(n) // for counting frequency and buckets
Count frequency + QuickSelect
Count frequency, and convert it to a int[][] freq and num pair.
Use QuickSelect, to find the item at "n - k + 1".
Items to the right are the results we want, albeit not sorted.time O(n) ~ O(n^2) // for quickselect
space O(n) //
Code:
- MinHeap
class Solution {
public int[] topKFrequent(int[] nums, int k) {
Map<Integer, Integer> freq = countFreq(nums);
PriorityQueue<Integer> pq = findTopK(freq, k);
return toArray(pq);
}
private Map<Integer, Integer> countFreq(int[] nums) {
Map<Integer, Integer> freq = new HashMap<>();
for (int n : nums) {
freq.put(n, freq.getOrDefault(n, 0) + 1);
}
return freq;
}
private PriorityQueue<Integer> findTopK(Map<Integer, Integer> freq, int k) {
PriorityQueue<Integer> pq = new PriorityQueue<>(
(i, j) -> Integer.compare(freq.get(i), freq.get(j)));
for (int i : freq.keySet()) {
pq.add(i);
if (pq.size() > k) {
pq.poll();
}
}
return pq;
}
private int[] toArray(PriorityQueue<Integer> pq) {
int[] ans = new int[pq.size()];
for (int i = pq.size() - 1; i >= 0; i--) {
ans[i] = pq.poll();
}
return ans;
}
}
- Bucketsort:
class Solution {
public static int[] topKFrequent(int[] nums, int k) {
if (nums == null || nums.length == 0 || k == 0) return new int[0];
Map<Integer, Integer> freq = countFreq(nums);
List<List<Integer>> buckets = toBuckets(freq, nums.length);
return getTopBuckets(buckets, k);
}
private static Map<Integer, Integer> countFreq(int[] nums) {
Map<Integer, Integer> freq = new HashMap<>();
for (int num : nums) {
freq.put(num, freq.getOrDefault(num, 0) + 1);
}
return freq;
}
private static List<List<Integer>> toBuckets(Map<Integer, Integer> map, int size) {
List<List<Integer>> buckets = new ArrayList<>();
for (int i = 0; i < size + 1; i++) {
buckets.add(new ArrayList<>());
}
for (Map.Entry<Integer, Integer> item : map.entrySet()) {
int num = item.getKey();
int freq = item.getValue();
buckets.get(freq).add(num);
}
return buckets;
}
private static int[] getTopBuckets(List<List<Integer>> buckets, int k) {
int[] top = new int[k];
int j = 0;
for (int i = buckets.size() - 1; i >= 0; i--) {
List<Integer> bucket = buckets.get(i);
for (int num : bucket) {
top[j++] = num;
if (j >= k) {
break;
}
}
if (j >= k) {
break;
}
}
return top;
}
}
- QuickSelect:
class Solution {
public static int[] topKFrequent(int[] nums, int k) {
if (nums == null || nums.length == 0 || k == 0) return new int[0];
int[][] freq = countFreq(nums);
quickSelect(freq, freq.length - k);
return toArray(freq, k);
}
private static int[][] countFreq(int[] nums) {
Map<Integer, Integer> map = new HashMap<>();
for (int num : nums) {
map.put(num, map.getOrDefault(num, 0) + 1);
}
int[][] freq = new int[map.size()][];
int i = 0;
for (int num : map.keySet()) {
freq[i++] = new int[] {map.get(num), num};
}
return freq;
}
private static void quickSelect(int[][] freq, int target) {
int l = 0;
int r = freq.length - 1;
while (l < r) {
int p = partition(freq, l, r);
if (p == target) {
break;
} else if (p < target) {
l = p + 1;
} else {
r = p - 1;
}
}
}
private static int partition(int[][] freq, int l, int r) {
swap(freq, r, l + new Random().nextInt(r - l));
int p = r;
int lo = l;
for (int i = l; i < p; i++) {
if (freq[i][0] < freq[p][0]) {
swap(freq, lo, i);
lo++;
}
}
swap(freq, lo, p);
return lo;
}
private static void swap(int[][] freq, int i, int j) {
if (i == j) return;
int[] tmp = freq[i];
freq[i] = freq[j];
freq[j] = tmp;
}
private static int[] toArray(int[][] freq, int k) {
int[] ans = new int[k];
for (int i = 0; i < ans.length; i++) {
ans[i] = freq[freq.length - 1 - i][1];
}
return ans;
}
}
Let me know if there is anything I should have considered, or has an error! I'm happy to get any feedback.
Subscribe to my newsletter
Read articles from Blood Runs Code directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by