Top K Frequent Elements (Heap/Bucket Sort)

Question

Accepted Answer

To solve this efficiently at Netflix scale, I would avoid sorting the full input, since that costs O(n log n). First, I'd use a hash map to count how often each element occurs, which takes O(n) time.
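
The counting step could look roughly like this in Python. It is a minimal sketch that assumes the input fits in memory as a list; `count_frequencies` is a hypothetical helper name used only for illustration.

```python
from collections import Counter

def count_frequencies(items):
    """Map each distinct item to its number of occurrences in one O(n) pass."""
    # Counter is a dict subclass, so lookups and updates are hash-map operations.
    return Counter(items)
```

Any hashable element type works here, whether the input holds numbers, words, or IDs.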

Next, I need to select the top k without sorting everything. If k is small compared to n, a Min-Heap of size k is ideal. I iterate through the frequency map and push each (frequency, element) entry onto the heap. Whenever the heap size exceeds k, I pop the minimum element. This ensures the heap always holds the k most frequent elements seen so far, so the final result is the heap contents. Because the heap never grows beyond k + 1 entries, each push or pop costs O(log k). With m distinct elements, the approach runs in O(n + m log k), which is at most O(n log k).
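
The heap selection described above can be sketched as follows. This is an illustrative version, not a definitive implementation: `top_k_heap` is a hypothetical name, and the insertion index stored in each tuple is an added assumption so that ties in frequency never force Python to compare the elements themselves.

```python
import heapq
from collections import Counter

def top_k_heap(items, k):
    """Return up to k most frequent items, most frequent first."""
    if k <= 0:
        return []
    counts = Counter(items)  # O(n) frequency map
    heap = []                # min-heap of (frequency, tiebreak index, item)
    for index, (item, freq) in enumerate(counts.items()):
        heapq.heappush(heap, (freq, index, item))
        if len(heap) > k:
            # Evict the least frequent entry so only k candidates remain.
            heapq.heappop(heap)
    # Sorting the k survivors costs O(k log k) and fixes the output order.
    return [item for _, _, item in sorted(heap, reverse=True)]
```

The final sort is optional if the caller does not care about the order of the k results.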

Alternatively, if the maximum frequency isn't huge, I could use Bucket Sort. I create an array with one bucket per possible frequency (at most n + 1 slots, since no element can occur more than n times), where index i holds all elements with frequency i. Then I iterate backwards from the highest-frequency bucket, collecting elements until I have k of them. This achieves O(n) time at the cost of O(n) extra space for the buckets. Given Netflix's streaming data volume, the Heap approach is often safer because it needs only O(k) space beyond the frequency map and doesn't depend on the distribution of frequency values, but I'd choose Bucket Sort if memory permits and frequency ranges are predictable.
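
A minimal sketch of the bucket approach, assuming the input is a list whose length is known up front. `top_k_bucket` is a hypothetical name, and ties within a bucket are returned in first-seen order, which is a choice made for this example rather than part of the original answer.

```python
from collections import Counter

def top_k_bucket(items, k):
    """Return up to k most frequent items using frequency buckets in O(n) time."""
    if k <= 0:
        return []
    counts = Counter(items)
    # buckets[f] holds every item that occurs exactly f times; f is at most len(items).
    buckets = [[] for _ in range(len(items) + 1)]
    for item, freq in counts.items():
        buckets[freq].append(item)
    result = []
    # Walk from the highest possible frequency down to 1.
    for freq in range(len(buckets) - 1, 0, -1):
        for item in buckets[freq]:
            result.append(item)
            if len(result) == k:
                return result
    return result  # fewer than k distinct items were present
```

Sizing the bucket array by the observed maximum frequency instead of `len(items)` would reduce memory when frequencies are small.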

Related Interview Questions

How do you implement a queue using two stacks?

Find K Closest Elements (Heaps)

Convert Binary Tree to Doubly Linked List in Place