Design a Least Frequently Used (LFU) Cache
Design and implement a Least Frequently Used (LFU) cache. It should support `get` and `put` operations in $O(1)$ average time. This is one of the most complex $O(1)$ cache problems.
Why Interviewers Ask This
Google interviewers ask this to evaluate a candidate's mastery of advanced data structures, specifically the interplay between hash maps and doubly-linked lists. They assess whether you can design complex systems that maintain strict O(1) time complexity while handling multiple eviction policies like frequency tracking and tie-breaking rules for equal frequencies.
How to Answer This Question
1. Clarify requirements immediately: confirm capacity limits, behavior when keys exist versus new keys, and how to handle ties in frequency.
2. Propose the core architecture: explain the need for a hash map from keys to nodes, and a second hash map from frequencies to doubly-linked lists.
3. Detail the `get` operation: describe incrementing the node's frequency by moving it from its current list to the next higher frequency list, ensuring O(1) removal and insertion.
4. Explain the `put` logic: cover initialization of new nodes, updating existing values, and evicting the tail of the minimum-frequency list when capacity is reached.
5. Discuss edge cases: address what happens when a key is updated to a new value versus a new key being added, and verify your solution handles empty caches gracefully.
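The O(1) claim in step 3 rests on one property of doubly-linked lists: a node that knows both neighbors can be spliced out without scanning. A minimal sketch (names like `unlink` and `insert_after` are illustrative, not a fixed API):

```python
class Node:
    """Doubly-linked list node holding one cache entry."""
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.freq = 1
        self.prev = self.next = None

def unlink(node):
    """Splice a node out of its list in O(1): no scan is needed
    because the node already points at both neighbors."""
    node.prev.next = node.next
    node.next.prev = node.prev

def insert_after(head, node):
    """Insert node right after head (the most-recent end) in O(1)."""
    node.prev, node.next = head, head.next
    head.next.prev = node
    head.next = node
```

In practice each frequency bucket keeps sentinel head and tail nodes so `unlink` and `insert_after` never touch a null neighbor.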
Key Points to Cover
- Explicitly mention the dual-hash-map strategy to separate key lookup from frequency grouping
- Demonstrate understanding of why doubly-linked lists are necessary for O(1) removal and insertion
- Explain the mechanism for tracking the minimum frequency without scanning all buckets
- Clarify how tie-breaking is handled using LRU logic within the same frequency bucket
- Confirm that both get and put operations strictly adhere to O(1) average time complexity
Sample Answer
To design an LFU cache with O(1) operations, we need to track both the access count and the order of access for items with the same frequency. I propose using two hash maps and a collection of doubly-linked lists. The first map, `keyMap`, will store the actual cache entries, linking each key to its corresponding node. The second map, `freqMap`, will map integer frequencies to doubly-linked lists, where each list contains all nodes with that specific frequency.

When retrieving a key, we increment its frequency. This requires removing the node from its current frequency list and inserting it into the next frequency list, both O(1) operations with a doubly-linked list. If the next frequency bucket doesn't exist, we create it.

For insertion, if the cache is full, we identify the minimum frequency using a tracked variable, remove the least recently used item from that list (the tail), and delete it from our maps. We then add the new item to the frequency-1 list. This structure ensures that finding the minimum frequency is O(1) via the tracked variable, and all updates remain constant time. This approach aligns well with Google's focus on scalable, efficient system design patterns.
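The design described above can be sketched in Python. Here an `OrderedDict` stands in for each frequency bucket's doubly-linked list (it provides O(1) removal and append while preserving LRU order); all names are illustrative, and this is a sketch rather than a definitive implementation:

```python
from collections import defaultdict, OrderedDict

class LFUCache:
    """LFU cache with O(1) average get/put.

    key_map:  key  -> (value, freq)
    freq_map: freq -> OrderedDict of keys in LRU order
    min_freq: tracked minimum frequency, so eviction never scans buckets.
    """

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.key_map = {}
        self.freq_map = defaultdict(OrderedDict)
        self.min_freq = 0

    def _touch(self, key: int) -> None:
        """Move key from its current frequency bucket to the next one."""
        value, freq = self.key_map[key]
        del self.freq_map[freq][key]
        if not self.freq_map[freq]:
            del self.freq_map[freq]
            if self.min_freq == freq:        # bucket emptied: advance tracker
                self.min_freq = freq + 1
        self.freq_map[freq + 1][key] = None  # append = most recently used
        self.key_map[key] = (value, freq + 1)

    def get(self, key: int) -> int:
        if key not in self.key_map:
            return -1
        self._touch(key)
        return self.key_map[key][0]

    def put(self, key: int, value: int) -> None:
        if self.capacity <= 0:
            return
        if key in self.key_map:
            self._touch(key)
            freq = self.key_map[key][1]
            self.key_map[key] = (value, freq)
            return
        if len(self.key_map) >= self.capacity:
            # Evict the LRU item from the minimum-frequency bucket.
            evict_key, _ = self.freq_map[self.min_freq].popitem(last=False)
            del self.key_map[evict_key]
            if not self.freq_map[self.min_freq]:
                del self.freq_map[self.min_freq]
        self.key_map[key] = (value, 1)       # new entries start at freq 1
        self.freq_map[1][key] = None
        self.min_freq = 1
```

Note the deliberate choice: every `put` of a new key resets `min_freq` to 1, and `_touch` only advances the tracker when it empties the current minimum bucket, so no operation ever iterates over frequencies.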
Common Mistakes to Avoid
- Attempting to use a simple priority queue or heap, which results in O(log N) complexity instead of O(1)
- Forgetting to update the minimum frequency tracker when a bucket becomes empty after an eviction
- Neglecting to implement the LRU ordering within frequency buckets, causing incorrect eviction among tied items
- Overcomplicating the code by not separating the key-to-node mapping from the frequency-to-list mapping
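The second mistake above, failing to advance the minimum-frequency tracker, is easy to demonstrate in isolation. A hedged sketch of just that bookkeeping, using a plain dict of buckets (the helper name `bump_frequency` is illustrative):

```python
from collections import defaultdict, OrderedDict

def bump_frequency(freq_map, min_freq, key, freq):
    """Move `key` from bucket `freq` to `freq + 1` and return the
    (possibly updated) minimum frequency. Illustrative helper only."""
    del freq_map[freq][key]
    freq_map[freq + 1][key] = None   # append = most recently used
    if not freq_map[freq]:
        del freq_map[freq]
        if min_freq == freq:         # the lowest bucket just emptied,
            min_freq = freq + 1      # so the tracker must advance
    return min_freq

freq_map = defaultdict(OrderedDict)
freq_map[1]["a"] = None
min_freq = bump_frequency(freq_map, 1, "a", 1)
# min_freq is now 2: without the emptiness check, a later eviction
# would consult an empty (or deleted) bucket looking for the LFU item.
```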
Related Interview Questions
- Convert Binary Tree to Doubly Linked List in Place (Hard, Microsoft)
- Shortest Distance from All Buildings (BFS) (Hard, Meta)
- Design a Set with $O(1)$ `insert`, `remove`, and `check` (Easy, Cisco)
- Find K Closest Elements (Heaps) (Medium, Meta)
- Defining Your Own Success Metrics (Medium, Google)
- Product Strategy: Addressing Market Saturation (Medium, Google)