Design a Least Frequently Used (LFU) Cache

Data Structures
Hard
Google

Design and implement a Least Frequently Used (LFU) cache. It should support `get` and `put` operations in $O(1)$ average time. This is one of the most complex $O(1)$ cache problems.

Why Interviewers Ask This

Google interviewers ask this to evaluate a candidate's mastery of advanced data structures, specifically the interplay between hash maps and doubly-linked lists. They assess whether you can design a composite structure that maintains strict O(1) time complexity while layering two eviction rules: eviction by lowest frequency, with LRU tie-breaking among keys at the same frequency.

How to Answer This Question

1. Clarify requirements immediately: confirm the capacity limit, the behavior for existing keys versus new keys, and how ties in frequency are broken.
2. Propose the core architecture: explain the need for one hash map from keys to nodes, and a second hash map from frequencies to doubly-linked lists.
3. Detail the `get` operation: the node's frequency increments, so it moves from its current list to the next higher frequency's list; a doubly-linked list makes both the removal and the insertion O(1).
4. Explain the `put` logic: cover creating new nodes, updating existing values, and evicting the tail of the minimum-frequency list when the cache is at capacity.
5. Discuss edge cases: distinguish updating an existing key from inserting a new one, and verify your solution handles an empty cache (and a zero capacity) gracefully.
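The building blocks behind steps 2 and 3 can be sketched as follows. This is a minimal illustration, not a full solution; the names `Node` and `DoublyLinkedList` are my own. The sentinel head/tail trick is what makes removal and insertion unconditionally O(1).

```python
class Node:
    """A cache entry: key, value, and its current access frequency."""
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.freq = 1
        self.prev = self.next = None

class DoublyLinkedList:
    """One frequency bucket, ordered by recency (front = most recent)."""
    def __init__(self):
        # Sentinel head/tail nodes avoid empty-list edge cases.
        self.head, self.tail = Node(None, None), Node(None, None)
        self.head.next, self.tail.prev = self.tail, self.head
        self.size = 0

    def push_front(self, node):
        # Most recently used within this frequency goes to the front.
        node.prev, node.next = self.head, self.head.next
        self.head.next.prev = node
        self.head.next = node
        self.size += 1

    def remove(self, node):
        # O(1) unlink, given a direct node reference from the key map.
        node.prev.next, node.next.prev = node.next, node.prev
        self.size -= 1

    def pop_back(self):
        # Least recently used within this frequency bucket.
        node = self.tail.prev
        self.remove(node)
        return node
```

Because the key-to-node hash map hands us a direct node reference, `remove` never needs to search the list, which is the whole reason a *doubly*-linked list is required.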

Key Points to Cover

  • Explicitly mention the dual-hash-map strategy to separate key lookup from frequency grouping
  • Demonstrate understanding of why doubly-linked lists are necessary for O(1) removal and insertion
  • Explain the mechanism for tracking the minimum frequency without scanning all buckets
  • Clarify how tie-breaking is handled using LRU logic within the same frequency bucket
  • Confirm that both get and put operations strictly adhere to O(1) average time complexity
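In Python specifically, the LRU ordering within a frequency bucket can be demonstrated with `collections.OrderedDict`, which offers O(1) insertion, `move_to_end`, and `popitem(last=False)`. A tiny sketch of the tie-breaking rule (bucket contents are illustrative):

```python
from collections import OrderedDict

# One frequency bucket holding three keys that all share the same frequency.
bucket = OrderedDict.fromkeys(["a", "b", "c"])

# Accessing "a" makes it the most recently used key in this bucket.
bucket.move_to_end("a")

# Eviction takes the least recently used key among the tied items.
evicted, _ = bucket.popitem(last=False)
print(evicted)  # → "b"
```

The same effect is achievable with a hand-rolled doubly-linked list; `OrderedDict` simply packages the list-plus-hash-map combination.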

Sample Answer

To design an LFU cache with O(1) operations, we need to track both the access count and the order of access for items with the same frequency. I propose using two hash maps and a collection of doubly-linked lists. The first map, `keyMap`, will store the actual cache entries, linking each key to its corresponding node. The second map, `freqMap`, will map integer frequencies to doubly-linked lists, where each list contains all nodes with that specific frequency.

When retrieving a key, we increment its frequency. This requires removing the node from its current frequency list and inserting it into the next frequency list, which are both O(1) operations with a doubly-linked list. If the next frequency bucket doesn't exist yet, we create it.

For insertion, if the cache is full, we identify the minimum frequency using a tracked variable, remove the least recently used item from that list (the tail), and delete it from both maps. We then add the new item to the frequency-1 list and reset the minimum frequency to 1. This structure ensures that finding the minimum frequency is O(1) via the tracked variable, and all updates remain constant time. This approach aligns well with Google's focus on scalable, efficient system design patterns.
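A compact sketch of the answer above, under the assumption that plain Python is acceptable: it uses `OrderedDict` buckets so each frequency list gets LRU ordering for free, plus the tracked `min_freq` variable. All names (`LFUCache`, `key_to_val`, `freq_to_keys`, `_touch`) are my own.

```python
from collections import defaultdict, OrderedDict

class LFUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.key_to_val = {}                      # key -> value
        self.key_to_freq = {}                     # key -> current frequency
        # frequency -> OrderedDict of keys; insertion order gives LRU
        # tie-breaking within each bucket.
        self.freq_to_keys = defaultdict(OrderedDict)
        self.min_freq = 0

    def _touch(self, key):
        """Move key from its current bucket to the next higher one."""
        freq = self.key_to_freq[key]
        del self.freq_to_keys[freq][key]
        if not self.freq_to_keys[freq]:
            del self.freq_to_keys[freq]
            if self.min_freq == freq:             # bucket emptied: bump tracker
                self.min_freq += 1
        self.key_to_freq[key] = freq + 1
        self.freq_to_keys[freq + 1][key] = None

    def get(self, key):
        if key not in self.key_to_val:
            return -1
        self._touch(key)
        return self.key_to_val[key]

    def put(self, key, value):
        if self.capacity <= 0:
            return
        if key in self.key_to_val:                # update counts as an access
            self.key_to_val[key] = value
            self._touch(key)
            return
        if len(self.key_to_val) >= self.capacity:
            # Evict the LRU key from the minimum-frequency bucket.
            evicted, _ = self.freq_to_keys[self.min_freq].popitem(last=False)
            if not self.freq_to_keys[self.min_freq]:
                del self.freq_to_keys[self.min_freq]
            del self.key_to_val[evicted]
            del self.key_to_freq[evicted]
        self.key_to_val[key] = value
        self.key_to_freq[key] = 1
        self.freq_to_keys[1][key] = None
        self.min_freq = 1                         # a brand-new key has freq 1
```

Note how `min_freq` is only ever incremented when its bucket empties during `_touch`, and reset to 1 on every new insertion; forgetting either step is the classic bug in this problem.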

Common Mistakes to Avoid

  • Attempting to use a simple priority queue or heap, which results in O(log N) complexity instead of O(1)
  • Forgetting to update the minimum frequency tracker when a bucket becomes empty after an eviction
  • Neglecting to implement the LRU ordering within frequency buckets, causing incorrect eviction among tied items
  • Overcomplicating the code by not separating the key-to-node mapping from the frequency-to-list mapping

