Design a Word Filter (Prefix/Suffix Search)

Data Structures
Hard
Uber
24.4K views

Design a data structure that supports searching for a word with a given `prefix` and `suffix` simultaneously. Consider using two Tries or a single specialized Trie.

Why Interviewers Ask This

Uber asks this to evaluate a candidate's ability to optimize for both time and space complexity in real-world search scenarios. They specifically test if you can design a data structure that handles bidirectional constraints efficiently, moving beyond basic linear scans to advanced tree manipulations or suffix-prefix combinations.

How to Answer This Question

1. Clarify requirements immediately: Ask about case sensitivity, duplicate words, and expected query volume versus insertion frequency to determine if optimization is needed. 2. Propose the brute-force approach first to establish a baseline, explaining its O(N) lookup cost per word to show your analytical foundation. 3. Introduce the optimized solution using two separate Tries (one for prefixes, one for suffixes) or a single Trie with combined keys, discussing the trade-offs in memory usage. 4. Walk through the algorithm step-by-step, detailing how you traverse the prefix trie and suffix trie simultaneously to find the intersection of valid indices. 5. Analyze the final Big-O notation for both build and search operations, explicitly mentioning how this scales for Uber's high-volume geospatial or routing data.

Key Points to Cover

  • Demonstrating the ability to balance Time vs Space complexity trade-offs effectively
  • Proposing a solution that achieves O(L) query time rather than O(N)
  • Clearly articulating the difference between a brute-force approach and an optimized Trie structure
  • Handling edge cases like overlapping indices and duplicate words in the dataset
  • Connecting the technical solution to real-world scalability needs typical of Uber's platform

Sample Answer

To solve the Word Filter problem efficiently, I would first clarify if we need exact matches or partial overlaps, assuming standard exact matching for now. A naive approach checking every word against both prefix and suffix takes O(N * L) per query, which is too slow for large datasets like Uber's ride history. Instead, I propose building two specialized Tries: one storing all words reversed for suffix lookups, and another storing words normally for prefix lookups. However, since we need the same word index for both, a more elegant solution involves a single Trie where each node stores a list of word indices that pass through it. When inserting 'apple', we add 'apple' to the prefix path and 'elppa' to the suffix path, tagging each node with the index. For a query like prefix='ap', suffix='le', we traverse the prefix trie to find nodes containing 'ap', then the suffix trie for 'le', and return the intersection of their stored index lists. This reduces search time to O(L) where L is word length, independent of the total number of words. While this increases space complexity due to redundant storage, it drastically improves query performance, which aligns with Uber's need for low-latency responses in real-time navigation and search features.

Common Mistakes to Avoid

  • Focusing only on the prefix search and ignoring the simultaneous suffix constraint requirement
  • Implementing a brute-force solution without attempting to optimize for large-scale data
  • Failing to explain the space complexity implications of duplicating data across multiple structures
  • Not clarifying whether the input list contains duplicates before designing the index storage mechanism

Practice This Question with AI

Answer this question orally or via text and get instant AI-powered feedback on your response quality, structure, and delivery.

Start Practicing

Related Interview Questions

Browse all 154 Data Structures questionsBrowse all 57 Uber questions