Design a Distributed Unique ID Generator

Data Structures
Medium
Tesla
136.6K views

Describe the data structure required for a local service instance to generate a block of unique IDs without coordination. Focus on atomic operations and safe distribution.

Why Interviewers Ask This

Interviewers ask this to evaluate your ability to design scalable systems that avoid central bottlenecks. At Tesla, where millions of vehicles generate data simultaneously, they need engineers who understand how to distribute state safely without coordination. This tests your grasp of atomic operations, concurrency safety, and efficient data structure selection for high-throughput environments.

How to Answer This Question

1. Clarify requirements: Ask about uniqueness scope (global vs. local), latency needs, and whether IDs must be sortable or time-ordered. 2. Propose a tiered architecture: Suggest a central service assigning ID 'blocks' (ranges) to local instances rather than generating single IDs on demand. 3. Detail the local data structure: Explain using a simple integer counter initialized with the block start value and end value stored in memory. 4. Describe atomic updates: Emphasize using Compare-and-Swap (CAS) or hardware-level atomic increments to prevent race conditions when multiple threads request IDs from the same block. 5. Discuss replenishment logic: Outline how the instance detects low inventory and requests a new block asynchronously, ensuring continuous operation without blocking.

Key Points to Cover

  • Proposing a block allocation strategy to minimize coordination overhead
  • Selecting a simple integer pair data structure for local range tracking
  • Explicitly mentioning atomic operations like CAS for thread safety
  • Addressing the asynchronous replenishment mechanism for scalability
  • Ensuring non-overlapping ranges to guarantee global uniqueness

Sample Answer

To solve this for a high-scale environment like Tesla's vehicle telemetry, I would avoid a centralized database call for every ID generation. Instead, I propose a block-based allocation strategy. The system uses a lightweight local data structure: a pair of integers representing the current range [start, end] and a cursor pointing to the next available ID within that range. When an instance starts, it fetches a large block, say 10,000 IDs, from a coordination service. Locally, the instance simply increments its cursor. To ensure thread safety across multiple cores, we use atomic increment operations, such as std::atomic_fetch_add in C++ or compare-and-swap loops, guaranteeing no two threads return the same number even under heavy load. If the local cursor reaches the block limit, the instance triggers an asynchronous refill request to the coordinator. This approach minimizes network chatter, reduces latency to microseconds, and ensures global uniqueness because the coordinator strictly assigns non-overlapping ranges. This pattern is critical for systems requiring high throughput, such as managing unique identifiers for autonomous driving events or manufacturing parts.

Common Mistakes to Avoid

  • Suggesting a database sequence or UUID for every request, which creates a performance bottleneck
  • Ignoring multi-threaded access and failing to mention atomic operations or locks
  • Forgetting to handle the edge case where the local ID pool is exhausted
  • Overcomplicating the solution with distributed consensus algorithms like Paxos when simpler range assignment suffices

Practice This Question with AI

Answer this question orally or via text and get instant AI-powered feedback on your response quality, structure, and delivery.

Start Practicing

Related Interview Questions

Browse all 154 Data Structures questionsBrowse all 29 Tesla questions