Design a URL Shortener

System Design
Hard
Meta

Design a service like TinyURL. Discuss requirements, API design, database schema, and handling high traffic.

Why Interviewers Ask This

Meta asks this to evaluate your ability to architect scalable, distributed systems under constraints. They specifically test your understanding of hash collision handling, database sharding strategies for massive read-heavy workloads, and how to balance latency against consistency when billions of requests hit the service daily.

How to Answer This Question

1. Clarify Requirements: Define scope ('shorten URL' vs 'redirect'), estimate QPS (billions of redirects per day works out to roughly 100K reads/sec at peak), and storage needs for Meta's scale.
2. API Design: Propose REST endpoints like POST /shorten and GET /{code}, with status codes (e.g., 201 on creation, 301/302 for redirects, 404 for unknown codes).
3. Core Algorithm: Discuss generating unique IDs via Base62 encoding, applied either to a random string or to a counter whose ranges are partitioned across shards.
4. Database Schema: Explain storing key-value pairs in a NoSQL store like Cassandra or DynamoDB, emphasizing partitioning by user ID or hash range.
5. Scalability & Caching: Detail using Redis for hot redirects to reduce DB load and CDN integration for global low-latency access.
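
The Base62 step above can be sketched as follows. The alphabet ordering and the helper names (`encode_base62`, `decode_base62`) are illustrative assumptions, not a prescribed implementation:

```python
# Minimal sketch of Base62-encoding a counter-issued numeric ID.
# Alphabet ordering is an assumption; any fixed 62-char ordering works.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode_base62(n: int) -> str:
    """Convert a non-negative integer ID into a short Base62 code."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))

def decode_base62(code: str) -> int:
    """Inverse mapping, useful for looking up the row by numeric ID."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

A 7-character code covers 62^7 (about 3.5 trillion) distinct IDs, comfortably more than "billions of URLs."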

Key Points to Cover

  • Explicitly defining non-functional requirements like QPS and latency targets before coding
  • Choosing between random generation vs sequential counters based on collision probability
  • Justifying NoSQL over SQL for horizontal scaling and write throughput
  • Implementing a robust caching layer (Redis/CDN) to protect the backend database
  • Addressing specific Meta-scale challenges like massive data volume and global distribution

Sample Answer

To design a URL shortener for Meta's scale, I'd first clarify requirements: high availability, sub-100ms latency for reads, and support for billions of URLs. The API would expose two endpoints: POST /shorten to create mappings and GET /{code} to redirect.

For the core logic, simple hashing risks collisions at this scale. Instead, I'd allocate each shard a disjoint range from a global counter and Base62-encode the resulting numeric IDs, which keeps codes short, unique, and URL-safe.

For persistence, a relational database won't suffice given the write throughput. I'd propose a wide-column store like Cassandra, sharded by the generated short code or user ID to distribute load evenly.

Given that roughly 90% of traffic is likely reads for popular links, a multi-tier caching strategy is critical. I'd place a Redis cluster in front of the database to cache frequent redirects, invalidating entries only on updates. Finally, to handle Meta's global traffic, we'd deploy this behind a CDN with edge caching to serve redirects from the locations closest to each user, keeping latency minimal regardless of origin.
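
The read path in this answer (a cache checked first, with the database populating it on a miss) can be sketched as a cache-aside lookup. The in-memory dicts stand in for Redis and Cassandra, and all names are illustrative:

```python
# Hedged sketch of the GET /{code} read path using cache-aside.
# Plain dicts stand in for a Redis cluster and a Cassandra table.
from typing import Optional

cache: dict[str, str] = {}
database: dict[str, str] = {"abc123": "https://example.com/long/path"}

def resolve(code: str) -> Optional[str]:
    """Return the long URL for a short code, filling the cache on a miss."""
    url = cache.get(code)
    if url is not None:
        return url                # cache hit: no DB round-trip
    url = database.get(code)
    if url is None:
        return None               # unknown code -> HTTP 404
    cache[code] = url             # in production: SET with a TTL
    return url
```

In a real deployment the cache write would carry a TTL so expired or deleted links age out, and updates to a mapping would delete the cached entry to avoid serving stale redirects.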

Common Mistakes to Avoid

  • Focusing too much on the algorithm while ignoring database sharding and partitioning strategies
  • Assuming a single database instance can handle billions of rows without discussing scalability
  • Neglecting the impact of caching on consistency and TTL management for expired links
  • Forgetting to discuss error handling scenarios like duplicate URLs or code collisions
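
Addressing the last mistake above, the write path can treat duplicate long URLs idempotently and sidestep collisions entirely by issuing IDs from a counter. This is a minimal in-memory sketch; the reverse-index design and all names are assumptions:

```python
# Sketch of idempotent shortening: the same long URL returns the
# existing code instead of minting a new one. Dicts stand in for the
# main table and a secondary (reverse) index.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

code_to_url: dict[str, str] = {}
url_to_code: dict[str, str] = {}  # reverse index, e.g. a secondary table
_counter = 0                       # counter-issued IDs cannot collide

def _encode(n: int) -> str:
    out = []
    while True:
        n, rem = divmod(n, 62)
        out.append(ALPHABET[rem])
        if n == 0:
            return "".join(reversed(out))

def shorten(long_url: str) -> str:
    """Return the code for long_url, reusing an existing mapping."""
    global _counter
    existing = url_to_code.get(long_url)
    if existing is not None:
        return existing            # duplicate URL: idempotent response
    _counter += 1
    code = _encode(_counter)
    code_to_url[code] = long_url
    url_to_code[long_url] = code
    return code
```

Whether duplicates should be deduplicated at all is itself a requirement to clarify: per-user analytics or expiring links may justify minting a fresh code for the same target URL.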
