Design a URL Shortener
Design a service like TinyURL. Discuss requirements, API design, database schema, and handling high traffic.
Why Interviewers Ask This
Meta asks this to evaluate your ability to architect scalable, distributed systems under constraints. They specifically test your understanding of hash collision handling, database sharding strategies for massive read-heavy workloads, and how to balance latency against consistency when billions of requests hit the service daily.
How to Answer This Question
1. Clarify Requirements: Define scope ('shorten URL' vs. 'redirect', custom aliases, expiration), estimate read and write QPS (e.g., 10 million reads/sec at peak), and project storage needs for Meta's scale.
2. API Design: Propose REST endpoints such as POST /shorten and GET /{code}, and specify status codes (201 on creation, 301/302 for the redirect, 404 for unknown codes).
3. Core Algorithm: Discuss generating unique short codes by Base62-encoding either a random or hashed value (with collision checks) or a counter partitioned across shards.
4. Database Schema: Explain storing the code-to-URL mapping in a NoSQL store like Cassandra or DynamoDB, partitioned by short code or hash range so each redirect lookup stays on a single partition.
5. Scalability & Caching: Detail using Redis for hot redirects to reduce database load, plus CDN edge caching for global low-latency access.
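To make step 3 concrete, here is a minimal Base62 encoder in Python (a sketch; the function name and alphabet ordering are illustrative conventions, not part of the question):

```python
import string

# Base62 alphabet: 0-9, a-z, A-Z (62 URL-safe characters).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

def base62(n: int) -> str:
    """Encode a non-negative integer as a Base62 string.

    At length 7 the key space is 62**7 (~3.5 trillion codes), which is
    why 6-8 character codes are typical at this scale.
    """
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))
```

A counter value handed out by a shard (or a hash truncated to an integer) feeds into this encoder to produce the short code.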
Key Points to Cover
- Explicitly defining non-functional requirements like QPS and latency targets before coding
- Choosing between random generation vs sequential counters based on collision probability
- Justifying NoSQL over SQL for horizontal scaling and write throughput
- Implementing a robust caching layer (Redis/CDN) to protect the backend database
- Addressing specific Meta-scale challenges like massive data volume and global distribution
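As an illustration of the first point, a quick back-of-envelope estimate can anchor the rest of the design (every input below is an assumption made for illustration, not a figure given in the question):

```python
# Back-of-envelope sizing; all inputs are assumed, not given.
new_urls_per_day = 100_000_000        # assumed write volume
read_write_ratio = 100                # assumed read-heavy mix
seconds_per_day = 86_400
bytes_per_record = 500                # short code + long URL + metadata
retention_years = 5

write_qps = new_urls_per_day / seconds_per_day      # ~1,157 writes/sec
read_qps = write_qps * read_write_ratio             # ~115,700 reads/sec
records = new_urls_per_day * 365 * retention_years
storage_tb = records * bytes_per_record / 1e12      # ~91 TB
```

Even rough numbers like these justify the later choices: the read volume motivates caching, and the row count rules out a single database instance.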
Sample Answer
To design a URL shortener for Meta's scale, I'd first clarify requirements: high availability, sub-100ms read latency, and support for billions of URLs. The API would expose two endpoints: POST /shorten to create mappings and GET /{code} to redirect.

For the core logic, simple hashing risks collisions at this scale, so instead I'd give each shard a counter drawing from a disjoint ID range and Base62-encode the resulting integer. This guarantees uniqueness without cross-shard coordination and keeps codes short and URL-safe. For persistence, a single relational database won't sustain the write throughput, so I'd propose a wide-column store like Cassandra, partitioned by the short code to distribute load evenly and keep each redirect lookup on a single partition.

Since reads dominate (on the order of 90% of traffic is redirects for popular links), a multi-tier caching strategy is critical. I'd place a Redis cluster in front of the database to cache frequent redirects, invalidating entries only when a mapping changes or expires. Finally, to serve Meta's global traffic, we'd deploy behind a CDN with edge caching so redirects are answered from locations close to the user, keeping latency low regardless of origin.
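The caching layer described in the answer can be sketched as a cache-aside lookup. This is a minimal sketch in which plain dicts stand in for the real stores; in production `cache` would be a Redis client and `database` a Cassandra/DynamoDB client:

```python
from typing import Optional

# In-memory stand-ins for the real stores (Redis and a wide-column DB).
cache: dict[str, str] = {}
database: dict[str, str] = {}

def redirect(code: str) -> Optional[str]:
    """Resolve a short code to its long URL, cache-aside style."""
    url = cache.get(code)
    if url is not None:
        return url                 # cache hit: no database round trip
    url = database.get(code)
    if url is None:
        return None                # unknown code -> caller returns 404
    cache[code] = url              # populate the cache on a miss
    return url

def update_mapping(code: str, new_url: str) -> None:
    """Update the database, then invalidate the cached entry."""
    database[code] = new_url
    cache.pop(code, None)
```

Invalidating on update (rather than writing the new value into the cache) keeps the cache and database from diverging if concurrent writes race.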
Common Mistakes to Avoid
- Focusing too much on the algorithm while ignoring database sharding and partitioning strategies
- Assuming a single database instance can handle billions of rows without discussing scalability
- Neglecting the impact of caching on consistency and TTL management for expired links
- Forgetting to discuss error handling scenarios like duplicate URLs or code collisions
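On the last point, if random generation is chosen over counters, one defensive pattern is to retry against the store with a bounded number of attempts (a sketch; the dict `store` stands in for the real database's uniqueness check):

```python
import secrets
import string

ALPHABET = string.digits + string.ascii_letters  # 62 URL-safe characters

def new_code(store: dict, length: int = 7, max_attempts: int = 5) -> str:
    """Draw random codes until one is unused.

    With 62**7 (~3.5 trillion) possibilities, collisions are rare while
    the store is sparsely filled, but the retry count is still bounded
    so a pathological case fails loudly instead of looping forever.
    """
    for _ in range(max_attempts):
        code = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if code not in store:
            return code
    raise RuntimeError("repeated collisions: increase the code length")
```

Using `secrets` rather than `random` avoids predictable codes, which matters if short links must not be guessable.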