Design a URL Shortening Service (TinyURL)
Design a service like TinyURL or Bit.ly. Discuss API endpoints, hashing algorithms (Base62 vs. MD5), collision resolution, and database scalability for read and write capacity.
Why Interviewers Ask This
Google asks this to evaluate your ability to balance trade-offs between simplicity and scalability. They specifically want to see whether you can design a stateless, high-throughput system that handles a heavy write load for shortening while maintaining consistently low read latency for millions of users globally.
How to Answer This Question
1. Clarify requirements: Define scale (QPS), storage limits, and whether URLs are permanent or temporary.
2. Propose the core algorithm: Discuss generating unique IDs using auto-incrementing integers versus hashing, then converting them to Base62 for compactness.
3. Design API endpoints: Specify POST for creation and GET for redirection, with appropriate status codes.
4. Address collision resolution: Explain how to handle hash collisions if using MD5/SHA, or why sequential IDs avoid them entirely.
5. Plan database architecture: Suggest a key-value store like Bigtable or Cassandra for horizontal scaling, with a sharding strategy based on the ID prefix to distribute load evenly across nodes.
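The ID-to-Base62 conversion in step 2 is short enough to write out in an interview. A minimal sketch (the alphabet ordering and function names are illustrative choices, not a standard):

```python
# 62 symbols: digits, lowercase, uppercase. Ordering is a design choice.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base62_encode(n: int) -> str:
    """Convert a non-negative integer ID to its Base62 short code."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))  # most significant digit first

def base62_decode(code: str) -> int:
    """Invert base62_encode back to the original integer ID."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

Because encode and decode are exact inverses, the short code itself can serve as the database key with no separate lookup table for the ID.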
Key Points to Cover
- Prioritizing sequential ID generation over hashing to eliminate collision handling complexity
- Explaining the math behind Base62 encoding: a 7-character code covers 62^7 ≈ 3.5 trillion URLs, so codes stay short at scale
- Recommending specific Google-scale technologies like Bigtable or Spanner for the backend
- Demonstrating awareness of read-heavy workloads and proposing CDN caching strategies
- Clearly defining API contracts with appropriate HTTP methods and status codes
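To make the last point concrete, the two-endpoint contract can be sketched framework-agnostically, with handlers returning a status code and payload. The handler names, the in-memory store, and the payload shapes are assumptions for illustration, not a prescribed API:

```python
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

def _encode(n: int) -> str:
    """Tiny Base62 helper for the sketch."""
    s = ""
    while True:
        n, r = divmod(n, 62)
        s = ALPHABET[r] + s
        if n == 0:
            return s

_store: dict[str, str] = {}  # short code -> long URL (stand-in for the DB)
_counter = 0                 # stand-in for a distributed ID generator

def create_short_url(long_url: str) -> tuple[int, dict]:
    """POST /shorten: returns 201 Created with the new short code."""
    global _counter
    _counter += 1
    code = _encode(_counter)
    _store[code] = long_url
    return 201, {"code": code}

def redirect(code: str) -> tuple[int, dict]:
    """GET /{code}: 301 with a Location on a hit, 404 otherwise."""
    url = _store.get(code)
    if url is None:
        return 404, {"error": "not found"}
    return 301, {"Location": url}
```

Whether to answer with 301 (permanent, aggressively cached by browsers, so the origin sees fewer hits but loses click analytics) or 302 is itself a trade-off worth raising in the interview.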
Sample Answer
To design a URL shortener like TinyURL at this scale, I'd first clarify that we need to support billions of daily requests with minimal latency. For the core logic, I recommend avoiding hashing algorithms like MD5 because of collision risk. Instead, I propose a distributed auto-incrementing counter that generates a unique integer for every new URL, which we then convert into a Base62 string using the characters A-Z, a-z, and 0-9. This yields a short, human-readable code without the computational overhead of cryptographic hashing and with no collisions to resolve.

The API needs two endpoints: a POST request to create the mapping and a GET request that redirects to the original URL.

For storage, since reads vastly outnumber writes, I'd use a distributed key-value store like Bigtable, sharding the data on the first few characters of the short code to distribute load evenly and prevent hotspots. To serve global traffic, we'd deploy to multiple regions with read replicas close to users, and cache frequently accessed links at the edge via CDNs to reduce database load significantly.
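One way to implement the prefix-based shard routing described above is to hash the leading characters of the short code rather than using them raw, which keeps the distribution even even when IDs are assigned sequentially. The shard count and function name here are assumptions for the sketch:

```python
import hashlib

NUM_SHARDS = 64  # assumed cluster size for the sketch

def shard_for(short_code: str) -> int:
    """Map a short code's leading characters to a shard index.

    Hashing the prefix (instead of using it directly) spreads
    sequentially assigned codes across all shards, so a burst of
    new URLs does not land on a single hot node.
    """
    prefix = short_code[:2].encode()
    digest = hashlib.sha256(prefix).digest()
    return int.from_bytes(digest[:4], "big") % NUM_SHARDS
```

The routing is deterministic, so any stateless frontend can compute the target shard locally without a lookup service.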
Common Mistakes to Avoid
- Focusing too heavily on cryptographic security features that are unnecessary for simple URL shortening
- Ignoring the need for horizontal scaling and suggesting a single monolithic database
- Forgetting to explain how the random string is generated without collisions
- Overlooking the importance of handling expired or deleted links in the schema design
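The third mistake above is easy to preempt: if you do choose hash-based generation instead of sequential IDs, show an explicit collision-handling loop. A minimal sketch, assuming an in-memory dict as a stand-in for the database and illustrative names throughout:

```python
import hashlib
import secrets

_store: dict[str, str] = {}  # short code -> long URL (stand-in for the DB)

def shorten_with_hash(long_url: str, length: int = 7, max_retries: int = 5) -> str:
    """Hash-based code generation with explicit collision handling.

    On a collision that maps to a *different* URL, add a random salt
    and retry; a collision with the *same* URL is returned as-is,
    which also makes repeated shortens idempotent.
    """
    salt = b""
    for _ in range(max_retries):
        digest = hashlib.md5(long_url.encode() + salt).hexdigest()
        code = digest[:length]
        existing = _store.get(code)
        if existing is None:
            _store[code] = long_url
            return code
        if existing == long_url:
            return code
        salt = secrets.token_bytes(4)  # collision: perturb input, retry
    raise RuntimeError("could not find a collision-free code")
```

In a real deployment the check-and-insert must be a single atomic operation (e.g. a conditional put), otherwise two concurrent writers can race on the same code.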