Design an Image Hosting Service (Instagram/Flickr)
Design a service to store and retrieve billions of images. Focus on file storage (S3/BLOB), image processing/resizing, and using Content Delivery Networks (CDNs).
Why Interviewers Ask This
Interviewers at Meta ask this to evaluate your ability to architect scalable systems handling massive unstructured data. They specifically test your understanding of storage trade-offs, the critical role of CDNs in latency reduction, and how to design efficient image processing pipelines that decouple heavy compute from user requests.
How to Answer This Question
1. Clarify requirements: define scale (billions of images), read/write ratios, and specific features like resizing or filters.
2. High-level architecture: propose a client uploading through an API gateway, which enqueues a job for a worker pool.
3. Storage strategy: recommend object storage like S3 for raw images, emphasizing durability, rather than database storage.
4. Processing pipeline: design an asynchronous workflow in which workers resize images into multiple thumbnail sizes upon upload.
5. Delivery optimization: explain using CDNs, with cache invalidation strategies, to serve global users quickly.
6. Scalability: discuss sharding strategies for the metadata database and auto-scaling worker groups to handle the traffic spikes typical of social media platforms.
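The upload and processing steps above can be sketched in a few lines of Python. This is a minimal, single-process sketch: the in-memory `queue.Queue` stands in for a durable broker like Kafka, the `object_store` dict stands in for S3, and names like `handle_upload` and `resize_worker` are illustrative, not a real API.

```python
import queue
import threading

# Stand-in for a durable message queue (Kafka/SQS in production).
job_queue: queue.Queue = queue.Queue()

# Stand-in for the object store (S3 in production).
object_store: dict[str, bytes] = {}

THUMBNAIL_SIZES = [(64, 64), (256, 256), (1024, 1024)]

def handle_upload(image_id: str, raw_bytes: bytes) -> str:
    """API-layer handler: persist the original, enqueue resize work, return fast."""
    object_store[image_id] = raw_bytes   # would be an S3 PUT in production
    job_queue.put(image_id)              # heavy work is deferred to workers
    return f"accepted:{image_id}"

def resize_worker() -> None:
    """Background worker: consume jobs and write derived thumbnails."""
    while True:
        image_id = job_queue.get()
        if image_id is None:  # shutdown sentinel
            break
        for w, h in THUMBNAIL_SIZES:
            # Real resizing would use an image library (e.g. Pillow);
            # a placeholder payload is stored here to show the keying scheme.
            object_store[f"{image_id}_{w}x{h}"] = b"<resized>"
        job_queue.task_done()

worker = threading.Thread(target=resize_worker, daemon=True)
worker.start()

handle_upload("img123", b"<jpeg bytes>")
job_queue.join()  # demo only: wait for the async pipeline to drain
```

The point of the sketch is the decoupling: `handle_upload` returns as soon as the job is enqueued, and the thumbnails appear in the store asynchronously.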
Key Points to Cover
- Explicitly separating raw image storage (Object Store) from metadata storage (Database)
- Designing an asynchronous processing pipeline using message queues to decouple uploads from resizing
- Justifying the use of CDNs for reducing latency in a globally distributed user base
- Addressing scalability through database sharding strategies based on user IDs
- Discussing cost and performance trade-offs between different image formats and compression levels
Sample Answer
To design an Instagram-like service, we first clarify that while we store billions of images, reads vastly outnumber writes, and we need low-latency retrieval globally. For storage, we should never put raw images in a SQL database. Instead, we use distributed object storage such as AWS S3 or Meta's own Haystack. Metadata (user ID, URL, timestamp) lives in a sharded NoSQL database like Cassandra or DynamoDB.

When a user uploads an image, the request hits our API gateway, which pushes a job to a message queue such as Kafka. A pool of worker services consumes these jobs and generates the various thumbnail sizes asynchronously, so the user never waits on heavy processing. The processed images are then written back to S3.

To ensure fast delivery, we place a CDN like CloudFront in front of S3; it caches the static image assets at edge locations worldwide. Crucially, we must implement cache invalidation logic for cases such as a user updating their profile picture. Finally, to handle Meta-scale traffic, we shard the metadata database by user ID to distribute load evenly across nodes, ensuring no single hotspot becomes a bottleneck during viral events.
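The user-ID sharding mentioned in the sample answer can be sketched as a stable hash. The shard count and function name here are illustrative; real deployments typically use consistent hashing (or a directory service) so that adding shards does not remap most keys.

```python
import hashlib

NUM_SHARDS = 16  # illustrative; production counts are chosen with headroom

def shard_for_user(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a metadata shard with a stable hash.

    A cryptographic hash is used instead of Python's built-in hash(),
    which is salted per process, so the mapping stays stable across
    restarts and across machines.
    """
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# All of a user's photo metadata lands on one shard, so profile and
# feed queries for that user touch a single node.
shard = shard_for_user("user_42")
assert 0 <= shard < NUM_SHARDS
```

Because the function is deterministic, every service instance routes a given user to the same shard without any coordination.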
Common Mistakes to Avoid
- Storing binary image data directly in a relational database instead of using object storage
- Forgetting to mention Content Delivery Networks, leading to high latency for international users
- Attempting to process all image resizing synchronously, causing poor user experience during uploads
- Ignoring cache invalidation strategies when users update their profile pictures frequently
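For the profile-picture case in the last bullet, a common alternative to explicit CDN invalidation is cache busting with versioned URLs: the URL changes whenever the content changes, so stale edge copies are simply never requested again. A minimal sketch, assuming a hypothetical cdn.example.com domain:

```python
import hashlib

CDN_HOST = "https://cdn.example.com"  # hypothetical CDN domain

def versioned_url(image_path: str, content: bytes) -> str:
    """Embed a short content hash in the URL so updates get a new cache key."""
    version = hashlib.sha256(content).hexdigest()[:12]
    return f"{CDN_HOST}/{image_path}?v={version}"

old = versioned_url("users/42/avatar.jpg", b"old picture bytes")
new = versioned_url("users/42/avatar.jpg", b"new picture bytes")
assert old != new  # the CDN treats these as distinct, independently cached objects
```

The trade-off: versioned URLs let you set very long cache TTLs safely, but they require the metadata database to store the current URL (or version) for each image.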