Design a System for Live Stream Polling/Reactions
Design a highly scalable, low-latency service to collect live reactions (emojis) and simple poll votes during a live video stream. Focus on high write throughput.
Why Interviewers Ask This
Meta interviewers ask this to evaluate your ability to architect high-throughput, low-latency systems under extreme write pressure. They specifically want to see how you handle real-time data ingestion, manage global distribution for live events, and ensure data consistency without blocking the video stream experience.
How to Answer This Question
1. Clarify requirements immediately: define scale (e.g., concurrent viewers), latency targets (sub-200ms), and data types (simple counts vs. complex reactions).
2. Estimate capacity using back-of-the-envelope math to justify component choices.
3. Design the ingestion layer first: propose a load balancer feeding into a specialized message queue like Kafka or Kinesis to buffer spikes.
4. Address aggregation strategies: explain how to use in-memory stores like Redis with TTLs for real-time dashboards versus writing to durable storage like DynamoDB for history.
5. Discuss scaling and failure modes: detail sharding strategies by stream ID and how to handle region failures while maintaining low latency for users globally.
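The back-of-the-envelope estimate in step 2 can be sketched in a few lines. All numbers here are illustrative assumptions, not figures from the question:

```python
# Back-of-the-envelope capacity estimate (all inputs are assumptions).
viewers = 10_000_000             # assumed concurrent viewers
reacting_fraction = 0.2          # assumed share of viewers reacting per minute

reactions_per_minute = viewers * reacting_fraction
avg_writes_per_sec = reactions_per_minute / 60

peak_multiplier = 10             # assumed spike during viral moments
peak_writes_per_sec = avg_writes_per_sec * peak_multiplier

payload_bytes = 100              # stream ID, user ID, emoji code, timestamp
peak_ingress_mb_per_sec = peak_writes_per_sec * payload_bytes / 1_000_000

print(f"average writes/sec: {avg_writes_per_sec:,.0f}")   # ~33,333
print(f"peak writes/sec:    {peak_writes_per_sec:,.0f}")  # ~333,333
print(f"peak ingress:       {peak_ingress_mb_per_sec:,.1f} MB/s")
```

Even rough numbers like these justify a buffering queue and edge aggregation: hundreds of thousands of writes per second is far beyond what a single database should absorb directly.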
Key Points to Cover
- Prioritizing write throughput over read consistency for real-time features
- Using a message queue to decouple ingestion from processing logic
- Implementing edge caching to reduce latency and origin load
- Designing for horizontal scalability through key-based sharding
- Distinguishing between transient real-time views and durable historical storage
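The first two points, prioritizing write throughput and decoupling ingestion via batching, can be sketched as an edge-side aggregator. This is a minimal illustration (class and method names are invented for this sketch), not a production implementation:

```python
import time
from collections import defaultdict

class EdgeAggregator:
    """Buffers reaction counts at an edge node and flushes one batched
    message per window, so the broker sees O(windows) writes per stream
    instead of O(clicks)."""

    def __init__(self, flush_interval_s: float = 0.5):
        self.flush_interval_s = flush_interval_s
        self.counts = defaultdict(int)        # (stream_id, emoji) -> count
        self.last_flush = time.monotonic()

    def record(self, stream_id: str, emoji: str) -> None:
        # Called per reaction click; no network or disk I/O on this path.
        self.counts[(stream_id, emoji)] += 1

    def maybe_flush(self, publish, now=None) -> bool:
        """Call periodically; publishes the aggregated batch once the
        window has elapsed. `publish` stands in for, e.g., one Kafka
        produce call."""
        now = now if now is not None else time.monotonic()
        if now - self.last_flush < self.flush_interval_s or not self.counts:
            return False
        publish(dict(self.counts))
        self.counts.clear()
        self.last_flush = now
        return True
```

With a 500 ms window, a thousand clicks on one stream collapse into a single downstream write, which is exactly the transient-view/durable-store split the last point describes.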
Sample Answer
To design a scalable live polling system, I would start by defining non-functional requirements: we need sub-200ms end-to-end latency and support for millions of writes per second during viral moments. For ingestion, clients would send reaction requests to a global load balancer, which routes traffic to edge nodes running an aggregation and caching layer. These edge nodes aggregate local counts before pushing batches to a distributed message broker like Kafka. This decouples ingestion from processing, allowing us to absorb massive write spikes without overwhelming the database.

For real-time display, we'd use a pub/sub model where aggregated updates are pushed via WebSockets to connected clients. To handle durability, we'd asynchronously persist these aggregates to a wide-column store like Cassandra, partitioned by StreamID to ensure isolation. If a specific stream goes viral, we can dynamically add more partitions.

We must also consider rate limiting at the edge to prevent abuse. Finally, for polls, since votes are one-time actions rather than repeatable reactions, we can treat them as high-priority writes so results appear instantly. This architecture balances the need for immediate feedback with the durability required for post-stream analytics.
Common Mistakes to Avoid
- Focusing too much on database schema details instead of the data flow pipeline
- Ignoring the difference between handling a small test event versus a viral broadcast
- Proposing synchronous database writes for every single reaction click
- Forgetting to mention how to handle client disconnections or network jitter
- Over-engineering complex consensus algorithms when eventual consistency is sufficient
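One concrete antidote to the abuse and per-click-write pitfalls above is per-client rate limiting at the edge. A minimal token-bucket sketch (the class and its parameters are invented for illustration):

```python
import time

class TokenBucket:
    """Per-client rate limiter for the edge tier: allows short bursts of
    reactions up to `burst`, then throttles to `rate_per_sec`."""

    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self, now=None) -> bool:
        now = now if now is not None else time.monotonic()
        # Refill tokens for elapsed time, capped at bucket capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Rejected requests can simply be dropped at the edge; for reactions, losing an over-quota click is acceptable under the eventual-consistency model, whereas poll votes would go through a stricter path.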
Related Interview Questions
- Design a CDN Edge Caching Strategy (Medium, Amazon)
- Design a System for Monitoring Service Health (Medium, Salesforce)
- Design a Payment Processing System (Hard, Uber)
- Design a System for Real-Time Fleet Management (Hard, Uber)
- Find K Closest Elements (Heaps) (Medium, Meta)
- Should Meta launch a paid, ad-free version of Instagram? (Hard, Meta)