Design a System for Live Stream Polling/Reactions

System Design
Medium
Meta

Design a highly scalable, low-latency service to collect live reactions (emojis) and simple poll votes during a live video stream. Focus on high write throughput.

Why Interviewers Ask This

Meta interviewers ask this to evaluate your ability to architect high-throughput, low-latency systems under extreme write pressure. They specifically want to see how you handle real-time data ingestion, manage global distribution for live events, and ensure data consistency without blocking the video stream experience.

How to Answer This Question

1. Clarify requirements immediately: define scale (e.g., concurrent viewers), latency targets (sub-200 ms), and data types (simple counts vs. complex reactions).
2. Estimate capacity with back-of-the-envelope math to justify component choices.
3. Design the ingestion layer first: propose a load balancer feeding a specialized message queue like Kafka or Kinesis to buffer spikes.
4. Address aggregation strategies: explain when to use an in-memory store like Redis with TTLs for real-time dashboards versus durable storage like DynamoDB for history.
5. Discuss scaling and failure modes: detail sharding by stream ID and how to handle region failures while maintaining low latency for users globally.
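Step 2 above can be made concrete with a quick estimate. The figures below are illustrative assumptions (the question itself fixes no numbers); the point is to show the arithmetic an interviewer expects.

```python
# Back-of-the-envelope capacity estimate for a live-reaction service.
# All inputs are illustrative assumptions, not figures from the question.

concurrent_viewers = 2_000_000          # assumed peak audience for one stream
reactions_per_viewer_per_min = 6        # assumed average during a hype moment
payload_bytes = 100                     # emoji type + stream ID + timestamp

writes_per_sec = concurrent_viewers * reactions_per_viewer_per_min / 60
ingress_mb_per_sec = writes_per_sec * payload_bytes / 1_000_000

print(f"{writes_per_sec:,.0f} writes/sec")         # 200,000 writes/sec
print(f"{ingress_mb_per_sec:.1f} MB/sec ingress")  # 20.0 MB/sec
```

Numbers like these justify a buffered ingestion tier: 200K writes/sec is far beyond what a single database should absorb synchronously.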

Key Points to Cover

  • Prioritizing write throughput over read consistency for real-time features
  • Using a message queue to decouple ingestion from processing logic
  • Implementing edge caching to reduce latency and origin load
  • Designing for horizontal scalability through key-based sharding
  • Distinguishing between transient real-time views and durable historical storage
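The key-based sharding point above can be sketched in a few lines. This is a minimal illustration, assuming a hash-mod scheme and an arbitrary partition count; real brokers and databases expose their own partitioners.

```python
import hashlib

NUM_PARTITIONS = 64  # assumed partition count; real systems size this from load tests

def partition_for(stream_id: str) -> int:
    """Map a stream ID to a partition so all reactions for one stream
    land on the same shard, keeping per-stream aggregation local."""
    digest = hashlib.md5(stream_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS

# Every event for a given stream routes consistently:
assert partition_for("stream-42") == partition_for("stream-42")
assert 0 <= partition_for("stream-42") < NUM_PARTITIONS
```

Keying by stream ID gives per-stream isolation, at the cost of hot partitions when one stream goes viral; that trade-off is worth raising explicitly in the interview.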

Sample Answer

To design a scalable live polling system, I would start by defining non-functional requirements: we need sub-200 ms end-to-end latency and support for millions of writes per second during viral moments. For ingestion, clients would send reaction requests to a global load balancer, which routes traffic to edge nodes running our caching layer. These edge nodes aggregate local counts before pushing batches to a distributed message broker like Kafka. This decouples ingestion from processing, allowing us to absorb massive write spikes without crashing the database.

For real-time display, we'd use a pub/sub model where aggregated updates are pushed via WebSockets to connected clients. To handle durability, we'd asynchronously persist these aggregates to a wide-column store like Cassandra, partitioned by StreamID to ensure isolation. If a specific stream goes viral, we can dynamically add more partitions.

We must also consider rate limiting at the edge to prevent abuse. Finally, for polls, since votes are often one-time actions, we can treat them as high-priority writes to ensure they appear instantly. This architecture balances the need for immediate feedback with the durability required for post-stream analytics.
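The edge-aggregation idea in the sample answer can be sketched as follows. The class, the flush interval, and `flush_fn` are all illustrative: `flush_fn` stands in for a producer call to a broker such as Kafka, not a real client API.

```python
import threading
from collections import Counter

class EdgeAggregator:
    """Counts reactions locally and flushes batched totals downstream.

    Illustrative sketch: `flush_fn` stands in for a broker producer call;
    a real edge node would also flush on a timer and cap batch size.
    """

    def __init__(self, flush_fn):
        self._counts = Counter()
        self._lock = threading.Lock()
        self._flush_fn = flush_fn

    def record(self, stream_id: str, emoji: str) -> None:
        # O(1) in-memory increment; no database is touched on the hot path.
        with self._lock:
            self._counts[(stream_id, emoji)] += 1

    def flush(self) -> None:
        # Swap the counter out under the lock, then send one batch
        # downstream instead of millions of individual writes.
        with self._lock:
            batch, self._counts = self._counts, Counter()
        if batch:
            self._flush_fn(dict(batch))

# Usage: three fire reactions and one thumbs-up collapse into one batch.
batches = []
agg = EdgeAggregator(batches.append)
for _ in range(3):
    agg.record("stream-42", "fire")
agg.record("stream-42", "thumbs_up")
agg.flush()
# batches[0] == {("stream-42", "fire"): 3, ("stream-42", "thumbs_up"): 1}
```

Batching at the edge is what turns per-click traffic into a write volume the broker and database tiers can absorb.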

Common Mistakes to Avoid

  • Focusing too much on database schema details instead of the data flow pipeline
  • Ignoring the difference between handling a small test event versus a viral broadcast
  • Proposing synchronous database writes for every single reaction click
  • Forgetting to mention how to handle client disconnections or network jitter
  • Over-engineering complex consensus algorithms when eventual consistency is sufficient
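The edge rate limiting mentioned in the sample answer is often probed as a follow-up. A minimal per-client token bucket, with assumed rates and in-process state (production systems would use a shared store or sidecar), might look like:

```python
import time

class TokenBucket:
    """Per-client token bucket for edge rate limiting (illustrative sketch;
    the rate and burst values here are assumptions, not recommendations)."""

    def __init__(self, rate_per_s: float, burst: int):
        self.rate = rate_per_s
        self.capacity = burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill proportionally to elapsed time, capped at bucket capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

limiter = TokenBucket(rate_per_s=5, burst=10)
# The first `burst` calls succeed; sustained traffic is then capped near 5/sec.
assert all(limiter.allow() for _ in range(10))
```

Dropping or coalescing excess clicks at the edge is acceptable here precisely because individual reactions are low-value and only aggregates matter.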
