Design an Image Moderation Service (NSFW Detection)

System Design
Hard
Meta

Design a system that uses machine learning models to automatically detect and flag inappropriate images/videos upon upload. Focus on asynchronous processing and human review queues.

Why Interviewers Ask This

Interviewers at Meta ask this to evaluate your ability to balance high-scale system reliability with critical safety requirements. They specifically assess how you handle asynchronous processing for media-heavy workloads, manage trade-offs between model latency and accuracy, and design robust human-in-the-loop review queues for edge cases that automated systems miss.

How to Answer This Question

1. Clarify Scope: Immediately define constraints such as supported image resolutions, expected throughput (e.g., millions of uploads per day), and the precise definition of 'inappropriate' content.
2. High-Level Architecture: Propose a decoupled architecture in which an object storage layer (like S3) feeds an event queue (Kafka) to absorb spikes in upload traffic.
3. Asynchronous Processing Pipeline: Detail the flow in which workers pull messages from the queue, invoke lightweight pre-filters followed by heavier ML models, and write results back to the database.
4. Human Review Integration: Design a fallback mechanism in which low-confidence predictions or sensitive categories open a ticket in a human moderation dashboard with priority queuing.
5. Scaling & Optimization: Discuss horizontal scaling of worker nodes, model versioning, and caching of repeated content to reduce latency while upholding strict data privacy standards.
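The pipeline and review steps above can be sketched in a single worker function. This is a minimal illustration, not a production implementation: the prefilter, the model stub, the 80% threshold, and names like `moderate` and `nsfw-v3` are all assumptions for the sketch; a real worker would call an inference endpoint and persist results.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

CONFIDENCE_THRESHOLD = 0.80  # below this, route to human review

@dataclass
class ModerationResult:
    image_id: str
    decision: str        # "approved", "rejected", or "human_review"
    confidence: float
    model_version: str   # stored for auditability and retraining

def heuristic_prefilter(image_bytes: bytes) -> Optional[str]:
    """Cheap first pass: catch trivially bad files before invoking the ML model."""
    if len(image_bytes) == 0:
        return "rejected"      # empty/corrupt upload
    return None                # inconclusive; fall through to the model

def run_model(image_bytes: bytes) -> Tuple[str, float]:
    """Stand-in for the NSFW classifier; returns (label, confidence)."""
    # Hypothetical stub so the sketch is runnable; real code calls inference.
    return ("nsfw", 0.65) if image_bytes.startswith(b"X") else ("safe", 0.99)

def moderate(image_id: str, image_bytes: bytes) -> ModerationResult:
    """One message pulled from the queue: prefilter, model, then routing."""
    verdict = heuristic_prefilter(image_bytes)
    if verdict is not None:
        return ModerationResult(image_id, verdict, 1.0, "prefilter-v1")
    label, confidence = run_model(image_bytes)
    if confidence < CONFIDENCE_THRESHOLD:
        decision = "human_review"   # uncertain: enqueue for moderators
    else:
        decision = "rejected" if label == "nsfw" else "approved"
    return ModerationResult(image_id, decision, confidence, "nsfw-v3")
```

Note that the routing decision and the model version are returned together, so the audit trail (why an image was flagged, by which model) falls out of the same code path.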

Key Points to Cover

  • Explicitly separating ingestion, processing, and review layers to prevent bottlenecks
  • Using a message queue to decouple upload traffic from compute-intensive ML inference
  • Implementing confidence-threshold logic to route uncertain cases to human reviewers
  • Addressing the need for horizontal scaling of worker nodes to handle variable load
  • Considering data privacy and model retraining pipelines as part of the lifecycle

Sample Answer

To design a scalable Image Moderation Service, I would start by defining the core requirement: processing millions of uploads asynchronously without blocking the user experience. The system begins when a user uploads an image to our object storage layer. This triggers an event sent to a high-throughput message broker such as Kafka, ensuring we can absorb traffic spikes during viral events.

A fleet of stateless worker services consumes these messages. We implement a tiered detection strategy: first, a fast heuristic filter removes obviously safe or obviously malicious files; then a deep learning model analyzes the remaining images for NSFW content. If the model's confidence is below a threshold, say 80%, the image is routed to a human review queue rather than being auto-approved or auto-rejected. This preserves safety without letting false positives erode user trust.

We need a dedicated database schema to track image status, the model version used, and the reviewer ID. To handle scale, workers auto-scale based on queue depth, and we use Redis to cache recent image hashes to avoid redundant processing. Finally, the human review interface must prioritize urgent content so moderators can act quickly on flagged items while the system continues to drain the backlog efficiently.
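The hash-cache idea in the answer above can be made concrete with a small sketch. The in-memory dict here is a stand-in for Redis (so the example is self-contained), and the `ModerationCache` class name and SHA-256 keying are assumptions; a production system would more likely use a perceptual hash to also catch near-duplicates.

```python
import hashlib
from typing import Dict, Optional

class ModerationCache:
    """Caches recent moderation verdicts keyed by image content hash,
    so re-uploads of an identical image skip the expensive ML pass.
    In-memory stand-in for the Redis layer described in the answer."""

    def __init__(self) -> None:
        self._verdicts: Dict[str, str] = {}

    @staticmethod
    def _key(image_bytes: bytes) -> str:
        # Exact-duplicate detection only; a perceptual hash (e.g. pHash)
        # would be needed to catch resized or re-encoded copies.
        return hashlib.sha256(image_bytes).hexdigest()

    def lookup(self, image_bytes: bytes) -> Optional[str]:
        """Return a cached verdict, or None on a cache miss."""
        return self._verdicts.get(self._key(image_bytes))

    def store(self, image_bytes: bytes, verdict: str) -> None:
        self._verdicts[self._key(image_bytes)] = verdict
```

A worker would call `lookup` before invoking the model and `store` after a verdict is reached; with Redis, the `store` would also carry a TTL so stale verdicts age out after model updates.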

Common Mistakes to Avoid

  • Proposing synchronous processing, which would impose unacceptable latency on users uploading images
  • Ignoring the human review component entirely, assuming AI can achieve 100% accuracy
  • Failing to address how the system handles sudden traffic spikes or bursty upload patterns
  • Overlooking the importance of storing metadata about why an image was flagged for audit trails
