Design a Push Notification Delivery System
Detail the architecture for delivering push notifications to mobile devices (iOS APNS and Android FCM). Focus on service registration, token management, and throttling.
Why Interviewers Ask This
Interviewers at Uber ask this to evaluate your ability to design high-throughput, reliable distributed systems that handle external dependencies like APNS and FCM. They specifically want to see how you manage token lifecycle, prevent notification spam through throttling, and ensure delivery guarantees in a multi-tenant environment where reliability directly impacts user engagement.
How to Answer This Question
1. Clarify requirements: define scale (e.g., millions of daily active users), latency goals, and delivery semantics (fire-and-forget vs. guaranteed).
2. Propose a high-level architecture: an API Gateway, a Notification Service, and a message queue such as Kafka or RabbitMQ to buffer traffic spikes.
3. Detail the registration flow: explain how client tokens are stored securely in a database indexed by user ID and device type.
4. Address throttling explicitly: describe rate limiting per user or region so that your service is not throttled or blocked by upstream providers such as FCM.
5. Discuss error handling: retry transient failures with exponential backoff so that notifications are eventually delivered without flooding the system.
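To make the registration flow concrete, here is a minimal sketch of a token store keyed by user ID and platform. It is an in-memory stand-in for a real database table, and the names `DeviceToken`, `TokenRegistry`, and `prune_stale` are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class DeviceToken:
    user_id: str
    platform: str     # "ios" (APNS) or "android" (FCM)
    token: str
    last_seen: float  # Unix timestamp of the most recent registration


class TokenRegistry:
    """In-memory stand-in for a token table indexed by (user_id, platform)."""

    def __init__(self) -> None:
        self._tokens: Dict[Tuple[str, str], Dict[str, DeviceToken]] = {}

    def register(self, user_id: str, platform: str, token: str, now: float) -> None:
        # Upsert: re-registering on app launch only refreshes last_seen.
        bucket = self._tokens.setdefault((user_id, platform), {})
        bucket[token] = DeviceToken(user_id, platform, token, now)

    def tokens_for(self, user_id: str) -> List[DeviceToken]:
        # Collect the user's tokens across all platforms.
        found: List[DeviceToken] = []
        for (uid, _platform), bucket in self._tokens.items():
            if uid == user_id:
                found.extend(bucket.values())
        return found

    def invalidate(self, user_id: str, platform: str, token: str) -> bool:
        # Returns True if a token was actually removed.
        bucket = self._tokens.get((user_id, platform), {})
        return bucket.pop(token, None) is not None

    def prune_stale(self, now: float, max_age_seconds: float) -> int:
        # Remove tokens not seen within max_age_seconds; return how many were dropped.
        removed = 0
        for bucket in self._tokens.values():
            stale = [t for t, rec in bucket.items() if now - rec.last_seen > max_age_seconds]
            for t in stale:
                del bucket[t]
                removed += 1
        return removed
```

In an interview, the point worth stating is that registration is an upsert: the same token arriving again updates its timestamp rather than creating a duplicate row.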
Key Points to Cover
- Explicitly define the difference between fire-and-forget versus guaranteed delivery semantics
- Demonstrate knowledge of specific provider constraints, such as the roughly 4 KB payload limit that both FCM and APNS place on standard messages
- Describe a concrete strategy for token invalidation and refresh cycles
- Explain how a sliding window rate limiter keeps send volume within provider quotas so the service is not throttled or blocked
- Outline a retry mechanism with exponential backoff to handle transient network failures
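The retry point above can be sketched as follows. This is one possible shape, assuming a "full jitter" variant of exponential backoff; `TransientError`, `backoff_delay`, and `send_with_retry` are hypothetical names, and the default base delay, cap, and attempt count are illustrative values rather than provider recommendations.

```python
import random
import time
from typing import Callable, Optional


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 5xx responses)."""


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Upper bound on the wait before retrying after failed attempt `attempt` (0-based)."""
    return min(cap, base * (2 ** attempt))


def send_with_retry(send: Callable[[], None],
                    max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep,
                    rng: Optional[random.Random] = None) -> int:
    """Call send() until it succeeds and return the number of attempts used.

    Re-raises TransientError once max_attempts is exhausted.
    """
    rng = rng or random.Random()
    for attempt in range(max_attempts):
        try:
            send()
            return attempt + 1
        except TransientError:
            if attempt == max_attempts - 1:
                raise
            # Full jitter: wait a random time up to the exponential bound,
            # so many failing workers do not retry in lockstep.
            sleep(rng.uniform(0, backoff_delay(attempt)))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the function testable without real waiting. A production system would more likely re-enqueue the message with a delay than block a worker thread.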
Sample Answer
To design a push notification system for a platform like Uber, I would start by establishing clear SLAs, aiming for sub-second latency for ride updates while ensuring 99.9% delivery reliability. The core architecture would involve a Notification Service acting as an orchestrator. When a driver accepts a ride, the service publishes an event to a high-throughput message queue. Consumers then pick up these events, fetch the associated user's device tokens from a low-latency cache like Redis, and route each notification to the correct provider: APNS for iOS or FCM for Android.
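The consumer step described above might look roughly like this. It is a sketch under stated assumptions: `dispatch`, `lookup_tokens`, and the `senders` mapping are hypothetical, the event is assumed to be a dict with `user_id` and `payload` keys, and the sender callables stand in for real APNS and FCM clients.

```python
from typing import Callable, Dict, List, Tuple

# A sender takes (token, payload) and returns True on success.
# Real senders would wrap the APNS and FCM client libraries.
Sender = Callable[[str, dict], bool]


def dispatch(event: dict,
             lookup_tokens: Callable[[str], List[Tuple[str, str]]],
             senders: Dict[str, Sender]) -> Dict[str, int]:
    """Fan one queued event out to every device registered for its user.

    lookup_tokens returns (platform, token) pairs, e.g. from a cache.
    """
    counts = {"sent": 0, "failed": 0, "skipped": 0}
    for platform, token in lookup_tokens(event["user_id"]):
        sender = senders.get(platform)
        if sender is None:
            # Unknown platform: no provider is configured for it.
            counts["skipped"] += 1
            continue
        ok = sender(token, event["payload"])
        counts["sent" if ok else "failed"] += 1
    return counts
```

Keeping the providers behind a common callable signature means the consumer loop does not change when a new platform is added.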
Token management is critical here. We must store tokens with metadata like expiration dates and last-seen timestamps, refreshing them whenever the app launches so that we avoid sending to dead devices. For throttling, we implement a sliding window rate limiter per user. If a user is about to receive too many notifications in a short period, we queue or suppress the subsequent ones to prevent battery drain and app uninstallations. We also need error handling that separates transient from permanent failures: timeouts and provider-side errors are retried with exponential backoff, whereas a 'device not registered' error from FCM means the token is dead, so we invalidate it immediately instead of retrying. Finally, monitoring dashboards should track delivery rates, provider-specific error codes, and queue depths to quickly identify bottlenecks during peak hours like Friday nights.
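As an illustration of the per-user throttling mentioned above, here is a minimal sliding-window-log limiter. It keeps timestamps in process memory for simplicity, whereas a real deployment would need shared state (for example in Redis). The class name `SlidingWindowLimiter` and its parameters are hypothetical.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` events per key within any `window_seconds` span."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window = window_seconds
        self._events: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str, now: float) -> bool:
        events = self._events[key]
        # Drop timestamps that have fallen out of the window ending at `now`.
        while events and events[0] <= now - self.window:
            events.popleft()
        if len(events) >= self.limit:
            return False  # Over the limit: caller can queue or suppress.
        events.append(now)
        return True
```

For example, with `limit=2` and `window_seconds=60.0`, calls for the same user at times 0, 10, and 20 return `True`, `True`, and `False`. A call at time 60 is allowed again because the event at time 0 has left the window.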
Common Mistakes to Avoid
- Ignoring the distinction between iOS and Android provider limitations, leading to unrealistic payload designs
- Failing to address what happens when a device token expires or becomes invalid over time
- Designing a synchronous request-response flow instead of using an asynchronous queue for scalability
- Overlooking rate limiting, without which a traffic spike can overload the system or get it throttled by the providers