Design a Global Rate Limiter

System Design
Medium
Stripe

Design a system to throttle user requests globally across distributed servers. Discuss common algorithms (Token Bucket, Leaky Bucket), deployment strategies, and using a centralized store like Redis.

Why Interviewers Ask This

Interviewers at Stripe ask this to evaluate your ability to design resilient distributed systems under strict consistency constraints. They specifically test your understanding of global state management, latency trade-offs in centralized stores like Redis, and your capacity to select appropriate algorithms like Token Bucket for fair throttling across geographically dispersed servers.

How to Answer This Question

1. Clarify Requirements: Define the scope immediately: global vs. regional limits, per-user vs. per-IP, and strict vs. eventual consistency. Mention Stripe's need for financial-grade reliability.
2. Propose a Core Algorithm: Select Token Bucket for its flexibility in handling burst traffic, explaining why it fits payment processing better than Leaky Bucket.
3. Design the Architecture: Sketch a client-side proxy or middleware that communicates with a centralized Redis cluster. Discuss sharding strategies to handle massive scale.
4. Address Consistency and Failover: Explain how to handle clock skew between regions and propose a fallback mechanism in case the central store becomes unavailable.
5. Optimize and Scale: Discuss connection pooling, Redis pipelining, and monitoring metrics to prevent false positives and bottlenecks.
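The Token Bucket choice in step 2 can be made concrete with a short sketch. The class below is a minimal, single-process version (the `TokenBucket` name and the injectable `clock` parameter are illustrative assumptions, not from the source); a global limiter would keep this state in a shared store rather than in process memory.

```python
import time


class TokenBucket:
    """Minimal single-process Token Bucket sketch (illustrative, not production code).

    Tokens refill continuously at `refill_rate` per second up to `capacity`,
    which is what allows controlled bursts up to the bucket size.
    """

    def __init__(self, capacity: float, refill_rate: float, clock=time.monotonic):
        self.capacity = float(capacity)
        self.refill_rate = float(refill_rate)  # tokens added per second
        self.tokens = float(capacity)          # start with a full bucket
        self.clock = clock                     # injectable for testing
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then try to consume `cost` tokens."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

For example, a bucket with capacity 2 and a refill rate of 1 token/second admits two back-to-back requests, rejects a third, and admits another one a second later.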

Key Points to Cover

  • Explicitly choosing Token Bucket over Leaky Bucket for burst tolerance
  • Using Redis Lua scripts to guarantee atomicity during token consumption
  • Addressing the latency-consistency trade-off in multi-region deployments
  • Defining clear failure modes and fallback strategies for store outages
  • Connecting the technical solution to Stripe's specific need for payment reliability
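The atomicity point above can be illustrated with a sketch. The Lua script below is a hypothetical check-and-consume for a token bucket; the key layout (a hash with `tokens` and `ts` fields), the TTL policy, and the `consume` helper are all assumptions, not from the source. The helper assumes a redis-py-style client exposing `eval(script, numkeys, *keys_and_args)`.

```python
# Hypothetical Lua script: refill, check, and consume in one atomic step,
# so concurrent callers cannot race between the read and the write.
TOKEN_BUCKET_LUA = """
local capacity    = tonumber(ARGV[1])
local refill_rate = tonumber(ARGV[2])  -- tokens per second
local now         = tonumber(ARGV[3])  -- caller-supplied timestamp (seconds)
local cost        = tonumber(ARGV[4])
local state  = redis.call('HMGET', KEYS[1], 'tokens', 'ts')
local tokens = tonumber(state[1]) or capacity
local ts     = tonumber(state[2]) or now
tokens = math.min(capacity, tokens + (now - ts) * refill_rate)
local allowed = 0
if tokens >= cost then
    tokens = tokens - cost
    allowed = 1
end
redis.call('HSET', KEYS[1], 'tokens', tokens, 'ts', now)
-- Expire idle buckets so abandoned keys do not accumulate.
redis.call('EXPIRE', KEYS[1], math.ceil(capacity / refill_rate) * 2)
return allowed
"""


def consume(client, key, capacity, refill_rate, now, cost=1):
    """Return True if the request is allowed.

    `client` is assumed to expose a redis-py-style
    eval(script, numkeys, *keys_and_args) method.
    """
    return client.eval(TOKEN_BUCKET_LUA, 1, key, capacity, refill_rate, now, cost) == 1
```

Because Redis executes the whole script atomically, no lock is needed even when many application servers hit the same bucket key concurrently.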

Sample Answer

To design a global rate limiter for a platform like Stripe, I would first clarify the requirements. We need to limit requests globally, not just per server, likely focusing on API endpoints handling payments. The goal is preventing abuse while minimizing latency impact on legitimate users.

For the algorithm, I recommend the Token Bucket approach. Unlike Leaky Bucket, which smooths traffic rigidly, Token Bucket allows controlled bursts, which is critical for payment retries or initial checkout spikes. Each user gets a bucket of tokens replenished at a fixed rate.

Architecturally, we cannot rely on local memory because it doesn't scale globally. Instead, we deploy a centralized Redis cluster with multi-region replication. Client services check token availability via atomic Lua scripts in Redis, avoiding race conditions between concurrent reads and writes. To reduce latency, we can use a tiered approach: allow a small local cache for immediate decisions, then asynchronously sync with Redis.

Consistency is tricky here. If we require strict global consistency, cross-region network latency might slow down responses. For most cases, eventual consistency with a slightly higher threshold is acceptable. However, for high-value transactions, we might enforce stricter checks.

Finally, we must plan for Redis failure. If the store goes down, we should fail open with a conservative local limit to maintain service availability rather than blocking all traffic.
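The fail-open fallback described at the end of the answer can be sketched as a thin wrapper. The `FailOpenLimiter` name and the `remote_check` callable are illustrative assumptions; a real version would also bound the local counter by a time window rather than a raw count.

```python
class FailOpenLimiter:
    """Sketch of a fail-open strategy (illustrative, not production code).

    Tries the global (store-backed) check first; if the store is unreachable,
    falls back to a conservative local budget instead of blocking all traffic.
    """

    def __init__(self, remote_check, local_limit: int):
        self.remote_check = remote_check  # callable(key) -> bool; may raise on outage
        self.local_limit = local_limit    # conservative per-process budget during outage
        self.local_count = 0

    def allow(self, key: str) -> bool:
        try:
            allowed = self.remote_check(key)
            self.local_count = 0  # store reachable again: reset the fallback budget
            return allowed
        except Exception:
            # Fail open: serve a limited amount of traffic locally rather than none.
            self.local_count += 1
            return self.local_count <= self.local_limit
```

The trade-off is explicit: during an outage some excess traffic may slip through, but legitimate users are not hard-blocked by an infrastructure failure.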

Common Mistakes to Avoid

  • Ignoring the difference between local and global state, leading to race conditions
  • Suggesting database polling instead of Redis for real-time token updates
  • Failing to discuss how to handle clock skew between distributed servers and regions
  • Overlooking the performance cost of cross-region network calls in the architecture
