Design a Distributed Counter Service

System Design
Medium
Google
23.5K views

Design a service to reliably increment/decrement millions of shared counters (e.g., likes, views) across distributed systems. Discuss eventual vs. strong consistency.

Why Interviewers Ask This

Interviewers at Google ask this to evaluate your ability to balance trade-offs between consistency, availability, and partition tolerance in high-scale environments. They specifically want to see if you can design a system that handles millions of concurrent writes without locking bottlenecks, while making informed decisions about eventual versus strong consistency based on business requirements.

How to Answer This Question

1. Clarify requirements immediately: Ask about read-to-write ratios, latency constraints, and whether the counter must be accurate in real time or whether eventual consistency is acceptable.
2. Define the scope: Determine whether counters are global or per-user, and estimate throughput (e.g., millions of QPS).
3. Propose a baseline architecture: Suggest a sharded key-value store where each shard manages a subset of counters to distribute load.
4. Address the core challenge: Discuss how to handle atomic increments, e.g., Redis's atomic INCR (or Lua scripts for multi-step operations) for speed, or a database with optimistic locking.
5. Resolve consistency conflicts: Explain strategies such as vector clocks or last-writer-wins for merging updates across regions.
6. Optimize for scale: Mention caching layers, batched writes to reduce I/O, and asynchronous replication to ensure high availability.
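The shard-routing idea in step 3 can be sketched in a few lines. This is a minimal illustration, not a production router: `NUM_SHARDS` and `shard_for` are hypothetical names, and a real deployment would typically use consistent hashing so that adding shards does not remap every key.

```python
import hashlib

NUM_SHARDS = 16  # assumed fixed shard count for this sketch


def shard_for(counter_id: str) -> int:
    """Map a counter ID to a shard index via a stable hash.

    A stable hash (not Python's built-in hash(), which is salted
    per process) ensures every API server routes the same counter
    to the same shard.
    """
    digest = hashlib.md5(counter_id.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_SHARDS
```

Because the hash is deterministic, `shard_for("video:123:views")` returns the same index on every server, so all increments for one counter land on one shard and can be applied atomically there.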

Key Points to Cover

  • Explicitly choosing eventual consistency for high-volume metrics to maximize performance
  • Using sharding to eliminate single points of failure and distribute write load
  • Leveraging atomic operations in memory stores like Redis for sub-millisecond latency
  • Explaining the trade-off between data accuracy and system availability clearly
  • Demonstrating knowledge of CAP theorem implications in a real-world scenario
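One of the optimizations above, batching writes to reduce I/O, can be sketched as a client-side buffer that merges increments before flushing. `BatchedCounter` is a hypothetical helper; the `store` dict stands in for a remote shard (against Redis, the flush would be a pipeline of INCRBY commands), so the sketch stays self-contained.

```python
from collections import defaultdict


class BatchedCounter:
    """Buffer increments locally and flush them as one merged write.

    Trades a small window of staleness (acceptable under eventual
    consistency) for far fewer round trips to the backing store.
    """

    def __init__(self, store: dict, flush_threshold: int = 100):
        self.store = store              # stand-in for a remote shard
        self.flush_threshold = flush_threshold
        self.buffer = defaultdict(int)  # counter_id -> pending delta
        self.pending = 0                # buffered operations count

    def incr(self, counter_id: str, delta: int = 1) -> None:
        self.buffer[counter_id] += delta
        self.pending += 1
        if self.pending >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        # One merged write per counter instead of one per increment.
        for cid, delta in self.buffer.items():
            self.store[cid] = self.store.get(cid, 0) + delta
        self.buffer.clear()
        self.pending = 0
```

For example, with `flush_threshold=3`, three calls to `incr` produce a single flush that writes two merged deltas rather than three individual writes.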

Sample Answer

To design a distributed counter service for millions of operations, I first clarify that for metrics like 'likes' or 'views', eventual consistency is usually sufficient, allowing us to prioritize availability and low latency over strict linearizability. If we required strong consistency, the system would suffer from significant write latency due to coordination overhead across nodes.

My proposed architecture shards counters based on their IDs across multiple stateless API servers backed by a distributed cache like Redis Cluster. Each shard owns a hash-based partition of the keyspace, which spreads load evenly and avoids hotspots. When an increment request arrives, the API server forwards it to the appropriate shard. To ensure atomicity without heavy locking, we use Redis's INCR command, which is atomic and extremely fast. For durability, shards asynchronously replicate data to persistent storage in the background. If a node fails, the cluster automatically reassigns keys to healthy nodes. We handle potential data loss during crashes by implementing a write-ahead log, or by accepting minor discrepancies given the eventual consistency model.

Finally, we aggregate these counters periodically for reporting dashboards. This approach mirrors Google's preference for scalable, loosely coupled systems that optimize for user experience rather than perfect immediate accuracy.
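For merging counter updates across regions, one widely used technique (a natural fit for the eventual-consistency stance above, though the answer itself names vector clocks and last-writer-wins) is the grow-only counter CRDT, or G-counter. This sketch is illustrative; the class and method names are my own.

```python
class GCounter:
    """Grow-only counter CRDT.

    Each region increments only its own slot; merging takes the
    per-region maximum. Replicas therefore converge to the same
    total regardless of message order, and merges are idempotent,
    so no increment is lost or double-counted.
    """

    def __init__(self, region: str):
        self.region = region
        self.counts: dict = {}  # region -> count from that region

    def incr(self, delta: int = 1) -> None:
        self.counts[self.region] = self.counts.get(self.region, 0) + delta

    def merge(self, other: "GCounter") -> None:
        # Element-wise max: commutative, associative, idempotent.
        for region, count in other.counts.items():
            self.counts[region] = max(self.counts.get(region, 0), count)

    def value(self) -> int:
        return sum(self.counts.values())
```

After two regions exchange state in either order, both report the same total, and re-applying a merge changes nothing, which is exactly the convergence property a cross-region view counter needs.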

Common Mistakes to Avoid

  • Jumping straight to a SQL database solution without considering write throughput limitations
  • Ignoring the need for sharding, leading to a design that cannot scale horizontally
  • Failing to distinguish between the needs of a counter versus a financial transaction ledger
  • Overcomplicating the solution with complex consensus algorithms like Paxos when simpler methods suffice
