Design a System for Geo-Distributed Data Storage

Question

Accepted Answer

To design a geo-distributed storage system for a global product like Apple Maps or iCloud, we must prioritize low-latency reads while managing write conflicts between regions. I would start by choosing between an AP system (one that favors availability over consistency during a partition) such as DynamoDB Global Tables, and a strongly consistent store such as CockroachDB when strong consistency is non-negotiable, as it is for financial data. For most consumer applications, eventual consistency is acceptable, which lets us replicate data asynchronously across three major regions: US-East, EU-West, and Asia-Pacific. The critical challenge is handling multi-master conflicts, where users in Tokyo and New York update the same record at the same time. We should implement vector clocks to track causality rather than relying solely on wall-clock timestamps, which clock drift can skew. If two updates are concurrent, we can use application-level logic to merge them, for example by taking the union of two contact lists, or fall back to Last-Write-Wins with a logical timestamp derived from sequence numbers. To ensure reliability, we need an anti-entropy process running continuously to detect and reconcile replicas that have diverged. Finally, we must design for partition tolerance: if the link between regions fails, each region must keep serving reads and writes locally and accept temporary inconsistency until connectivity is restored. This approach balances Apple's focus on seamless user experiences with the technical reality of distributed systems.
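The conflict-handling step in the answer can be made concrete with a small sketch. The Python below is an illustration only, not the API of DynamoDB, CockroachDB, or any other product: the names `compare`, `merge_clocks`, and `resolve`, the region keys, and the choice to model a record as a set of contacts are all assumptions made for the example. It compares two vector clocks, keeps the causally newer version when one exists, and merges by set union when the updates are concurrent.

```python
from typing import Dict, Set, Tuple

# Hypothetical representation: region name -> number of writes seen from that region.
VectorClock = Dict[str, int]
# Hypothetical record: (contact names, vector clock of the version).
Version = Tuple[Set[str], VectorClock]


def compare(a: VectorClock, b: VectorClock) -> str:
    """Return 'before', 'after', 'equal', or 'concurrent' for clock a relative to b."""
    regions = set(a) | set(b)
    a_le_b = all(a.get(r, 0) <= b.get(r, 0) for r in regions)
    b_le_a = all(b.get(r, 0) <= a.get(r, 0) for r in regions)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"  # neither clock dominates: a true conflict


def merge_clocks(a: VectorClock, b: VectorClock) -> VectorClock:
    """Element-wise maximum, so the merged version dominates both inputs."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in set(a) | set(b)}


def resolve(local: Version, remote: Version) -> Version:
    """Pick the causally newer version, or merge if the two are concurrent."""
    local_contacts, local_clock = local
    remote_contacts, remote_clock = remote
    order = compare(local_clock, remote_clock)
    if order in ("after", "equal"):
        return local
    if order == "before":
        return remote
    # Concurrent updates: application-level merge (set union of contacts).
    return local_contacts | remote_contacts, merge_clocks(local_clock, remote_clock)


# Tokyo and New York each applied one write the other has not yet seen.
tokyo: Version = ({"alice", "bob"}, {"asia-pacific": 2, "us-east": 1})
new_york: Version = ({"alice", "carol"}, {"asia-pacific": 1, "us-east": 2})
merged_contacts, merged_clock = resolve(tokyo, new_york)
```

A plain union like this cannot represent deletions; a real design would need tombstones or a CRDT set for that. An anti-entropy job could apply the same `resolve` step whenever it finds two replicas holding different versions of a record.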

Related Interview Questions

Design a Payment Processing System

Design a System for Real-Time Fleet Management

Design a CDN Edge Caching Strategy