Design a Twitter Feed (Sorted Set/Time Series)

Data Structures
Medium
Meta

Design the data structures required to maintain a user's chronological Twitter/X feed, supporting billions of posts. Focus on time series databases or Redis sorted sets.

Why Interviewers Ask This

Interviewers ask this to evaluate your ability to model time-series data under extreme scale constraints. They specifically want to see if you understand the trade-offs between push and pull architectures for news feeds, how Redis Sorted Sets handle ranking by timestamp, and your skill in optimizing read-heavy workloads for billions of records.

How to Answer This Question

1. Clarify requirements: Define read/write ratios, latency goals (e.g., <200ms), and consistency needs typical of Meta's high-scale environment.
2. Propose a hybrid architecture: Explain that writes are optimized via a 'Push' fan-out to follower lists, while reads use pre-computed timelines stored in sorted sets.
3. Detail the data structure: Describe using Redis Sorted Sets where the score is the Unix timestamp and the member is the post ID, ensuring O(log N) insertion and efficient range retrieval.
4. Address edge cases: Discuss handling un-follows, spam filtering, and the memory cost of storing millions of feed entries per user.
5. Optimize for scale: Mention caching strategies and sharding approaches that distribute load across clusters, demonstrating awareness of distributed-system limits.
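The timestamp-as-score scheme in step 3 can be sketched with a minimal in-memory stand-in for a per-user Redis Sorted Set. The real commands would be ZADD (insert a scored member) and ZREVRANGE (fetch a newest-first page); the `Timeline` class and names here are illustrative only:

```python
import bisect

class Timeline:
    """In-memory stand-in for one user's Redis Sorted Set feed.
    Mirrors ZADD (insert scored member) and ZREVRANGE (newest-first page)."""

    def __init__(self):
        self._entries = []  # kept sorted ascending by (timestamp, post_id)

    def add(self, timestamp, post_id):
        # Roughly: ZADD timeline:<user> <timestamp> <post_id>
        # bisect finds the slot in O(log N); Redis's skip list also
        # inserts in O(log N), whereas a Python list insert is O(N).
        bisect.insort(self._entries, (timestamp, post_id))

    def latest(self, n):
        # Roughly: ZREVRANGE timeline:<user> 0 n-1  -- newest first
        return [post_id for _, post_id in reversed(self._entries[-n:])]

feed = Timeline()
feed.add(1700000300, "post:3")
feed.add(1700000100, "post:1")
feed.add(1700000200, "post:2")
print(feed.latest(2))  # ['post:3', 'post:2']
```

Because the score is a plain Unix timestamp, range queries like "everything since time T" map directly onto ZRANGEBYSCORE with no extra indexing.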

Key Points to Cover

  • Explicitly choosing between Push vs. Pull architectures based on follower count
  • Leveraging Redis Sorted Sets for O(log N) time-based sorting and range queries
  • Addressing the 'Fan-out' explosion problem for users with millions of followers
  • Defining clear trade-offs between read latency and write throughput
  • Incorporating soft-deletion strategies to handle content removal without performance penalties
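The push-vs-pull choice by follower count can be sketched as below. This is a simplification under stated assumptions: the `FANOUT_THRESHOLD` value is a hypothetical cutoff, all function and variable names are illustrative, and plain dicts stand in for the real timeline store:

```python
# Hypothetical cutoff: accounts above this follower count skip fan-out-on-write.
FANOUT_THRESHOLD = 10_000

def publish(post_id, author, followers, timelines, celebrity_posts, ts):
    """Hybrid fan-out: push the post into each follower's timeline for
    ordinary accounts, but park celebrity posts for merge-at-read."""
    if len(followers) <= FANOUT_THRESHOLD:
        for follower in followers:                      # push model
            timelines.setdefault(follower, []).append((ts, post_id))
    else:                                               # pull model
        celebrity_posts.setdefault(author, []).append((ts, post_id))

def read_feed(user, followed_celebs, timelines, celebrity_posts, n):
    """Merge the user's pre-computed timeline with recent celebrity posts."""
    merged = list(timelines.get(user, []))
    for celeb in followed_celebs:
        merged.extend(celebrity_posts.get(celeb, []))
    merged.sort(reverse=True)                           # newest (ts, id) first
    return [post_id for _, post_id in merged[:n]]

timelines, celebrity_posts = {}, {}
publish("p1", "alice", ["bob", "carol"], timelines, celebrity_posts, 100)
big_audience = [f"u{i}" for i in range(FANOUT_THRESHOLD + 1)]
publish("p2", "star", big_audience, timelines, celebrity_posts, 200)
print(read_feed("bob", ["star"], timelines, celebrity_posts, 10))  # ['p2', 'p1']
```

The design choice this illustrates: a celebrity post is stored once and merged at read time, so a single tweet never triggers millions of timeline writes.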

Sample Answer

To design a scalable Twitter feed for billions of posts, I would prioritize low-latency reads over strong write consistency, which aligns with Meta's focus on user experience at scale. The core challenge is balancing the cost of generating a personalized timeline against the sheer volume of data.

My solution uses a hybrid approach combining push and pull models. For active users, we employ a Push model where new tweets are immediately written to the sorted timelines of their followers. We use Redis Sorted Sets because they naturally order elements by score; here, the score is the tweet's timestamp, letting us fetch the most recent M items in O(log N + M) time. For inactive users, or when the author has a massive following, we switch to a Pull model and fetch recent tweets from followed users dynamically. This prevents the 'fan-out' explosion problem.

We also need to consider persistence and memory limits. Since storing a full timeline for every user is expensive, we might implement tiered storage: hot data stays in Redis while cold data moves to a wide-column store suited to time-series workloads, such as Cassandra or HBase. Finally, we handle edge cases like deleted tweets by marking them as soft-deleted within the set rather than removing them immediately, preserving read performance. This architecture keeps the feed responsive even as the user base grows into the billions.
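The soft-deletion approach mentioned above can be sketched as a read-time filter: deleted post IDs go into a tombstone set, and pagination skips them without mutating the sorted timeline. `fetch_page` and the sample IDs are illustrative, not a real API:

```python
def fetch_page(timeline_desc, tombstones, n):
    """Walk a newest-first timeline, skipping soft-deleted IDs until the
    page is full. Deletion never reshuffles the sorted set itself, so
    reads stay cheap; a background job can compact tombstones later."""
    page = []
    for post_id in timeline_desc:
        if post_id in tombstones:
            continue  # soft-deleted: invisible to readers, still stored
        page.append(post_id)
        if len(page) == n:
            break
    return page

timeline = ["p5", "p4", "p3", "p2", "p1"]    # newest first
tombstones = {"p4"}                          # marked deleted, not removed
print(fetch_page(timeline, tombstones, 3))   # ['p5', 'p3', 'p2']
```

One consequence worth mentioning in the interview: because tombstoned entries still occupy memory, the compaction job must eventually purge them, trading a little write amplification for consistently fast reads.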

Common Mistakes to Avoid

  • Suggesting a pure SQL relational database for real-time feed generation, ignoring performance bottlenecks
  • Ignoring the memory overhead of storing duplicate tweets for every follower in a naive push model
  • Failing to distinguish between active and inactive users when selecting the feed generation strategy
  • Overlooking the need for eventual consistency in a distributed environment
