Social Media Feed System Design
How do I design a social media feed system (Twitter/Instagram clone)?
TL;DR
- Bottom line: A social media feed requires a hybrid fan-out architecture -- fan-out-on-write for normal users (precompute timelines into Redis) and fan-out-on-read for celebrity accounts (>500K followers) to avoid write amplification.
- Key tool/command:
LPUSH user:{id}:timeline {tweet_id}(Redis list per user for O(1) feed reads) - Watch out for: Unbounded fan-out for high-follower accounts -- a single celebrity tweet can generate millions of writes and lag the entire pipeline.
- Works with: Any stack; core patterns are language-agnostic. Redis + Kafka + PostgreSQL/Cassandra is the most battle-tested combination.
Constraints
- Fan-out-on-write must be disabled for users with >500K followers to avoid write amplification storms
- Timeline cache must have a TTL and size cap (e.g. 800 tweets per user) to bound memory usage
- Never store media blobs in the primary database; use object storage (S3/GCS) with CDN delivery
- Feed ranking models require A/B testing infrastructure before production deployment
- Eventual consistency is acceptable for feed delivery (seconds-level lag), but like/reply counts need at-least-once delivery guarantees
Quick Reference
| Component | Role | Technology Options | Scaling Strategy |
|---|---|---|---|
| API Gateway | Rate limiting, auth, routing | Nginx, Kong, AWS API Gateway | Horizontal scaling behind LB |
| Post Service | Create/store posts and metadata | Python/Go + PostgreSQL | Shard by user_id, master-slave replication |
| Fan-Out Service | Distribute posts to follower timelines | Go/Java workers + Kafka consumers | Partition Kafka by user_id, scale consumers horizontally |
| Timeline Service | Serve precomputed home feeds | Node.js/Go + Redis | Redis Cluster with consistent hashing |
| Social Graph Service | Manage follow/unfollow relationships | Go/Java + Graph DB or Redis | Adjacency list in Redis, Neo4j for deep queries |
| User Service | Profiles, auth, preferences | Any framework + PostgreSQL | Read replicas, cache hot profiles |
| Media Service | Upload, process, serve images/video | Python/Go + S3/GCS + CDN | Pre-signed uploads, CDN edge caching |
| Search Service | Full-text search on posts | Elasticsearch, Apache Solr | Sharded index, scatter-gather queries |
| Notification Service | Push/email/in-app alerts on engagement | Go/Python + Kafka + FCM/APNs | Async via message queue, batch delivery |
| Ranking Service | ML-based feed personalization | Python + TensorFlow/PyTorch | Feature store + model serving (TF Serving, Triton) |
| CDN | Static asset and media delivery | CloudFront, Cloudflare, Fastly | Edge PoPs, cache invalidation via purge API |
| Message Queue | Async communication between services | Apache Kafka, RabbitMQ, SQS | Partitioned topics, consumer groups |
| Cache Layer | Reduce DB load for hot data | Redis, Memcached | Redis Cluster, LRU eviction, ~800 items/timeline |
| Object Storage | Persistent media file storage | AWS S3, Google Cloud Storage | Lifecycle policies, cross-region replication |
Decision Tree
START
|-- Expected DAU?
| |-- <10K users
| | --> Simple pull-based feed: query DB at read time, paginate with cursor
| |-- 10K - 1M users
| | --> Fan-out-on-write for all users
| | --> Precompute timelines into Redis on every post
| |-- 1M - 100M users
| | |-- Any user with >500K followers?
| | | |-- YES --> Hybrid: fan-out-on-write for normal, fan-out-on-read for celebrities
| | | |-- NO --> Pure fan-out-on-write with Redis Cluster
| | |-- Feed type?
| | | |-- Chronological --> Sort by timestamp in Redis sorted set
| | | |-- Ranked --> Add Ranking Service with ML model
| | | |-- Hybrid --> Chronological base + lightweight re-ranking
| |-- >100M users
| --> Full hybrid fan-out + ML ranking pipeline
| --> Multi-region deployment with geo-sharding
| --> Edge caching for feed responses
Step-by-Step Guide
1. Define the data model and API contracts
Design the core entities: Users, Posts, Follows, Likes, Comments. Define REST or gRPC endpoints for create-post, get-timeline, follow/unfollow. [src1]
CREATE TABLE users (
user_id BIGINT PRIMARY KEY,
username VARCHAR(64) UNIQUE NOT NULL,
created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE TABLE posts (
post_id BIGINT PRIMARY KEY,
author_id BIGINT NOT NULL REFERENCES users(user_id),
content TEXT NOT NULL,
media_url TEXT,
created_at TIMESTAMPTZ DEFAULT NOW(),
like_count INT DEFAULT 0
);
CREATE TABLE follows (
follower_id BIGINT NOT NULL,
followee_id BIGINT NOT NULL,
created_at TIMESTAMPTZ DEFAULT NOW(),
PRIMARY KEY (follower_id, followee_id)
);
Verify: SELECT COUNT(*) FROM information_schema.tables WHERE table_name IN ('users','posts','follows'); --> expected: 3
2. Build the post ingestion pipeline
When a user creates a post, write it to the Posts table, then publish an event to Kafka for async fan-out. [src2]
producer = KafkaProducer(bootstrap_servers=["kafka:9092"])
def create_post(author_id, content, media_url=None):
post = db.insert_post(author_id, content, media_url)
producer.send("post-created",
key=str(author_id).encode(),
value=json.dumps({"post_id": post["post_id"], "author_id": author_id}).encode()
)
return post
Verify: kafka-console-consumer --topic post-created --from-beginning --> expected: JSON with post_id
3. Implement the fan-out service
Kafka consumers read post-created events, look up followers, and push post_id into each follower's Redis timeline. Skip fan-out for celebrities. [src3]
CELEBRITY_THRESHOLD = 500_000
for message in consumer:
event = message.value
followers = get_followers(event["author_id"])
if len(followers) > CELEBRITY_THRESHOLD:
continue # handled at read time
pipe = redis.pipeline(transaction=False)
for fid in followers:
pipe.lpush(f"user:{fid}:timeline", event["post_id"])
pipe.ltrim(f"user:{fid}:timeline", 0, 799)
pipe.execute()
Verify: redis-cli LRANGE user:42:timeline 0 9 --> expected: list with new post_id
4. Build the timeline read path
Read precomputed Redis list, hydrate post objects, merge celebrity tweets at serve time. [src2] [src3]
def get_home_timeline(user_id, cursor=0, limit=20):
post_ids = redis.lrange(f"user:{user_id}:timeline", cursor, cursor + limit - 1)
for celeb_id in get_followed_celebrities(user_id):
celeb_posts = db.query_recent_posts(celeb_id, limit)
post_ids.extend(celeb_posts)
posts = hydrate_posts(list(dict.fromkeys(post_ids)))
posts.sort(key=lambda p: p["created_at"], reverse=True)
return posts[:limit]
Verify: curl /api/v1/timeline?user_id=42&limit=10 --> expected: JSON array of posts
5. Add feed ranking
Insert a ranking layer scoring posts by engagement probability using a lightweight ML model. [src4] [src5]
def rank_feed(user_id, candidates):
features = extract_features(user_id, candidates)
scores = ranking_model.predict(features)
for post, score in zip(candidates, scores):
post["rank_score"] = float(score)
candidates.sort(key=lambda p: p["rank_score"], reverse=True)
return candidates
Verify: Compare ranked vs chronological output -- ranked feed should prioritize higher-engagement posts
6. Configure media storage and CDN
Set up pre-signed uploads to S3, configure CDN for delivery, cache feed API responses at the edge. [src1] [src6]
def generate_upload_url(user_id, filename):
key = f"media/{user_id}/{filename}"
return s3.generate_presigned_url("put_object",
Params={"Bucket": "feed-media", "Key": key},
ExpiresIn=3600
)
Verify: Upload test image via pre-signed URL, fetch via CDN --> expected: 200 OK
Code Examples
Python: Feed Generation with Hybrid Fan-Out
# Input: user_id (int), cursor (int), limit (int)
# Output: list of post dicts sorted by rank score
import redis
r = redis.Redis(host="redis-cluster", port=6379, decode_responses=True)
def generate_feed(user_id, cursor=0, limit=20):
"""Hybrid feed: precomputed timeline + celebrity merge + ranking."""
tl_key = f"user:{user_id}:timeline"
post_ids = r.lrange(tl_key, cursor, cursor + limit * 2)
celeb_followees = get_followed_celebrities(user_id)
for cid in celeb_followees:
celeb_posts = r.lrange(f"user:{cid}:posts", 0, limit - 1)
post_ids.extend(celeb_posts)
post_ids = list(dict.fromkeys(post_ids))
posts = batch_hydrate(post_ids)
ranked = rank_feed(user_id, posts)
return ranked[cursor:cursor + limit]
Python: Redis Timeline Caching
# Input: author_id (int), post_id (int), follower_ids (list[int])
# Output: None (updates Redis timelines)
import redis
r = redis.Redis(host="redis-cluster", port=6379, decode_responses=True)
MAX_TIMELINE_SIZE = 800
TIMELINE_TTL = 7 * 24 * 3600 # 7 days
def fan_out_to_timelines(author_id, post_id, follower_ids):
pipe = r.pipeline(transaction=False)
for fid in follower_ids:
key = f"user:{fid}:timeline"
pipe.lpush(key, post_id)
pipe.ltrim(key, 0, MAX_TIMELINE_SIZE - 1)
pipe.expire(key, TIMELINE_TTL)
pipe.execute()
Anti-Patterns
Wrong: Fan-out-on-write for all users regardless of follower count
# BAD -- celebrity with 10M followers causes 10M Redis writes per tweet
def fan_out(author_id, post_id):
followers = get_all_followers(author_id) # could be 10M+
for fid in followers:
redis.lpush(f"user:{fid}:timeline", post_id)
# Result: single tweet takes minutes, OOM risk
Correct: Hybrid fan-out with celebrity threshold
# GOOD -- skip fan-out for high-follower accounts, merge at read time
CELEBRITY_THRESHOLD = 500_000
def fan_out(author_id, post_id):
followers = get_all_followers(author_id)
if len(followers) > CELEBRITY_THRESHOLD:
return # handled at read time
pipe = redis.pipeline(transaction=False)
for fid in followers:
pipe.lpush(f"user:{fid}:timeline", post_id)
pipe.ltrim(f"user:{fid}:timeline", 0, 799)
pipe.execute()
Wrong: Storing media files in the database
# BAD -- blobs in PostgreSQL cause table bloat and slow queries
def save_post(content, image_bytes):
db.execute("INSERT INTO posts (content, image_data) VALUES (%s, %s)",
(content, image_bytes)) # 5MB image in a row!
Correct: Store media in object storage, reference by URL
# GOOD -- database stores only the URL reference
def save_post(content, image_file):
key = f"media/{uuid4()}.jpg"
s3.upload_fileobj(image_file, "feed-media", key)
cdn_url = f"https://cdn.example.com/{key}"
db.execute("INSERT INTO posts (content, media_url) VALUES (%s, %s)",
(content, cdn_url))
Wrong: Unbounded timeline cache with no eviction
# BAD -- Redis memory grows without bound
def add_to_timeline(user_id, post_id):
redis.lpush(f"user:{user_id}:timeline", post_id)
# No LTRIM, no TTL -- grows forever, exhausts Redis memory
Correct: Capped timeline with TTL and eviction
# GOOD -- bounded memory usage per user
def add_to_timeline(user_id, post_id):
key = f"user:{user_id}:timeline"
pipe = redis.pipeline()
pipe.lpush(key, post_id)
pipe.ltrim(key, 0, 799)
pipe.expire(key, 7 * 24 * 3600)
pipe.execute()
Wrong: Synchronous fan-out blocking the post API
# BAD -- user waits for fan-out to complete
@app.post("/api/posts")
def create_post(request):
post = db.insert_post(request.data)
for fid in get_followers(post.author_id): # blocks for seconds
redis.lpush(f"user:{fid}:timeline", post.id)
return {"status": "ok"} # user waited 5+ seconds
Correct: Async fan-out via message queue
# GOOD -- return immediately, fan-out asynchronously
@app.post("/api/posts")
def create_post(request):
post = db.insert_post(request.data)
kafka.send("post-created", {"post_id": post.id, "author_id": post.author_id})
return {"status": "ok"} # instant response
Common Pitfalls
- Hot partition on celebrity user_id: Kafka partitioning by author_id concentrates all celebrity activity on one partition. Fix:
use dedicated topic or random partitioning for high-follower accounts. [src3] - Thundering herd on timeline rebuild: When a popular user returns after inactivity, rebuilding their timeline from DB spikes load. Fix:
rate-limit rebuilds and use a circuit breaker. [src2] - Stale celebrity tweets in merged feed: Fan-out-on-read celebrity tweets fetched from lagging read replica. Fix:
query primary for celebrity posts or cache with short TTL. [src3] - Feed pagination drift: New posts during pagination cause duplicates or missing items. Fix:
use cursor-based pagination (last_seen_post_id) instead of offset. [src1] - Engagement count inconsistency: Like/retweet counts in caches show different numbers on refresh. Fix:
use Redis INCR for counters with periodic DB reconciliation. [src6] - Cache stampede on cold start: First request after cache miss triggers expensive DB query while concurrent requests pile up. Fix:
cache warming on deploy + single-flight cache population. [src7] - Over-fetching in hydration: Fetching full post objects when the feed only needs a subset of fields. Fix:
use projection queries or field selection in hydration. [src1]
When to Use / When Not to Use
| Use When | Don't Use When | Use Instead |
|---|---|---|
| Building a follow-graph-based feed (Twitter, Instagram, LinkedIn) | Content is purely algorithmic with no follow graph (TikTok For You) | Recommendation engine with collaborative filtering |
| DAU > 10K and feed latency matters (<200ms) | Small team project or MVP with <1K users | Simple SQL query with pagination |
| Posts are short-lived and time-sensitive (news, social updates) | Content is long-form and rarely updated (blog, wiki) | CMS with traditional caching |
| Real-time or near-real-time delivery is required | Batch-processed feeds updated hourly are acceptable | Batch ETL pipeline with scheduled materialization |
| Multiple content types (text, images, video, polls) | Single content type with simple ordering | Standard list API with sorting |
Important Caveats
- Fan-out-on-write memory cost scales linearly with average follower count; at 500 avg followers and 100M users, expect ~40TB Redis if naively storing tweet IDs (mitigate with TTL + size caps + active-user-only caching)
- Ranking models introduce latency (10-50ms per request); keep model inference on the critical path only if p99 latency budget allows
- GDPR/privacy regulations require the ability to delete a user's posts from all follower timelines within a defined time window; fan-out-on-write makes this harder (you must fan-out deletions too)
- The hybrid fan-out threshold (e.g. 500K followers) is not universal; tune it based on your write throughput capacity and acceptable fan-out latency SLA
- Multi-region deployment requires choosing between global consistency (higher latency) and regional feeds (users may see different content when traveling)