Social Media Feed System Design

Type: Software Reference | Confidence: 0.91 | Sources: 7 | Verified: 2026-02-23 | Freshness: 2026-02-23

TL;DR

Constraints

Quick Reference

| Component | Role | Technology Options | Scaling Strategy |
|---|---|---|---|
| API Gateway | Rate limiting, auth, routing | Nginx, Kong, AWS API Gateway | Horizontal scaling behind LB |
| Post Service | Create/store posts and metadata | Python/Go + PostgreSQL | Shard by user_id, master-slave replication |
| Fan-Out Service | Distribute posts to follower timelines | Go/Java workers + Kafka consumers | Partition Kafka by user_id, scale consumers horizontally |
| Timeline Service | Serve precomputed home feeds | Node.js/Go + Redis | Redis Cluster with consistent hashing |
| Social Graph Service | Manage follow/unfollow relationships | Go/Java + Graph DB or Redis | Adjacency list in Redis, Neo4j for deep queries |
| User Service | Profiles, auth, preferences | Any framework + PostgreSQL | Read replicas, cache hot profiles |
| Media Service | Upload, process, serve images/video | Python/Go + S3/GCS + CDN | Pre-signed uploads, CDN edge caching |
| Search Service | Full-text search on posts | Elasticsearch, Apache Solr | Sharded index, scatter-gather queries |
| Notification Service | Push/email/in-app alerts on engagement | Go/Python + Kafka + FCM/APNs | Async via message queue, batch delivery |
| Ranking Service | ML-based feed personalization | Python + TensorFlow/PyTorch | Feature store + model serving (TF Serving, Triton) |
| CDN | Static asset and media delivery | CloudFront, Cloudflare, Fastly | Edge PoPs, cache invalidation via purge API |
| Message Queue | Async communication between services | Apache Kafka, RabbitMQ, SQS | Partitioned topics, consumer groups |
| Cache Layer | Reduce DB load for hot data | Redis, Memcached | Redis Cluster, LRU eviction, ~800 items/timeline |
| Object Storage | Persistent media file storage | AWS S3, Google Cloud Storage | Lifecycle policies, cross-region replication |
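
The API Gateway row above lists rate limiting as a core responsibility. A minimal token-bucket sketch of the idea (the rate and capacity values here are illustrative, not from the source; a production gateway would use its built-in limiter):

```python
import time

class TokenBucket:
    """Per-client token bucket: refills at `rate` tokens/sec, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=5, capacity=10)  # 5 req/s steady state, bursts of 10
```

In a real deployment the bucket state would live per API key in the gateway (or in Redis for a multi-node gateway tier), not in process memory.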

Decision Tree

START
|-- Expected DAU?
|   |-- <10K users
|   |   --> Simple pull-based feed: query DB at read time, paginate with cursor
|   |-- 10K - 1M users
|   |   --> Fan-out-on-write for all users
|   |   --> Precompute timelines into Redis on every post
|   |-- 1M - 100M users
|   |   |-- Any user with >500K followers?
|   |   |   |-- YES --> Hybrid: fan-out-on-write for normal, fan-out-on-read for celebrities
|   |   |   |-- NO --> Pure fan-out-on-write with Redis Cluster
|   |   |-- Feed type?
|   |   |   |-- Chronological --> Sort by timestamp in Redis sorted set
|   |   |   |-- Ranked --> Add Ranking Service with ML model
|   |   |   |-- Hybrid --> Chronological base + lightweight re-ranking
|   |-- >100M users
|       --> Full hybrid fan-out + ML ranking pipeline
|       --> Multi-region deployment with geo-sharding
|       --> Edge caching for feed responses
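
The <10K branch in the tree above needs no fan-out infrastructure at all: query at read time and paginate with a keyset cursor. A sketch, using sqlite3 as a stand-in for PostgreSQL and the table names from step 1 (`pull_feed` and its cursor convention are illustrative):

```python
import sqlite3

def pull_feed(conn, user_id, cursor_ts=None, limit=20):
    """Read-time feed: join follows -> posts, keyset-paginate on created_at."""
    sql = """
        SELECT p.post_id, p.author_id, p.content, p.created_at
        FROM posts p
        JOIN follows f ON f.followee_id = p.author_id
        WHERE f.follower_id = ?
          AND (? IS NULL OR p.created_at < ?)   -- cursor: strictly older than last seen
        ORDER BY p.created_at DESC
        LIMIT ?
    """
    return conn.execute(sql, (user_id, cursor_ts, cursor_ts, limit)).fetchall()
```

The client passes the `created_at` of the last post it received as the next cursor; unlike OFFSET pagination, this stays fast and stable as new posts arrive.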

Step-by-Step Guide

1. Define the data model and API contracts

Design the core entities: Users, Posts, Follows, Likes, Comments. Define REST or gRPC endpoints for create-post, get-timeline, follow/unfollow. [src1]

CREATE TABLE users (
    user_id    BIGINT PRIMARY KEY,
    username   VARCHAR(64) UNIQUE NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE posts (
    post_id    BIGINT PRIMARY KEY,
    author_id  BIGINT NOT NULL REFERENCES users(user_id),
    content    TEXT NOT NULL,
    media_url  TEXT,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    like_count INT DEFAULT 0
);

CREATE TABLE follows (
    follower_id BIGINT NOT NULL,
    followee_id BIGINT NOT NULL,
    created_at  TIMESTAMPTZ DEFAULT NOW(),
    PRIMARY KEY (follower_id, followee_id)
);

Verify: SELECT COUNT(*) FROM information_schema.tables WHERE table_name IN ('users','posts','follows'); --> expected: 3
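
The schema above uses BIGINT primary keys but does not specify how they are generated. One common choice for feed systems (an assumption here, not from the source) is an application-side Snowflake-style ID: time-sortable, so sorting by post_id approximates sorting by creation time. A sketch with an illustrative custom epoch:

```python
import threading
import time

EPOCH_MS = 1_577_836_800_000  # 2020-01-01 UTC; illustrative custom epoch

class SnowflakeGenerator:
    """Time-sortable 64-bit IDs: 41-bit ms timestamp | 10-bit worker | 12-bit sequence."""

    def __init__(self, worker_id: int):
        assert 0 <= worker_id < 1024
        self.worker_id = worker_id
        self.sequence = 0
        self.last_ms = -1
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000) - EPOCH_MS
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF
                if self.sequence == 0:           # 4096 IDs this ms: wait for next ms
                    while now <= self.last_ms:
                        now = int(time.time() * 1000) - EPOCH_MS
            else:
                self.sequence = 0
            self.last_ms = now
            return (now << 22) | (self.worker_id << 12) | self.sequence
```

Each Post Service instance gets a distinct worker_id, so IDs are unique across shards without DB coordination.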

2. Build the post ingestion pipeline

When a user creates a post, write it to the Posts table, then publish an event to Kafka for async fan-out. [src2]

import json
from kafka import KafkaProducer  # kafka-python client

producer = KafkaProducer(bootstrap_servers=["kafka:9092"])

def create_post(author_id, content, media_url=None):
    post = db.insert_post(author_id, content, media_url)
    producer.send("post-created",
        key=str(author_id).encode(),
        value=json.dumps({"post_id": post["post_id"], "author_id": author_id}).encode()
    )
    return post

Verify: kafka-console-consumer --bootstrap-server kafka:9092 --topic post-created --from-beginning --> expected: JSON with post_id

3. Implement the fan-out service

Kafka consumers read post-created events, look up followers, and push post_id into each follower's Redis timeline. Skip fan-out for celebrities. [src3]

import json

CELEBRITY_THRESHOLD = 500_000

for message in consumer:
    event = json.loads(message.value)  # raw bytes unless a value_deserializer is configured
    followers = get_followers(event["author_id"])
    if len(followers) > CELEBRITY_THRESHOLD:
        continue  # handled at read time
    pipe = redis.pipeline(transaction=False)
    for fid in followers:
        pipe.lpush(f"user:{fid}:timeline", event["post_id"])
        pipe.ltrim(f"user:{fid}:timeline", 0, 799)
    pipe.execute()

Verify: redis-cli LRANGE user:42:timeline 0 9 --> expected: list with new post_id
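
The get_followers helper used above is not defined in this guide; per the Quick Reference, the social graph can be stored in Redis as per-user adjacency sets. A sketch (key names are assumptions, and the in-memory class is a test stand-in; in production `r` would be a real `redis.Redis` client):

```python
def follow(r, follower_id, followee_id):
    """Write both edge directions so either lookup is a single O(1) set read."""
    pipe = r.pipeline(transaction=False)
    pipe.sadd(f"user:{followee_id}:followers", follower_id)
    pipe.sadd(f"user:{follower_id}:following", followee_id)
    pipe.execute()

def unfollow(r, follower_id, followee_id):
    pipe = r.pipeline(transaction=False)
    pipe.srem(f"user:{followee_id}:followers", follower_id)
    pipe.srem(f"user:{follower_id}:following", followee_id)
    pipe.execute()

def get_followers(r, user_id):
    return r.smembers(f"user:{user_id}:followers")

def follower_count(r, user_id):
    return r.scard(f"user:{user_id}:followers")  # cheap celebrity-threshold check

class InMemoryRedis:
    """Tiny stand-in implementing only the set commands above, for local testing."""
    def __init__(self):
        self.data = {}
    def sadd(self, key, member):
        self.data.setdefault(key, set()).add(member)
    def srem(self, key, member):
        self.data.get(key, set()).discard(member)
    def smembers(self, key):
        return set(self.data.get(key, set()))
    def scard(self, key):
        return len(self.data.get(key, set()))
    def pipeline(self, transaction=False):
        return self  # commands apply immediately; execute() is a no-op
    def execute(self):
        pass
```

SCARD makes the celebrity check in step 3 O(1) instead of materializing the full follower list before deciding to skip fan-out.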

4. Build the timeline read path

Read the precomputed Redis list, hydrate post objects, and merge celebrity posts at serve time. [src2] [src3]

def get_home_timeline(user_id, cursor=0, limit=20):
    post_ids = redis.lrange(f"user:{user_id}:timeline", cursor, cursor + limit - 1)
    for celeb_id in get_followed_celebrities(user_id):
        celeb_posts = db.query_recent_posts(celeb_id, limit)  # returns recent post IDs
        post_ids.extend(celeb_posts)
    posts = hydrate_posts(list(dict.fromkeys(post_ids)))
    posts.sort(key=lambda p: p["created_at"], reverse=True)
    return posts[:limit]

Verify: curl "/api/v1/timeline?user_id=42&limit=10" --> expected: JSON array of posts

5. Add feed ranking

Insert a ranking layer scoring posts by engagement probability using a lightweight ML model. [src4] [src5]

def rank_feed(user_id, candidates):
    features = extract_features(user_id, candidates)
    scores = ranking_model.predict(features)
    for post, score in zip(candidates, scores):
        post["rank_score"] = float(score)
    candidates.sort(key=lambda p: p["rank_score"], reverse=True)
    return candidates

Verify: Compare ranked vs chronological output -- ranked feed should prioritize higher-engagement posts
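
The extract_features and ranking_model calls above are placeholders for a trained model. To show the shape of the interface, here is a toy hand-rolled logistic scorer (the feature set, weights, and `affinity` input are all illustrative, not from the source):

```python
import math
import time

# Illustrative weights: recency decay, author affinity, prior engagement
WEIGHTS = {"recency": 2.0, "affinity": 1.5, "like_rate": 1.0}

def extract_features(user_id, post, now=None, affinity=None):
    now = now or time.time()
    age_hours = max(0.0, (now - post["created_at"]) / 3600)
    return {
        "recency": math.exp(-age_hours / 6),            # exponential time decay
        "affinity": (affinity or {}).get(post["author_id"], 0.0),
        "like_rate": min(1.0, post.get("like_count", 0) / 100),
    }

def score(features):
    z = sum(WEIGHTS[k] * v for k, v in features.items())
    return 1 / (1 + math.exp(-z))                        # logistic squash to (0, 1)

def rank_feed(user_id, candidates, affinity=None):
    for post in candidates:
        post["rank_score"] = score(extract_features(user_id, post, affinity=affinity))
    return sorted(candidates, key=lambda p: p["rank_score"], reverse=True)
```

A real Ranking Service would replace `score` with a served model (TF Serving, Triton) and pull features from a feature store, but the candidate-in, scored-and-sorted-out contract stays the same.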

6. Configure media storage and CDN

Set up pre-signed uploads to S3, configure CDN for delivery, cache feed API responses at the edge. [src1] [src6]

def generate_upload_url(user_id, filename):
    key = f"media/{user_id}/{filename}"
    return s3.generate_presigned_url("put_object",
        Params={"Bucket": "feed-media", "Key": key},
        ExpiresIn=3600
    )

Verify: Upload test image via pre-signed URL, fetch via CDN --> expected: 200 OK

Code Examples

Python: Feed Generation with Hybrid Fan-Out

# Input:  user_id (int), cursor (int), limit (int)
# Output: list of post dicts sorted by rank score

import redis

r = redis.Redis(host="redis-cluster", port=6379, decode_responses=True)

def generate_feed(user_id, cursor=0, limit=20):
    """Hybrid feed: precomputed timeline + celebrity merge + ranking."""
    tl_key = f"user:{user_id}:timeline"
    post_ids = r.lrange(tl_key, cursor, cursor + limit * 2 - 1)  # over-fetch to absorb dedup losses
    celeb_followees = get_followed_celebrities(user_id)
    for cid in celeb_followees:
        celeb_posts = r.lrange(f"user:{cid}:posts", 0, limit - 1)
        post_ids.extend(celeb_posts)
    post_ids = list(dict.fromkeys(post_ids))
    posts = batch_hydrate(post_ids)
    ranked = rank_feed(user_id, posts)
    return ranked[:limit]  # cursor offset was already applied in the LRANGE above

Python: Redis Timeline Caching

# Input:  author_id (int), post_id (int), follower_ids (list[int])
# Output: None (updates Redis timelines)

import redis

r = redis.Redis(host="redis-cluster", port=6379, decode_responses=True)
MAX_TIMELINE_SIZE = 800
TIMELINE_TTL = 7 * 24 * 3600  # 7 days

def fan_out_to_timelines(author_id, post_id, follower_ids):
    pipe = r.pipeline(transaction=False)
    for fid in follower_ids:
        key = f"user:{fid}:timeline"
        pipe.lpush(key, post_id)
        pipe.ltrim(key, 0, MAX_TIMELINE_SIZE - 1)
        pipe.expire(key, TIMELINE_TTL)
    pipe.execute()

Anti-Patterns

Wrong: Fan-out-on-write for all users regardless of follower count

# BAD -- celebrity with 10M followers causes 10M Redis writes per post
def fan_out(author_id, post_id):
    followers = get_all_followers(author_id)  # could be 10M+
    for fid in followers:
        redis.lpush(f"user:{fid}:timeline", post_id)
    # Result: a single post takes minutes to fan out, OOM risk

Correct: Hybrid fan-out with celebrity threshold

# GOOD -- skip fan-out for high-follower accounts, merge at read time
CELEBRITY_THRESHOLD = 500_000

def fan_out(author_id, post_id):
    followers = get_all_followers(author_id)
    if len(followers) > CELEBRITY_THRESHOLD:
        return  # handled at read time
    pipe = redis.pipeline(transaction=False)
    for fid in followers:
        pipe.lpush(f"user:{fid}:timeline", post_id)
        pipe.ltrim(f"user:{fid}:timeline", 0, 799)
    pipe.execute()

Wrong: Storing media files in the database

# BAD -- blobs in PostgreSQL cause table bloat and slow queries
def save_post(content, image_bytes):
    db.execute("INSERT INTO posts (content, image_data) VALUES (%s, %s)",
        (content, image_bytes))  # 5MB image in a row!

Correct: Store media in object storage, reference by URL

# GOOD -- database stores only the URL reference
def save_post(content, image_file):
    key = f"media/{uuid4()}.jpg"
    s3.upload_fileobj(image_file, "feed-media", key)
    cdn_url = f"https://cdn.example.com/{key}"
    db.execute("INSERT INTO posts (content, media_url) VALUES (%s, %s)",
        (content, cdn_url))

Wrong: Unbounded timeline cache with no eviction

# BAD -- Redis memory grows without bound
def add_to_timeline(user_id, post_id):
    redis.lpush(f"user:{user_id}:timeline", post_id)
    # No LTRIM, no TTL -- grows forever, exhausts Redis memory

Correct: Capped timeline with TTL and eviction

# GOOD -- bounded memory usage per user
def add_to_timeline(user_id, post_id):
    key = f"user:{user_id}:timeline"
    pipe = redis.pipeline()
    pipe.lpush(key, post_id)
    pipe.ltrim(key, 0, 799)
    pipe.expire(key, 7 * 24 * 3600)
    pipe.execute()

Wrong: Synchronous fan-out blocking the post API

# BAD -- user waits for fan-out to complete
@app.post("/api/posts")
def create_post(request):
    post = db.insert_post(request.data)
    for fid in get_followers(post.author_id):  # blocks for seconds
        redis.lpush(f"user:{fid}:timeline", post.id)
    return {"status": "ok"}  # user waited 5+ seconds

Correct: Async fan-out via message queue

# GOOD -- return immediately, fan-out asynchronously
@app.post("/api/posts")
def create_post(request):
    post = db.insert_post(request.data)
    kafka.send("post-created", {"post_id": post.id, "author_id": post.author_id})
    return {"status": "ok"}  # instant response

Common Pitfalls

When to Use / When Not to Use

| Use When | Don't Use When | Use Instead |
|---|---|---|
| Building a follow-graph-based feed (Twitter, Instagram, LinkedIn) | Content is purely algorithmic with no follow graph (TikTok For You) | Recommendation engine with collaborative filtering |
| DAU > 10K and feed latency matters (<200ms) | Small team project or MVP with <1K users | Simple SQL query with pagination |
| Posts are short-lived and time-sensitive (news, social updates) | Content is long-form and rarely updated (blog, wiki) | CMS with traditional caching |
| Real-time or near-real-time delivery is required | Batch-processed feeds updated hourly are acceptable | Batch ETL pipeline with scheduled materialization |
| Multiple content types (text, images, video, polls) | Single content type with simple ordering | Standard list API with sorting |

Important Caveats

Related Units