design Netflix system architecture

- Bottom line: A video streaming platform requires five core subsystems: upload/ingest, transcoding pipeline, object storage, CDN edge delivery, and adaptive bitrate playback (HLS/DASH) — the CDN and transcoding pipeline are where most complexity and cost reside.

Video Streaming Platform System Design

How do I design a video streaming platform (Netflix/YouTube clone)?

TL;DR

Bottom line: A video streaming platform requires five core subsystems: upload/ingest, transcoding pipeline, object storage, CDN edge delivery, and adaptive bitrate playback (HLS/DASH) — the CDN and transcoding pipeline are where most complexity and cost reside.
Key tool/command: ffmpeg -i input.mp4 -codec:v libx264 -preset medium -b:v 3000k -hls_time 6 -hls_playlist_type vod output.m3u8
Watch out for: Skipping the encoding ladder and serving a single bitrate — this causes buffering on slow connections and wastes bandwidth on mobile.
Works with: Any cloud provider (AWS MediaConvert + CloudFront, GCP Transcoder + Cloud CDN, Azure Media Services for legacy deployments only, since Microsoft retired it in June 2024), or self-hosted with FFmpeg + nginx + your own CDN.

Constraints

Never serve video from origin servers directly — always use CDN/edge caching; a single 4K stream at 25 Mbps serving 10K concurrent users = 250 Gbps of origin bandwidth [src1]
Always transcode into multiple renditions (encoding ladder) — Netflix uses 5-12 renditions per title, YouTube generates 6+ formats from 144p to 4K [src6]
DRM encryption (Widevine + FairPlay + PlayReady) is mandatory for licensed content — unencrypted streams violate licensing agreements and enable piracy [src5]
Segment duration for HLS/DASH must be 2-10 seconds — Apple recommends 6s for HLS; shorter creates manifest bloat, longer causes slow adaptation [src7]
Video metadata (titles, thumbnails, user data) and video blobs must live in separate storage systems — SQL/NoSQL for metadata, object storage (S3/GCS) for blobs
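
The bandwidth figure in the first constraint can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes every viewer pulls exactly one stream at the stated bitrate and that the cache hit ratio is measured in bytes rather than requests.

```python
def aggregate_gbps(concurrent_viewers: int, stream_mbps: float) -> float:
    """Total delivery bandwidth in Gbps if every viewer pulls one stream."""
    return concurrent_viewers * stream_mbps / 1000.0


def origin_gbps(concurrent_viewers: int, stream_mbps: float,
                cache_hit_ratio: float) -> float:
    """Bandwidth that still reaches origin once the CDN absorbs cache hits.

    cache_hit_ratio is assumed to be a byte hit ratio between 0 and 1.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return aggregate_gbps(concurrent_viewers, stream_mbps) * (1.0 - cache_hit_ratio)


if __name__ == "__main__":
    # 10K viewers of a 25 Mbps 4K stream, with and without a 95% hit ratio
    print(f"no CDN:   {aggregate_gbps(10_000, 25):.1f} Gbps at origin")
    print(f"with CDN: {origin_gbps(10_000, 25, 0.95):.1f} Gbps at origin")
```

With a 95% hit ratio the same audience leaves roughly one twentieth of the traffic at origin, which is why the CDN constraint is listed first.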

Quick Reference

Component	Role	Technology Options	Scaling Strategy
Upload Service	Chunked upload, resumable uploads, deduplication	tus protocol, S3 multipart upload, GCS resumable	Horizontal — stateless workers behind load balancer
Message Queue	Decouples upload from transcoding, ensures at-least-once processing	Apache Kafka, AWS SQS, Google Pub/Sub, RabbitMQ	Partition by video ID; consumer groups per pipeline stage
Transcoding Pipeline	Encodes raw video into multiple bitrate/resolution renditions	FFmpeg, AWS MediaConvert, GCP Transcoder API, HandBrake	Horizontal: spin up workers per job; GPU instances for H.265/AV1
Object Storage	Stores raw uploads and transcoded segments/manifests	AWS S3, Google Cloud Storage, Azure Blob, MinIO	Virtually unlimited; lifecycle policies for raw cleanup
CDN / Edge Cache	Serves video segments from edge PoPs closest to viewers	Netflix Open Connect, CloudFront, Cloudflare, Akamai, Fastly	95%+ cache hit ratio; fill from origin on miss
Metadata Service	Video catalog, search, recommendations, user profiles	PostgreSQL + Elasticsearch, DynamoDB, Cassandra	Read replicas + caching (Redis); shard by user/region
Thumbnail Service	Generates preview thumbnails and trick-play images	FFmpeg scene detection, sprite sheets	Batch processing during transcode; cached on CDN
DRM License Server	Issues decryption keys for encrypted content	Widevine, FairPlay, PlayReady, BuyDRM, PallyCon	Multi-DRM per device; license server scales horizontally
Adaptive Streaming	Client-side bitrate adaptation based on bandwidth	HLS (Apple), DASH (MPEG), CMAF (unified)	Client-driven — player switches renditions automatically
API Gateway	Routes requests, rate limiting, auth, request aggregation	Kong, AWS API Gateway, nginx, Envoy	Horizontal behind L7 load balancer; edge-deployed
Recommendation Engine	Personalized content suggestions	Collaborative filtering, deep learning, two-phase ranking	Offline batch training + real-time feature store (Redis)
Analytics Pipeline	Playback quality metrics, engagement, A/B testing	Kafka + Spark/Flink, ClickHouse, BigQuery	Stream processing for real-time; batch for daily aggregates
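
Because the message queue row promises at-least-once processing, a transcode worker may see the same job more than once. One common way to cope is to make the handler idempotent. The following is a minimal sketch under stated assumptions: the class name is hypothetical, the job shape mirrors the `video_id` / `renditions` fields used elsewhere in this guide, and the in-memory set stands in for a durable store (a database table or Redis) that a real deployment would need.

```python
from typing import Callable, Dict, Set, Tuple


class IdempotentTranscodeHandler:
    """Skips (video_id, rendition) pairs that were already encoded."""

    def __init__(self, transcode: Callable[[str, str], None]) -> None:
        self._transcode = transcode
        # Stand-in for a durable store; an in-memory set is lost on restart.
        self._done: Set[Tuple[str, str]] = set()

    def handle(self, job: Dict) -> int:
        """Process one job message; return how many renditions were encoded."""
        encoded = 0
        for rendition in job["renditions"]:
            key = (job["video_id"], rendition)
            if key in self._done:
                continue  # duplicate delivery, nothing to do
            self._transcode(job["video_id"], rendition)
            self._done.add(key)  # mark only after the encode succeeded
            encoded += 1
        return encoded


if __name__ == "__main__":
    calls = []
    handler = IdempotentTranscodeHandler(lambda vid, r: calls.append((vid, r)))
    job = {"video_id": "v1", "renditions": ["360p", "720p"]}
    print(handler.handle(job))  # first delivery encodes both renditions
    print(handler.handle(job))  # redelivery encodes nothing
```

Marking a rendition as done only after the encode returns means a crash mid-job causes a retry rather than a silently missing rendition.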

Decision Tree

START
|-- Expected concurrent viewers?
|   |-- <1K users
|   |   --> Simple stack: S3 + CloudFront + FFmpeg on EC2/Lambda
|   |-- 1K-100K users
|   |   --> Standard architecture: message queue + worker pool transcoding
|   |   --> Managed CDN (CloudFront/Cloudflare), multi-AZ deployment
|   |-- 100K-1M users
|   |   --> Full microservices: dedicated services per component
|   |   --> Multi-region deployment, custom encoding ladder per title
|   |-- >1M users (Netflix/YouTube scale)
|       --> Custom CDN (Open Connect model): ISP-embedded cache appliances
|       --> Per-title encoding optimization, global service mesh
|
|-- Content type?
|   |-- User-generated (YouTube-style)
|   |   --> Heavy moderation pipeline (ML content scanning)
|   |   --> Aggressive transcode queue, no DRM for most content
|   |-- Licensed/premium (Netflix-style)
|   |   --> DRM mandatory (Widevine + FairPlay + PlayReady)
|   |   --> Higher encoding quality, geo-restrictions
|   |-- Both
|       --> Separate pipelines: fast for UGC, quality-optimized for premium
|
|-- Build vs buy?
    |-- Managed services --> AWS MediaConvert + CloudFront + DynamoDB
    |-- Build from scratch --> FFmpeg workers + Kafka + S3 + custom CDN
    |-- Hybrid (recommended) --> Custom upload/metadata + managed transcode/CDN
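
The first two branches of the tree can be written down as a small function, which is handy for capacity-planning scripts. This is only a sketch: the function name, the tier labels, and the handling of the exact boundary values (100K falls into the larger tier, 1M into the smaller one) are assumptions, since the tree does not say which side the boundaries belong to.

```python
from typing import Dict


def recommend_stack(concurrent_viewers: int, content_type: str) -> Dict[str, object]:
    """Map expected load and content type onto the decision tree's branches.

    content_type is one of "ugc", "licensed", or "both" (hypothetical labels).
    """
    if content_type not in ("ugc", "licensed", "both"):
        raise ValueError("content_type must be 'ugc', 'licensed', or 'both'")

    if concurrent_viewers < 1_000:
        tier = "simple"         # S3 + CloudFront + FFmpeg on EC2/Lambda
    elif concurrent_viewers < 100_000:
        tier = "standard"       # message queue + worker pool, managed CDN
    elif concurrent_viewers <= 1_000_000:
        tier = "microservices"  # multi-region, per-title encoding ladder
    else:
        tier = "custom-cdn"     # Open Connect model, ISP-embedded caches

    return {
        "tier": tier,
        "drm_required": content_type in ("licensed", "both"),
        "moderation_pipeline": content_type in ("ugc", "both"),
        "separate_pipelines": content_type == "both",
    }


if __name__ == "__main__":
    print(recommend_stack(50_000, "licensed"))
```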

Step-by-Step Guide

1. Design the upload and ingest pipeline

Handle large file uploads (often multi-GB) with chunked, resumable uploads. Upload directly to object storage to avoid proxying through your application servers. [src2]

# Upload service: called once a chunked upload has been fully received;
# stores the file in S3 and publishes a transcode job to Kafka
import boto3
from kafka import KafkaProducer
import json, os, uuid

s3 = boto3.client("s3", region_name="us-east-1")
producer = KafkaProducer(
    bootstrap_servers=["kafka:9092"],
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def handle_upload_complete(file_path, user_id, title):
    video_id = str(uuid.uuid4())
    s3_key = f"raw/{video_id}/{os.path.basename(file_path)}"
    s3.upload_file(file_path, "raw-video-uploads", s3_key)
    producer.send("transcode-jobs", value={
        "video_id": video_id, "s3_key": s3_key,
        "user_id": user_id, "title": title,
        "renditions": ["360p", "480p", "720p", "1080p"],
    })
    producer.flush()  # block until the job message has actually been sent
    return video_id

Verify: aws s3 ls s3://raw-video-uploads/raw/{video_id}/ → raw file listed
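
Resumable uploads depend on splitting the file into independently retryable parts. The sketch below plans those parts and works out what is left after an interruption. It is a planning helper only, not an upload client: the function names are hypothetical, the 64 MiB default is an arbitrary choice, and the 5 MiB minimum part size and 10,000-part ceiling are S3's documented multipart limits.

```python
from typing import List, Set, Tuple

MIN_PART_SIZE = 5 * 1024 * 1024   # S3 minimum for every part except the last
MAX_PARTS = 10_000                # S3 maximum parts per multipart upload

Part = Tuple[int, int, int]       # (part_number, byte_offset, length)


def plan_parts(file_size: int, part_size: int = 64 * 1024 * 1024) -> List[Part]:
    """Split a file into numbered byte ranges suitable for multipart upload."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the 5 MiB minimum")
    # Grow the part size if the file would otherwise need more than MAX_PARTS.
    part_size = max(part_size, -(-file_size // MAX_PARTS))
    parts: List[Part] = []
    offset, number = 0, 1
    while offset < file_size:
        length = min(part_size, file_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


def remaining_parts(parts: List[Part], completed: Set[int]) -> List[Part]:
    """Parts still to send after a network interruption."""
    return [p for p in parts if p[0] not in completed]


if __name__ == "__main__":
    plan = plan_parts(150 * 1024 * 1024)
    print(len(plan), "parts planned")
    print("to resume:", remaining_parts(plan, completed={1, 2}))
```

A client would persist the set of completed part numbers locally so that a retry only re-sends what `remaining_parts` returns.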

2. Build the transcoding pipeline

Consume jobs from the message queue and transcode each video into multiple renditions using FFmpeg. Each rendition produces HLS segments (.ts files) and a playlist (.m3u8). [src3] [src7]

# FFmpeg command to transcode to 720p HLS with 6-second segments
ffmpeg -i input.mp4 \
  -vf scale=1280:720 \
  -c:v libx264 -preset medium -b:v 2800k \
  -c:a aac -b:a 128k \
  -hls_time 6 -hls_playlist_type vod \
  -hls_segment_filename "720p/seg_%04d.ts" \
  720p/playlist.m3u8

Verify: ffprobe 720p/playlist.m3u8 → shows stream info with correct resolution
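
A worker usually runs the command above once per rendition, so it helps to generate the argument list from a ladder table instead of hand-writing each invocation. The sketch below only builds the argument lists and does not run FFmpeg. The names are hypothetical; the 360p, 720p, and 1080p bitrates match the Go worker later in this guide, the 480p video bitrate follows the master manifest, and the 480p audio bitrate is an assumption.

```python
from typing import Dict, List, NamedTuple


class Rendition(NamedTuple):
    width: int
    height: int
    video_bitrate: str
    audio_bitrate: str


LADDER: Dict[str, Rendition] = {
    "360p": Rendition(640, 360, "800k", "96k"),
    "480p": Rendition(854, 480, "1400k", "128k"),  # audio bitrate assumed
    "720p": Rendition(1280, 720, "2800k", "128k"),
    "1080p": Rendition(1920, 1080, "5000k", "192k"),
}


def ffmpeg_args(input_path: str, out_dir: str, name: str) -> List[str]:
    """Build the FFmpeg argv for one HLS rendition with 6-second segments."""
    r = LADDER[name]
    return [
        "ffmpeg", "-y", "-i", input_path,
        "-vf", f"scale={r.width}:{r.height}",
        "-c:v", "libx264", "-preset", "medium", "-b:v", r.video_bitrate,
        "-c:a", "aac", "-b:a", r.audio_bitrate,
        "-hls_time", "6", "-hls_playlist_type", "vod",
        "-hls_segment_filename", f"{out_dir}/{name}/seg_%04d.ts",
        f"{out_dir}/{name}/playlist.m3u8",
    ]


if __name__ == "__main__":
    for name in LADDER:
        print(" ".join(ffmpeg_args("input.mp4", "out", name)))
```

Passing the list to `subprocess.run(args, check=True)` avoids shell quoting problems with file names; the output directory for each rendition has to exist before FFmpeg runs.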

3. Generate the master HLS manifest

Create a master playlist that references all rendition playlists, enabling adaptive bitrate switching in the player. [src7]

#EXTM3U
#EXT-X-VERSION:3
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
360p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1400000,RESOLUTION=854x480
480p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2800000,RESOLUTION=1280x720
720p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
1080p/playlist.m3u8

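Verify: ffplay master.m3u8 → video plays from the master playlist; to observe adaptive quality switching, test in an ABR-capable player such as hls.js or Safari while throttling bandwidth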
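
To make the player's side of adaptive bitrate concrete, the sketch below parses a master playlist like the one above and picks the highest variant that fits a measured throughput. Real players use more elaborate heuristics (buffer level, throughput history), so treat this as a simplified illustration: the function names and the 0.8 safety factor are assumptions.

```python
import re
from typing import List, Tuple

MASTER = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
360p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1400000,RESOLUTION=854x480
480p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2800000,RESOLUTION=1280x720
720p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
1080p/playlist.m3u8
"""

Variant = Tuple[int, str]  # (declared BANDWIDTH in bits/s, playlist URI)


def parse_variants(master: str) -> List[Variant]:
    """Extract (bandwidth, uri) pairs, sorted from lowest to highest bandwidth."""
    lines = [ln.strip() for ln in master.splitlines() if ln.strip()]
    variants: List[Variant] = []
    for i, line in enumerate(lines):
        if not line.startswith("#EXT-X-STREAM-INF:"):
            continue
        # The leading [:,] keeps AVERAGE-BANDWIDTH from matching.
        match = re.search(r"[:,]BANDWIDTH=(\d+)", line)
        if match and i + 1 < len(lines) and not lines[i + 1].startswith("#"):
            variants.append((int(match.group(1)), lines[i + 1]))
    return sorted(variants)


def pick_variant(variants: List[Variant], measured_bps: float,
                 safety: float = 0.8) -> Variant:
    """Highest variant within safety * measured_bps, else the lowest one."""
    if not variants:
        raise ValueError("master playlist has no variants")
    budget = measured_bps * safety
    best = variants[0]
    for variant in variants:
        if variant[0] <= budget:
            best = variant
    return best


if __name__ == "__main__":
    ladder = parse_variants(MASTER)
    print(pick_variant(ladder, measured_bps=4_000_000))
    print(pick_variant(ladder, measured_bps=500_000))
```

Falling back to the lowest variant when nothing fits the budget is also what gives a fast first frame on a cold start.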

4. Configure CDN and edge caching

Place your CDN in front of object storage. Video segments are immutable and should be cached aggressively (1 year). Manifests need shorter TTLs (30-60s) to allow content updates. [src1]

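# nginx edge config: aggressive segment caching, short manifest TTL
# Assumes a proxy_cache_path directive in the http block defines the "video_cache" zone
# and that "origin_storage" (hypothetical name) is an upstream pointing at object storage.
location ~* \.ts$ {
    proxy_pass http://origin_storage;
    proxy_cache video_cache;
    proxy_cache_valid 200 365d;
    add_header Cache-Control "public, max-age=31536000, immutable";
    add_header X-Cache $upstream_cache_status;  # HIT/MISS, read by the Verify step
}
location ~* \.m3u8$ {
    proxy_pass http://origin_storage;
    proxy_cache video_cache;
    proxy_cache_valid 200 60s;
    add_header Cache-Control "public, max-age=60";
    add_header X-Cache $upstream_cache_status;
}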

Verify: curl -sI https://cdn.example.com/{video_id}/720p/seg_0001.ts | grep X-Cache → HIT on second request
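
The same split between immutable segments and short-lived manifests can also be applied when objects are written to storage, so that any CDN in front of the bucket sees consistent headers. A minimal sketch, assuming a hypothetical helper name and a conservative `no-store` default for anything that is neither a segment nor a manifest:

```python
from pathlib import PurePosixPath


def cache_control_for(path: str) -> str:
    """Return the Cache-Control value for an object key or URL path."""
    ext = PurePosixPath(path).suffix.lower()
    if ext == ".ts":
        # Segments never change once written, so cache them for a year.
        return "public, max-age=31536000, immutable"
    if ext == ".m3u8":
        # Manifests stay short-lived so content can be updated or pulled.
        return "public, max-age=60"
    return "no-store"  # assumption: do not cache anything unrecognised


if __name__ == "__main__":
    for key in ("abc/720p/seg_0001.ts", "abc/master.m3u8", "abc/notes.txt"):
        print(key, "->", cache_control_for(key))
```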

5. Implement the playback API and metadata service

Build the API that serves video metadata and playback URLs. Use signed URLs so only authenticated users can access content. [src2] [src6]

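// Playback API: returns a signed CDN URL for the master manifest.
// Assumes an Express app, a node-postgres client (db), and that getSignedUrl is the
// CloudFront signer, with the key pair supplied via hypothetical environment variables.
const { getSignedUrl } = require("@aws-sdk/cloudfront-signer");

app.get("/api/v1/videos/:videoId/playback", async (req, res) => {
  const video = await db.query("SELECT id FROM videos WHERE id = $1 AND status = 'ready'", [req.params.videoId]);
  if (!video.rows.length) return res.status(404).json({ error: "Not found" });
  const manifestUrl = getSignedUrl({
    url: `https://cdn.example.com/${req.params.videoId}/master.m3u8`,
    keyPairId: process.env.CF_KEY_PAIR_ID,
    privateKey: process.env.CF_PRIVATE_KEY,
    dateLessThan: new Date(Date.now() + 4 * 3600 * 1000).toISOString(), // valid for 4 hours
  });
  // This signs the manifest URL only; rendition playlists and segments need their own
  // authorization, for example signed cookies or a wildcard custom policy.
  res.json({ id: video.rows[0].id, manifest_url: manifestUrl });
});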

Verify: curl http://localhost:3000/api/v1/videos/{id}/playback → JSON with signed manifest_url
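
For readers who want to see what a signed URL amounts to, the sketch below implements a generic HMAC-based scheme: the server signs the path plus an expiry timestamp, and the edge recomputes the signature before serving. This is not CloudFront's signature format or any specific CDN's; the query parameter names and function names are assumptions made for illustration.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit


def sign_url(url: str, secret: bytes, expires_at: int) -> str:
    """Append an expiry and an HMAC-SHA256 signature over path + expiry."""
    message = f"{urlsplit(url).path}:{expires_at}".encode()
    signature = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return f"{url}?{urlencode({'expires': expires_at, 'sig': signature})}"


def verify_url(signed_url: str, secret: bytes, now: Optional[int] = None) -> bool:
    """Check expiry and signature; returns False on any malformed input."""
    parts = urlsplit(signed_url)
    query = parse_qs(parts.query)
    try:
        expires_at = int(query["expires"][0])
        signature = query["sig"][0]
    except (KeyError, IndexError, ValueError):
        return False
    current = int(time.time()) if now is None else now
    if current >= expires_at:
        return False
    message = f"{parts.path}:{expires_at}".encode()
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    key = b"demo-secret"  # placeholder; load real keys from a secret store
    url = sign_url("https://cdn.example.com/abc/master.m3u8", key,
                   expires_at=int(time.time()) + 4 * 3600)
    print(url)
    print(verify_url(url, key))
```

Because the path is part of the signed message, a valid signature for one video cannot be reused for another.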

6. Add DRM encryption for premium content

For licensed content, encrypt segments during transcoding and set up a multi-DRM license server (Widevine for Chrome/Android, FairPlay for Safari/iOS, PlayReady for Edge/Windows). [src5]

# Encrypt HLS segments with AES-128 (basic; use Widevine/FairPlay for production)
openssl rand 16 > enc.key
echo "https://keys.example.com/${VIDEO_ID}/enc.key" > enc_keyinfo.txt
echo "enc.key" >> enc_keyinfo.txt
ffmpeg -i input.mp4 -c:v libx264 -b:v 3000k -c:a aac -b:a 128k \
  -hls_time 6 -hls_playlist_type vod \
  -hls_key_info_file enc_keyinfo.txt output_encrypted.m3u8

Verify: grep EXT-X-KEY output_encrypted.m3u8 → shows METHOD=AES-128,URI=...
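
When encryption runs inside a worker rather than a shell session, the key and key info file can be produced programmatically. The sketch below writes the same two files as the shell commands above and adds the optional third line of the key info file, an explicit IV in hex. The function name and file names are assumptions, and as with the shell version this is plain AES-128 HLS encryption rather than a full DRM system.

```python
import secrets
from pathlib import Path


def write_hls_key_info(out_dir: str, key_uri: str) -> Path:
    """Create enc.key and enc_keyinfo.txt for FFmpeg's -hls_key_info_file.

    Key info file layout: line 1 is the key URI written into the playlist,
    line 2 is the local path FFmpeg reads the key from, line 3 is an
    optional IV as 32 hex characters.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    key_path = out / "enc.key"
    key_path.write_bytes(secrets.token_bytes(16))  # 128-bit AES key

    iv_hex = secrets.token_hex(16)                 # 16 random bytes as hex
    info_path = out / "enc_keyinfo.txt"
    info_path.write_text(f"{key_uri}\n{key_path}\n{iv_hex}\n")
    return info_path


if __name__ == "__main__":
    info = write_hls_key_info("keys/demo", "https://keys.example.com/demo/enc.key")
    print(info.read_text())
```

The key file itself must never be uploaded alongside the segments; only the key server behind the URI should hold it.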

Code Examples

Python: Video Upload Pipeline with S3 and Kafka

# Input:  Raw video file path, user metadata
# Output: transcode job dict (including video_id), published to Kafka

import boto3, json, os, uuid
from kafka import KafkaProducer

class VideoUploadPipeline:
    def __init__(self, s3_bucket, kafka_brokers):
        self.s3 = boto3.client("s3")
        self.bucket = s3_bucket
        self.producer = KafkaProducer(
            bootstrap_servers=kafka_brokers,
            value_serializer=lambda v: json.dumps(v).encode(),
        )

    def upload(self, filepath, user_id, title):
        video_id = str(uuid.uuid4())
        s3_key = f"raw/{video_id}/{os.path.basename(filepath)}"
        self.s3.upload_file(filepath, self.bucket, s3_key)
        # Carry the user metadata with the job so downstream stages can record it
        job = {"video_id": video_id, "s3_key": s3_key,
               "user_id": user_id, "title": title,
               "renditions": ["360p", "480p", "720p", "1080p"]}
        self.producer.send("transcode-jobs", value=job)
        self.producer.flush()  # block until the job message has been sent
        return job

JavaScript: HLS Master Manifest Generator

// Input:  video ID, list of completed renditions with metadata
// Output: M3U8 master manifest string

function generateMasterManifest(videoId, renditions) {
  const lines = ["#EXTM3U", "#EXT-X-VERSION:3", ""];
  const sorted = [...renditions].sort((a, b) => a.bandwidth - b.bandwidth);
  for (const r of sorted) {
    lines.push(`#EXT-X-STREAM-INF:BANDWIDTH=${r.bandwidth},RESOLUTION=${r.resolution},CODECS="${r.codecs}"`);
    lines.push(`${r.name}/playlist.m3u8`);
    lines.push("");
  }
  return lines.join("\n");
}

Go: Transcoding Worker with Job Queue

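// Input:  Transcode job from Kafka (video_id, s3_key, renditions)
// Output: Transcoded HLS segments uploaded to S3
// This excerpt covers the FFmpeg step only; the Kafka consumer and S3 upload are omitted.

package worker

import (
    "fmt"
    "os"
    "os/exec"
    "path/filepath"
)

// Rendition describes one rung of the encoding ladder.
type Rendition struct {
    Width, Height int
    VideoBitrate  string
    AudioBitrate  string
}

// 480p is included because upload jobs request it; its 1400k video bitrate follows
// the master manifest, and its audio bitrate is an assumption.
var ladder = map[string]Rendition{
    "360p":  {640, 360, "800k", "96k"},
    "480p":  {854, 480, "1400k", "128k"},
    "720p":  {1280, 720, "2800k", "128k"},
    "1080p": {1920, 1080, "5000k", "192k"},
}

func transcodeRendition(inputPath, videoID, name string, r Rendition) error {
    outDir := filepath.Join("/tmp/transcode", videoID, name)
    if err := os.MkdirAll(outDir, 0o755); err != nil {
        return fmt.Errorf("create output dir: %w", err)
    }
    // -y is a global option, so it goes before the input rather than after the output.
    cmd := exec.Command("ffmpeg", "-y", "-i", inputPath,
        "-vf", fmt.Sprintf("scale=%d:%d", r.Width, r.Height),
        "-c:v", "libx264", "-preset", "medium", "-b:v", r.VideoBitrate,
        "-c:a", "aac", "-b:a", r.AudioBitrate,
        "-hls_time", "6", "-hls_playlist_type", "vod",
        "-hls_segment_filename", filepath.Join(outDir, "seg_%04d.ts"),
        filepath.Join(outDir, "playlist.m3u8"))
    return cmd.Run()
}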

Anti-Patterns

Wrong: Synchronous transcoding during upload

# BAD — blocks the upload response until transcoding completes (minutes to hours)
@app.post("/upload")
def upload_video(file):
    save_to_disk(file)
    transcode_all_renditions(file.filename)    # Blocks for 5-30 minutes!
    return {"status": "ready"}                 # User waits forever, request times out

Correct: Async transcoding via message queue

# GOOD — upload returns immediately, transcoding happens asynchronously
@app.post("/upload")
def upload_video(file):
    video_id = save_to_s3(file)
    publish_to_queue("transcode-jobs", {"video_id": video_id})
    return {"status": "processing", "video_id": video_id}   # Returns in <1s

Wrong: Single bitrate streaming

# BAD — serves one resolution regardless of user bandwidth
def get_video_url(video_id):
    return f"https://cdn.example.com/{video_id}/video_1080p.mp4"
    # Mobile users on 3G get constant buffering

Correct: Adaptive bitrate with HLS/DASH encoding ladder

# GOOD — serves master manifest; player picks the right rendition
def get_video_url(video_id):
    return f"https://cdn.example.com/{video_id}/master.m3u8"
    # Player adapts quality based on available bandwidth

Wrong: Storing video blobs in a database

# BAD — stores video binary in PostgreSQL BYTEA column
def store_video(video_id, video_bytes):
    db.execute("INSERT INTO videos (id, data) VALUES (%s, %s)", (video_id, video_bytes))
    # Database bloats to TB, backups take hours, queries slow to a crawl

Correct: Object storage for blobs, database for metadata only

# GOOD — S3 for video files, PostgreSQL for metadata only
def store_video(video_id, filepath, title, user_id):
    s3.upload_file(filepath, "video-bucket", f"raw/{video_id}/{os.path.basename(filepath)}")
    db.execute("INSERT INTO videos (id, title, user_id, status) VALUES (%s,%s,%s,'uploaded')",
               (video_id, title, user_id))

Wrong: No CDN — serving directly from origin

# BAD — all viewers worldwide hit the single origin region
def get_stream_url(video_id):
    return f"https://s3.us-east-1.amazonaws.com/videos/{video_id}/master.m3u8"
    # 10K viewers x 5 Mbps = 50 Gbps origin bandwidth = 22,500 GB/hour, about $2,025/hour at $0.09/GB S3 egress

Correct: CDN edge delivery with origin shielding

# GOOD — CDN serves from 200+ edge PoPs, origin handles <5% of requests
def get_stream_url(video_id):
    return f"https://cdn.example.com/{video_id}/master.m3u8"
    # 95%+ cache hit ratio, $0.01-0.02/GB vs $0.09/GB from S3 direct
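
The egress figures in this comparison follow from a short unit conversion, sketched below so the numbers can be re-run for other audiences and prices. The function names are hypothetical, decimal gigabytes are assumed (1 GB = 1000 MB), and the per-GB prices are the illustrative ones quoted above, not current list prices.

```python
def egress_gb_per_hour(viewers: int, stream_mbps: float) -> float:
    """Convert concurrent viewers at a given bitrate into GB transferred per hour."""
    megabytes_per_second = viewers * stream_mbps / 8  # 8 bits per byte
    return megabytes_per_second * 3600 / 1000         # MB/s -> GB/hour


def hourly_cost(viewers: int, stream_mbps: float, usd_per_gb: float) -> float:
    """Egress cost per hour at a flat per-GB price."""
    return egress_gb_per_hour(viewers, stream_mbps) * usd_per_gb


if __name__ == "__main__":
    gb = egress_gb_per_hour(10_000, 5)
    print(f"{gb:,.0f} GB/hour")
    print(f"origin direct at $0.09/GB: ${hourly_cost(10_000, 5, 0.09):,.0f}/hour")
    print(f"CDN at $0.02/GB:           ${hourly_cost(10_000, 5, 0.02):,.0f}/hour")
```

For 10K viewers at 5 Mbps this works out to 22,500 GB per hour, so the per-GB price difference translates directly into a four- to nine-fold difference in hourly cost.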

Common Pitfalls

Not implementing resumable uploads: Users uploading multi-GB files lose all progress on network interruption. Fix: Use the tus protocol or S3 multipart upload with client-side retry per chunk. [src2]
Fixed encoding ladder ignoring content complexity: An animated show looks sharp at 1.5 Mbps, but action movies need 4+ Mbps at the same resolution. Fix: Implement per-title encoding optimization (Netflix reported 20-30% savings). [src3]
Ignoring cold start latency in playback: First segment takes too long to load because the player starts at high quality. Fix: Configure the player to start at a low rendition and ramp up. [src7]
Missing video processing status tracking: Users see "processing" forever with no progress indication. Fix: Emit progress events from transcoding workers and expose via WebSocket or polling. [src6]
No origin shielding on the CDN: Every edge PoP cache miss hits origin directly, creating thundering herd. Fix: Add a shield/mid-tier cache layer between edge and origin. [src1]
Skipping thumbnail generation: Users cannot preview content, reducing engagement. Fix: Generate thumbnails at key intervals during transcoding using FFmpeg scene detection. [src6]
Long cache TTLs on manifests: Cannot update or remove content (DMCA, corrections) when manifests are cached for days. Fix: Short TTLs (30-60s) on .m3u8 manifests, aggressive caching on immutable .ts segments. [src7]
No geographic routing: All traffic routes to a single region, adding 100-300ms latency for distant users. Fix: Multi-region deployment with DNS-based latency routing. [src4]
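
For the status-tracking pitfall, progress can be derived from how many segments each rendition has produced so far. The sketch below is one simple way to aggregate that. The class and method names are hypothetical, it assumes the expected segment count per rendition is known up front (for example, duration divided by segment length), and the transport to the client (WebSocket or polling) is left out.

```python
from typing import Dict


class TranscodeProgress:
    """Aggregates per-rendition segment counts into one percentage."""

    def __init__(self, expected_segments: Dict[str, int]) -> None:
        self._total = dict(expected_segments)
        self._done = {name: 0 for name in expected_segments}

    def segment_done(self, rendition: str, count: int = 1) -> None:
        """Record finished segments, capped at the expected total."""
        self._done[rendition] = min(self._total[rendition],
                                    self._done[rendition] + count)

    def percent(self) -> float:
        total = sum(self._total.values())
        return 100.0 * sum(self._done.values()) / total if total else 0.0

    def status(self) -> str:
        return "ready" if self._total and self.percent() >= 100.0 else "processing"


if __name__ == "__main__":
    progress = TranscodeProgress({"360p": 10, "720p": 10})
    progress.segment_done("360p", 10)
    print(progress.percent(), progress.status())
    progress.segment_done("720p", 10)
    print(progress.percent(), progress.status())
```

Workers would call `segment_done` as FFmpeg writes each segment, and the API would expose `percent()` alongside the video's status field.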

Diagnostic Commands

# Validate HLS manifest structure
ffprobe -v quiet -print_format json -show_streams master.m3u8

# Check video codec, bitrate, and resolution
ffprobe -v error -show_entries stream=codec_name,width,height,bit_rate -of csv input.mp4

# Check whether a segment is served from CDN cache (HIT/MISS; the header name varies by CDN)
curl -sI https://cdn.example.com/video_id/720p/seg_0001.ts | grep -i x-cache

# Monitor Kafka transcoding queue depth
kafka-consumer-groups --bootstrap-server kafka:9092 --group transcode-workers --describe

# Check S3 bucket size for cost estimation
aws s3 ls s3://transcoded-video-segments --recursive --summarize | tail -2

# Test HLS playback
ffplay https://cdn.example.com/video_id/master.m3u8

# Confirm the playlist declares an encryption key (AES-128 HLS)
grep EXT-X-KEY output_encrypted.m3u8

Version History & Compatibility

Technology	Status	Notes
H.264/AVC + AAC	Universal standard	Maximum device compatibility; baseline for all platforms
H.265/HEVC	Widely supported (2020+)	30-40% smaller files; Apple ecosystem strong; licensing fees
AV1	Growing adoption (2023+)	50-60% smaller than H.264; royalty-free; encoding 50-100x slower
HLS	Industry standard	Apple-developed; works everywhere with wide player support
DASH	Industry standard	MPEG-developed; primary for non-Apple; not native on iOS
CMAF	Emerging standard	Unifies HLS + DASH segments; can cut storage by up to 50% when it replaces separate HLS and DASH copies
Widevine L1/L3	Current DRM	Google-backed; Chrome, Android, smart TVs, Chromecast
FairPlay Streaming	Current DRM	Apple-only; required for Safari and iOS DRM playback
PlayReady	Current DRM	Microsoft-backed; Edge, Xbox, Windows, some smart TVs

When to Use / When Not to Use

Use When	Don't Use When	Use Instead
Building a video-on-demand (VOD) platform at any scale	Building real-time video conferencing (Zoom/Teams)	WebRTC-based SFU/MCU architecture
Content is pre-recorded and can be transcoded ahead of time	Ultra-low latency live streaming required (<3s)	LL-HLS, WebRTC, or RTMP-based live streaming
Adaptive bitrate playback across diverse devices and networks	Streaming short clips (<30s) where quality switching has no time to help	Progressive download (single MP4 file)
Licensed content requires DRM protection	Internal/private video sharing within a small team	Simple S3 presigned URLs or Cloudflare Stream
>10K concurrent viewers expected	Hosting a few dozen training videos for a small org	Managed platforms (Vimeo, Wistia, YouTube unlisted)

Important Caveats

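AV1 software encoding is far more CPU-intensive than H.264 (the 50-100x figure applies to slow, quality-focused encoder settings; faster encoders narrow the gap considerably). Use hardware encoders where the GPU generation supports AV1 (recent NVIDIA NVENC and Intel QSV) or cloud encoding services for production; CPU-only AV1 at high-quality settings is only viable for batch processing with long deadlines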
CDN egress costs dominate the bill at scale — Netflix spends ~$0.01/GB via Open Connect peering vs $0.05-0.09/GB on commercial CDNs; at 1PB/day this difference is millions per month
DRM adds significant implementation complexity and ongoing licensing costs — evaluate whether your content actually needs DRM before committing; user-generated content platforms typically do not need DRM
Per-title encoding optimization delivers 20-30% bandwidth savings but requires a sophisticated analysis pipeline — start with a fixed ladder and optimize later
Video transcoding is one of the most compute-intensive workloads in software — a 1-hour 4K video can take 4-8 hours to encode on a single CPU instance across all renditions