Video Streaming Platform System Design
How do I design a video streaming platform (Netflix/YouTube clone)?
TL;DR
- Bottom line: A video streaming platform requires five core subsystems: upload/ingest, transcoding pipeline, object storage, CDN edge delivery, and adaptive bitrate playback (HLS/DASH) — the CDN and transcoding pipeline are where most complexity and cost reside.
- Key tool/command:
ffmpeg -i input.mp4 -codec:v libx264 -preset medium -b:v 3000k -hls_time 6 -hls_playlist_type vod output.m3u8
- Watch out for: Skipping the encoding ladder and serving a single bitrate, which causes buffering on slow connections and wastes bandwidth on mobile.
- Works with: Any cloud provider (AWS MediaConvert + CloudFront, GCP Transcoder + Cloud CDN, Azure Media Services), or self-hosted with FFmpeg + nginx + your own CDN.
Constraints
- Never serve video from origin servers directly — always use CDN/edge caching; a single 4K stream at 25 Mbps serving 10K concurrent users = 250 Gbps of origin bandwidth [src1]
- Always transcode into multiple renditions (encoding ladder) — Netflix uses 5-12 renditions per title, YouTube generates 6+ formats from 144p to 4K [src6]
- DRM encryption (Widevine + FairPlay + PlayReady) is mandatory for licensed content — unencrypted streams violate licensing agreements and enable piracy [src5]
- Segment duration for HLS/DASH must be 2-10 seconds — Apple recommends 6s for HLS; shorter creates manifest bloat, longer causes slow adaptation [src7]
- Video metadata (titles, thumbnails, user data) and video blobs must live in separate storage systems — SQL/NoSQL for metadata, object storage (S3/GCS) for blobs
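The origin-bandwidth figure in the first constraint is simple arithmetic, and a small sketch makes it easy to rerun with other numbers. The function names below are illustrative, and the cache hit ratio is assumed to be measured by bytes rather than by request count.

```python
def aggregate_gbps(stream_mbps: float, concurrent_viewers: int) -> float:
    """Total bandwidth in Gbps if every viewer pulls the stream from one place."""
    return stream_mbps * concurrent_viewers / 1000.0


def origin_gbps(stream_mbps: float, concurrent_viewers: int,
                cache_hit_ratio: float) -> float:
    """Bandwidth that still reaches origin when a CDN absorbs the cache hits."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    total = aggregate_gbps(stream_mbps, concurrent_viewers)
    return total * (1.0 - cache_hit_ratio)


if __name__ == "__main__":
    # 4K at 25 Mbps, 10K concurrent viewers, no CDN: 250 Gbps
    print(aggregate_gbps(25, 10_000))
    # Same load behind a CDN with a 95% byte hit ratio: about 12.5 Gbps
    print(origin_gbps(25, 10_000, 0.95))
```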
Quick Reference
| Component | Role | Technology Options | Scaling Strategy |
|---|---|---|---|
| Upload Service | Chunked upload, resumable uploads, deduplication | tus protocol, S3 multipart upload, GCS resumable | Horizontal — stateless workers behind load balancer |
| Message Queue | Decouples upload from transcoding, ensures at-least-once processing | Apache Kafka, AWS SQS, Google Pub/Sub, RabbitMQ | Partition by video ID; consumer groups per pipeline stage |
| Transcoding Pipeline | Encodes raw video into multiple bitrate/resolution renditions | FFmpeg, AWS MediaConvert, GCP Transcoder API, Handbrake | Horizontal — spin up workers per job; GPU instances for H.265/AV1 |
| Object Storage | Stores raw uploads and transcoded segments/manifests | AWS S3, Google Cloud Storage, Azure Blob, MinIO | Virtually unlimited; lifecycle policies for raw cleanup |
| CDN / Edge Cache | Serves video segments from edge PoPs closest to viewers | Netflix Open Connect, CloudFront, Cloudflare, Akamai, Fastly | 95%+ cache hit ratio; fill from origin on miss |
| Metadata Service | Video catalog, search, recommendations, user profiles | PostgreSQL + Elasticsearch, DynamoDB, Cassandra | Read replicas + caching (Redis); shard by user/region |
| Thumbnail Service | Generates preview thumbnails and trick-play images | FFmpeg scene detection, sprite sheets | Batch processing during transcode; cached on CDN |
| DRM License Server | Issues decryption keys for encrypted content | Widevine, FairPlay, PlayReady, BuyDRM, PallyCon | Multi-DRM per device; license server scales horizontally |
| Adaptive Streaming | Client-side bitrate adaptation based on bandwidth | HLS (Apple), DASH (MPEG), CMAF (unified) | Client-driven — player switches renditions automatically |
| API Gateway | Routes requests, rate limiting, auth, request aggregation | Kong, AWS API Gateway, nginx, Envoy | Horizontal behind L7 load balancer; edge-deployed |
| Recommendation Engine | Personalized content suggestions | Collaborative filtering, deep learning, two-phase ranking | Offline batch training + real-time feature store (Redis) |
| Analytics Pipeline | Playback quality metrics, engagement, A/B testing | Kafka + Spark/Flink, ClickHouse, BigQuery | Stream processing for real-time; batch for daily aggregates |
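The "partition by video ID" strategy in the Message Queue row means every job for a given video lands on the same partition, so its events stay ordered. The sketch below illustrates the idea with a stable hash. It is not the hash any real Kafka client uses (clients apply their own hash to the message key), and `partition_for` is a hypothetical helper.

```python
import hashlib


def partition_for(video_id: str, num_partitions: int) -> int:
    """Map a video ID to a partition index in a stable, process-independent way."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # Python's built-in hash() is randomized per process, so use a real digest.
    digest = hashlib.sha256(video_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


if __name__ == "__main__":
    # The same ID always maps to the same partition.
    print(partition_for("video-123", 12) == partition_for("video-123", 12))
```

In practice, passing the video ID as the message key lets the client library do this assignment for you.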
Decision Tree
START
|-- Expected concurrent viewers?
| |-- <1K users
| | --> Simple stack: S3 + CloudFront + FFmpeg on EC2/Lambda
| |-- 1K-100K users
| | --> Standard architecture: message queue + worker pool transcoding
| | --> Managed CDN (CloudFront/Cloudflare), multi-AZ deployment
| |-- 100K-1M users
| | --> Full microservices: dedicated services per component
| | --> Multi-region deployment, custom encoding ladder per title
| |-- >1M users (Netflix/YouTube scale)
| --> Custom CDN (Open Connect model): ISP-embedded cache appliances
| --> Per-title encoding optimization, global service mesh
|
|-- Content type?
| |-- User-generated (YouTube-style)
| | --> Heavy moderation pipeline (ML content scanning)
| | --> Aggressive transcode queue, no DRM for most content
| |-- Licensed/premium (Netflix-style)
| | --> DRM mandatory (Widevine + FairPlay + PlayReady)
| | --> Higher encoding quality, geo-restrictions
| |-- Both
| --> Separate pipelines: fast for UGC, quality-optimized for premium
|
|-- Build vs buy?
|-- Managed services --> AWS MediaConvert + CloudFront + DynamoDB
|-- Build from scratch --> FFmpeg workers + Kafka + S3 + custom CDN
|-- Hybrid (recommended) --> Custom upload/metadata + managed transcode/CDN
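The first two branches of the tree can be written as a small lookup, which is handy for capacity-planning scripts. This is a sketch only: the tier labels are made up for illustration, and because the tree's ranges share their endpoints, the boundary handling (exactly 100K goes to the larger tier, exactly 1M stays in the microservices tier) is an assumption.

```python
def recommend_tier(concurrent_viewers: int) -> str:
    """Return an architecture tier label for an expected concurrent audience."""
    if concurrent_viewers < 0:
        raise ValueError("concurrent_viewers must be non-negative")
    if concurrent_viewers < 1_000:
        return "simple"          # S3 + CloudFront + FFmpeg on EC2/Lambda
    if concurrent_viewers < 100_000:
        return "standard"        # message queue + worker pool, managed CDN
    if concurrent_viewers <= 1_000_000:
        return "microservices"   # dedicated services, multi-region
    return "custom-cdn"          # ISP-embedded caches, per-title encoding


def needs_drm(content_type: str) -> bool:
    """DRM is mandatory for licensed content; most UGC goes without it."""
    rules = {"ugc": False, "licensed": True, "both": True}
    try:
        return rules[content_type]
    except KeyError:
        raise ValueError(f"unknown content type: {content_type!r}") from None


if __name__ == "__main__":
    print(recommend_tier(50_000), needs_drm("licensed"))
```

For the "both" case the function returns True because the premium pipeline still needs DRM, even though the UGC pipeline does not.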
Step-by-Step Guide
1. Design the upload and ingest pipeline
Handle large file uploads (often multi-GB) with chunked, resumable uploads. Upload directly to object storage to avoid proxying through your application servers. [src2]
# Upload service: accepts chunked uploads, stores in S3, publishes job to Kafka
import json
import uuid

import boto3
from kafka import KafkaProducer

s3 = boto3.client("s3", region_name="us-east-1")
producer = KafkaProducer(
    bootstrap_servers=["kafka:9092"],
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def handle_upload_complete(file_path, user_id, title):
    video_id = str(uuid.uuid4())
    s3_key = f"raw/{video_id}/{file_path.split('/')[-1]}"
    s3.upload_file(file_path, "raw-video-uploads", s3_key)
    producer.send("transcode-jobs", value={
        "video_id": video_id, "s3_key": s3_key,
        "renditions": ["360p", "480p", "720p", "1080p"],
    })
    producer.flush()  # send() is asynchronous; flush so the job is not lost on exit
    return video_id
Verify: aws s3 ls s3://raw-video-uploads/raw/{video_id}/ → raw file listed
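Chunked, resumable uploads depend on splitting the file into numbered byte ranges that can be retried independently. The sketch below plans those ranges using S3's multipart limits (parts of at least 5 MiB except the last, at most 10,000 parts). The 64 MiB default and the `plan_parts` name are assumptions for illustration, and no network calls are made.

```python
from typing import List, Tuple

MIN_PART_SIZE = 5 * 1024 * 1024   # S3 minimum for every part except the last
MAX_PARTS = 10_000                # S3 maximum number of parts per upload


def plan_parts(file_size: int,
               part_size: int = 64 * 1024 * 1024) -> List[Tuple[int, int, int]]:
    """Return (part_number, first_byte, last_byte_inclusive) for each part."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the 5 MiB minimum")
    # Grow the part size until the file fits within the part-count limit.
    while -(-file_size // part_size) > MAX_PARTS:
        part_size *= 2
    parts = []
    start = 0
    while start < file_size:
        end = min(start + part_size, file_size) - 1
        parts.append((len(parts) + 1, start, end))
        start = end + 1
    return parts


if __name__ == "__main__":
    for number, first, last in plan_parts(150 * 1024 * 1024):
        print(number, first, last)
```

A client that records which part numbers succeeded can resume after a network failure by re-sending only the missing ranges.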
2. Build the transcoding pipeline
Consume jobs from the message queue and transcode each video into multiple renditions using FFmpeg. Each rendition produces HLS segments (.ts files) and a playlist (.m3u8). [src3] [src7]
# FFmpeg command to transcode to 720p HLS with 6-second segments
mkdir -p 720p   # the output directory must exist before FFmpeg writes segments
ffmpeg -i input.mp4 \
  -vf scale=1280:720 \
  -c:v libx264 -preset medium -b:v 2800k \
  -c:a aac -b:a 128k \
  -hls_time 6 -hls_playlist_type vod \
  -hls_segment_filename "720p/seg_%04d.ts" \
  720p/playlist.m3u8
Verify: ffprobe 720p/playlist.m3u8 → shows stream info with correct resolution
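A worker usually builds one such command per rung of the ladder instead of hard-coding each. The sketch below generates the argument list from a ladder table. The 720p values match the command above and the 360p and 1080p values are taken from the Go example later in this article. `ffmpeg_args` and `LADDER` are illustrative names, and the function only builds the arguments without running FFmpeg.

```python
from typing import List, NamedTuple


class Rendition(NamedTuple):
    width: int
    height: int
    video_bitrate: str
    audio_bitrate: str


LADDER = {
    "360p": Rendition(640, 360, "800k", "96k"),
    "720p": Rendition(1280, 720, "2800k", "128k"),
    "1080p": Rendition(1920, 1080, "5000k", "192k"),
}


def ffmpeg_args(input_path: str, out_dir: str, r: Rendition) -> List[str]:
    """Build the FFmpeg argument list for one HLS rendition."""
    return [
        "ffmpeg", "-y", "-i", input_path,
        "-vf", f"scale={r.width}:{r.height}",
        "-c:v", "libx264", "-preset", "medium", "-b:v", r.video_bitrate,
        "-c:a", "aac", "-b:a", r.audio_bitrate,
        "-hls_time", "6", "-hls_playlist_type", "vod",
        "-hls_segment_filename", f"{out_dir}/seg_%04d.ts",
        f"{out_dir}/playlist.m3u8",
    ]


if __name__ == "__main__":
    for name, rendition in LADDER.items():
        print(" ".join(ffmpeg_args("input.mp4", name, rendition)))
```

The resulting list can be passed to `subprocess.run(args, check=True)` once the output directory exists.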
3. Generate the master HLS manifest
Create a master playlist that references all rendition playlists, enabling adaptive bitrate switching in the player. [src7]
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
360p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1400000,RESOLUTION=854x480
480p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2800000,RESOLUTION=1280x720
720p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
1080p/playlist.m3u8
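Verify: ffplay master.m3u8 → plays the video (ffplay picks one variant; to observe adaptive quality switching, load the manifest in a browser player such as hls.js and throttle the network)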
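What the player does with this manifest can be sketched as a throughput-based choice: pick the highest rendition whose BANDWIDTH fits inside a fraction of the measured throughput. This is a simplified heuristic for illustration, not the algorithm of any particular player, and the 0.8 safety factor is an assumption.

```python
from typing import List, Tuple


def pick_rendition(variants: List[Tuple[int, str]],
                   measured_bps: float,
                   safety: float = 0.8) -> str:
    """Return the playlist URI of the best variant for the measured throughput.

    variants holds (BANDWIDTH in bits per second, playlist URI) pairs.
    """
    if not variants:
        raise ValueError("manifest has no variants")
    ordered = sorted(variants)
    budget = measured_bps * safety
    chosen = ordered[0]          # fall back to the lowest rung
    for variant in ordered:
        if variant[0] <= budget:
            chosen = variant
    return chosen[1]


if __name__ == "__main__":
    master = [
        (800_000, "360p/playlist.m3u8"),
        (1_400_000, "480p/playlist.m3u8"),
        (2_800_000, "720p/playlist.m3u8"),
        (5_000_000, "1080p/playlist.m3u8"),
    ]
    print(pick_rendition(master, 4_000_000))   # 720p fits within 3.2 Mbps
```

Real players also weigh buffer level and recent throughput history, so treat this as the core idea only.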
4. Configure CDN and edge caching
Place your CDN in front of object storage. Video segments are immutable and should be cached aggressively (1 year). Manifests need shorter TTLs (30-60s) to allow content updates. [src1]
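# nginx edge config: aggressive segment caching, short manifest TTL
# Assumes an http-level "proxy_cache_path ... keys_zone=video_cache:..." directive and an
# upstream named "origin_storage" (hypothetical name) that points at object storage.
location ~* \.ts$ {
    proxy_pass http://origin_storage;
    proxy_cache video_cache;
    proxy_cache_valid 200 365d;
    add_header Cache-Control "public, max-age=31536000, immutable";
    add_header X-Cache-Status $upstream_cache_status;  # HIT/MISS, matched by the Verify step
}
location ~* \.m3u8$ {
    proxy_pass http://origin_storage;
    proxy_cache video_cache;
    proxy_cache_valid 200 60s;
    add_header Cache-Control "public, max-age=60";
    add_header X-Cache-Status $upstream_cache_status;
}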
Verify: curl -sI https://cdn.example.com/{video_id}/720p/seg_0001.ts | grep X-Cache → HIT on second request
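When segments sit in object storage behind a managed CDN rather than nginx, the same policy can be attached as Cache-Control metadata at upload time. The sketch below chooses the header value by file extension, using the values from this step. `cache_control_for` is a hypothetical helper, and rejecting unknown extensions is a design assumption.

```python
from pathlib import PurePosixPath

SEGMENT_CACHE = "public, max-age=31536000, immutable"   # immutable .ts segments
MANIFEST_CACHE = "public, max-age=60"                   # short-lived .m3u8 manifests


def cache_control_for(key: str) -> str:
    """Return the Cache-Control value for an object key based on its extension."""
    suffix = PurePosixPath(key).suffix.lower()
    if suffix == ".ts":
        return SEGMENT_CACHE
    if suffix == ".m3u8":
        return MANIFEST_CACHE
    raise ValueError(f"no cache policy defined for {key!r}")


if __name__ == "__main__":
    print(cache_control_for("abc123/720p/seg_0001.ts"))
    print(cache_control_for("abc123/master.m3u8"))
```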
5. Implement the playback API and metadata service
Build the API that serves video metadata and playback URLs. Use signed URLs so only authenticated users can access content. [src2] [src6]
// Playback API: returns a signed CDN URL for the master manifest
// Assumes getSignedUrl comes from @aws-sdk/cloudfront-signer, which also needs a key pair.
app.get("/api/v1/videos/:videoId/playback", async (req, res) => {
  const video = await db.query("SELECT * FROM videos WHERE id = $1 AND status = 'ready'", [req.params.videoId]);
  if (!video.rows.length) return res.status(404).json({ error: "Not found" });
  const manifestUrl = getSignedUrl({
    url: `https://cdn.example.com/${req.params.videoId}/master.m3u8`,
    keyPairId: process.env.CF_KEY_PAIR_ID,   // hypothetical env var names
    privateKey: process.env.CF_PRIVATE_KEY,
    dateLessThan: new Date(Date.now() + 4 * 3600 * 1000).toISOString(),  // valid for 4 hours
  });
  res.json({ id: video.rows[0].id, manifest_url: manifestUrl });
});
Verify: curl http://localhost:3000/api/v1/videos/{id}/playback → JSON with signed manifest_url
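The idea behind a signed URL is that the API appends an expiry and a signature the edge can check without a database lookup. The sketch below shows a generic HMAC scheme using only the standard library. It is not the CloudFront signed-URL format, the `expires` and `sig` parameter names are made up for illustration, and it assumes the URL has no existing query string.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse


def _signature(path: str, expires_at: int, secret: bytes) -> str:
    message = f"{path}:{expires_at}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_url(url: str, secret: bytes, expires_at: int) -> str:
    """Append an expiry timestamp and an HMAC over the path plus expiry."""
    sig = _signature(urlparse(url).path, expires_at, secret)
    return f"{url}?{urlencode({'expires': expires_at, 'sig': sig})}"


def verify_url(signed_url: str, secret: bytes, now: Optional[int] = None) -> bool:
    """Check that the signature matches and the URL has not expired."""
    parsed = urlparse(signed_url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, IndexError, ValueError):
        return False
    current = int(time.time()) if now is None else now
    if current >= expires_at:
        return False
    expected = _signature(parsed.path, expires_at, secret)
    return hmac.compare_digest(expected, sig)   # constant-time comparison


if __name__ == "__main__":
    url = sign_url("https://cdn.example.com/abc/master.m3u8", b"secret",
                   int(time.time()) + 4 * 3600)
    print(verify_url(url, b"secret"))
```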
6. Add DRM encryption for premium content
For licensed content, encrypt segments during transcoding and set up a multi-DRM license server (Widevine for Chrome/Android, FairPlay for Safari/iOS, PlayReady for Edge/Windows). [src5]
# Encrypt HLS segments with AES-128 (basic; use Widevine/FairPlay for production)
openssl rand 16 > enc.key
echo "https://keys.example.com/${VIDEO_ID}/enc.key" > enc_keyinfo.txt
echo "enc.key" >> enc_keyinfo.txt
ffmpeg -i input.mp4 -c:v libx264 -b:v 3000k -c:a aac -b:a 128k \
-hls_time 6 -hls_playlist_type vod \
-hls_key_info_file enc_keyinfo.txt output_encrypted.m3u8
Verify: grep EXT-X-KEY output_encrypted.m3u8 → shows METHOD=AES-128,URI=...
Code Examples
Python: Video Upload Pipeline with S3 and Kafka
# Input: Raw video file path, user metadata
# Output: video_id, transcode job published to Kafka
import json
import uuid

import boto3
from kafka import KafkaProducer

class VideoUploadPipeline:
    def __init__(self, s3_bucket, kafka_brokers):
        self.s3 = boto3.client("s3")
        self.bucket = s3_bucket
        self.producer = KafkaProducer(
            bootstrap_servers=kafka_brokers,
            value_serializer=lambda v: json.dumps(v).encode(),
        )

    def upload(self, filepath, user_id, title):
        video_id = str(uuid.uuid4())
        s3_key = f"raw/{video_id}/{filepath.rsplit('/', 1)[-1]}"
        self.s3.upload_file(filepath, self.bucket, s3_key)
        job = {"video_id": video_id, "s3_key": s3_key,
               "renditions": ["360p", "480p", "720p", "1080p"]}
        self.producer.send("transcode-jobs", value=job)
        self.producer.flush()
        return job
JavaScript: HLS Master Manifest Generator
// Input: video ID, list of completed renditions with metadata
// Output: M3U8 master manifest string
function generateMasterManifest(videoId, renditions) {
  const lines = ["#EXTM3U", "#EXT-X-VERSION:3", ""];
  // List variants from lowest to highest bandwidth
  const sorted = [...renditions].sort((a, b) => a.bandwidth - b.bandwidth);
  for (const r of sorted) {
    lines.push(`#EXT-X-STREAM-INF:BANDWIDTH=${r.bandwidth},RESOLUTION=${r.resolution},CODECS="${r.codecs}"`);
    lines.push(`${r.name}/playlist.m3u8`);
    lines.push("");
  }
  return lines.join("\n");
}
Go: Transcoding Worker with Job Queue
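// Input: Transcode job from Kafka (video_id, s3_key, renditions)
// Output: Transcoded HLS segments on local disk (the upload to S3 is not shown)
package transcode

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// Rendition describes one rung of the encoding ladder.
type Rendition struct {
	Width, Height              int
	VideoBitrate, AudioBitrate string
}

var ladder = map[string]Rendition{
	"360p":  {640, 360, "800k", "96k"},
	"720p":  {1280, 720, "2800k", "128k"},
	"1080p": {1920, 1080, "5000k", "192k"},
}

// transcodeRendition runs FFmpeg for one rendition and writes HLS output under /tmp/transcode.
func transcodeRendition(inputPath, videoID, name string, r Rendition) error {
	outDir := filepath.Join("/tmp/transcode", videoID, name)
	if err := os.MkdirAll(outDir, 0o755); err != nil {
		return err
	}
	// -y is placed before the output file so FFmpeg reads it as a global option.
	cmd := exec.Command("ffmpeg", "-y", "-i", inputPath,
		"-vf", fmt.Sprintf("scale=%d:%d", r.Width, r.Height),
		"-c:v", "libx264", "-preset", "medium", "-b:v", r.VideoBitrate,
		"-c:a", "aac", "-b:a", r.AudioBitrate,
		"-hls_time", "6", "-hls_playlist_type", "vod",
		"-hls_segment_filename", filepath.Join(outDir, "seg_%04d.ts"),
		filepath.Join(outDir, "playlist.m3u8"))
	return cmd.Run()
}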
Anti-Patterns
Wrong: Synchronous transcoding during upload
# BAD: blocks the upload response until transcoding completes (minutes to hours)
@app.post("/upload")
def upload_video(file):
    save_to_disk(file)
    transcode_all_renditions(file.filename)  # Blocks for 5-30 minutes!
    return {"status": "ready"}  # User waits forever, request times out
Correct: Async transcoding via message queue
# GOOD: upload returns immediately, transcoding happens asynchronously
@app.post("/upload")
def upload_video(file):
    video_id = save_to_s3(file)
    publish_to_queue("transcode-jobs", {"video_id": video_id})
    return {"status": "processing", "video_id": video_id}  # Returns in <1s
Wrong: Single bitrate streaming
# BAD: serves one resolution regardless of user bandwidth
def get_video_url(video_id):
    return f"https://cdn.example.com/{video_id}/video_1080p.mp4"
# Mobile users on 3G get constant buffering
Correct: Adaptive bitrate with HLS/DASH encoding ladder
# GOOD: serves master manifest; player picks the right rendition
def get_video_url(video_id):
    return f"https://cdn.example.com/{video_id}/master.m3u8"
# Player adapts quality based on available bandwidth
Wrong: Storing video blobs in a database
# BAD: stores video binary in PostgreSQL BYTEA column
def store_video(video_id, video_bytes):
    db.execute("INSERT INTO videos (id, data) VALUES (%s, %s)", (video_id, video_bytes))
# Database bloats to TB, backups take hours, queries slow to a crawl
Correct: Object storage for blobs, database for metadata only
# GOOD: S3 for video files, PostgreSQL for metadata only
def store_video(video_id, filepath, title, user_id):
    s3.upload_file(filepath, "video-bucket", f"raw/{video_id}/{os.path.basename(filepath)}")
    db.execute("INSERT INTO videos (id, title, user_id, status) VALUES (%s,%s,%s,'uploaded')",
               (video_id, title, user_id))
Wrong: No CDN — serving directly from origin
# BAD: all viewers worldwide hit the single origin region
def get_stream_url(video_id):
    return f"https://us-east-1.s3.amazonaws.com/videos/{video_id}/master.m3u8"
# 10K viewers x 5 Mbps = 50 Gbps origin bandwidth, about 22,500 GB/hour (roughly $2,000/hour at $0.09/GB S3 egress)
Correct: CDN edge delivery with origin shielding
# GOOD: CDN serves from 200+ edge PoPs, origin handles <5% of requests
def get_stream_url(video_id):
    return f"https://cdn.example.com/{video_id}/master.m3u8"
# 95%+ cache hit ratio, $0.01-0.02/GB vs $0.09/GB from S3 direct
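The egress arithmetic behind this anti-pattern is worth making explicit, because the units are easy to mix up (megabits per second versus gigabytes per hour). The sketch below converts a viewer count and per-stream bitrate into hourly transfer and cost. It uses decimal units (1 GB = 1000 MB), takes the per-GB price as an input, and uses illustrative function names.

```python
def egress_gb_per_hour(viewers: int, stream_mbps: float) -> float:
    """Convert an aggregate stream load into gigabytes transferred per hour."""
    megabits_per_hour = viewers * stream_mbps * 3600
    return megabits_per_hour / 8 / 1000   # megabits -> megabytes -> gigabytes


def egress_cost_per_hour(viewers: int, stream_mbps: float,
                         price_per_gb: float) -> float:
    """Hourly egress cost at a flat per-GB price."""
    return egress_gb_per_hour(viewers, stream_mbps) * price_per_gb


if __name__ == "__main__":
    # 10K viewers at 5 Mbps: 22,500 GB per hour
    print(egress_gb_per_hour(10_000, 5))
    # At $0.09/GB that is about $2,025 per hour
    print(round(egress_cost_per_hour(10_000, 5, 0.09), 2))
```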
Common Pitfalls
- Not implementing resumable uploads: Users uploading multi-GB files lose all progress on network interruption. Fix: Use the tus protocol or S3 multipart upload with client-side retry per chunk. [src2]
- Fixed encoding ladder ignoring content complexity: An animated show looks sharp at 1.5 Mbps, but action movies need 4+ Mbps at the same resolution. Fix: Implement per-title encoding optimization (Netflix reported 20-30% savings). [src3]
- Ignoring cold start latency in playback: First segment takes too long to load because the player starts at high quality. Fix: Configure the player to start at a low rendition and ramp up. [src7]
- Missing video processing status tracking: Users see "processing" forever with no progress indication. Fix: Emit progress events from transcoding workers and expose via WebSocket or polling. [src6]
- No origin shielding on the CDN: Every edge PoP cache miss hits origin directly, creating thundering herd. Fix: Add a shield/mid-tier cache layer between edge and origin. [src1]
- Skipping thumbnail generation: Users cannot preview content, reducing engagement. Fix: Generate thumbnails at key intervals during transcoding using FFmpeg scene detection. [src6]
- Long cache TTLs on manifests: Cannot update or remove content (DMCA, corrections) when manifests are cached for days. Fix: Short TTLs (30-60s) on .m3u8 manifests, aggressive caching on immutable .ts segments. [src7]
- No geographic routing: All traffic routes to a single region, adding 100-300ms latency for distant users. Fix: Multi-region deployment with DNS-based latency routing. [src4]
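For the status-tracking pitfall, the core of the fix is a small aggregate that turns per-rendition progress events into one number the client can poll. The sketch below keeps that state in memory. `TranscodeProgress` is a hypothetical class, and a real system would persist the values (for example in the metadata database) rather than hold them in a worker process.

```python
from typing import Dict, Iterable


class TranscodeProgress:
    """Tracks the completed fraction (0.0 to 1.0) of each rendition for one video."""

    def __init__(self, renditions: Iterable[str]) -> None:
        self._done: Dict[str, float] = {name: 0.0 for name in renditions}
        if not self._done:
            raise ValueError("at least one rendition is required")

    def update(self, rendition: str, fraction: float) -> None:
        """Record a progress event emitted by a transcoding worker."""
        if rendition not in self._done:
            raise KeyError(rendition)
        self._done[rendition] = min(max(fraction, 0.0), 1.0)

    def overall_percent(self) -> int:
        """Average progress across all renditions, as a whole percentage."""
        return int(100 * sum(self._done.values()) / len(self._done))

    def status(self) -> str:
        finished = all(value >= 1.0 for value in self._done.values())
        return "ready" if finished else "processing"


if __name__ == "__main__":
    progress = TranscodeProgress(["360p", "720p"])
    progress.update("360p", 1.0)
    progress.update("720p", 0.5)
    print(progress.overall_percent(), progress.status())
```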
Diagnostic Commands
# Validate HLS manifest structure
ffprobe -v quiet -print_format json -show_streams master.m3u8
# Check video codec, bitrate, and resolution
ffprobe -v error -show_entries stream=codec_name,width,height,bit_rate -of csv input.mp4
# Check CDN cache status for one segment (run twice; expect HIT on the second request)
curl -sI https://cdn.example.com/video_id/720p/seg_0001.ts | grep -i x-cache
# Monitor Kafka transcoding queue depth
kafka-consumer-groups --bootstrap-server kafka:9092 --group transcode-workers --describe
# Check S3 bucket size for cost estimation
aws s3 ls s3://transcoded-video-segments --recursive --summarize | tail -2
# Test HLS playback
ffplay https://cdn.example.com/video_id/master.m3u8
# Spot-check segment encryption (an AES-128 encrypted segment should typically fail to probe as valid MPEG-TS without its key)
ffprobe -v error -show_entries format_tags -of json encrypted_segment.ts
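When ffprobe is not available (for example in a CI check on generated manifests), the master playlist can also be checked structurally. The sketch below is a minimal parser that covers only the tags used in this article, not the full HLS specification, and `parse_master` is an illustrative name.

```python
import re
from typing import List, Tuple

STREAM_INF = re.compile(r"^#EXT-X-STREAM-INF:(.*)$")
BANDWIDTH = re.compile(r"(?:^|,)BANDWIDTH=(\d+)")


def parse_master(text: str) -> List[Tuple[int, str]]:
    """Return (bandwidth, playlist URI) pairs, raising ValueError on bad structure."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or lines[0] != "#EXTM3U":
        raise ValueError("manifest must start with #EXTM3U")
    variants = []
    for index, line in enumerate(lines):
        match = STREAM_INF.match(line)
        if match is None:
            continue
        bandwidth = BANDWIDTH.search(match.group(1))
        if bandwidth is None:
            raise ValueError("EXT-X-STREAM-INF is missing BANDWIDTH")
        # The URI of a variant is the next line after its EXT-X-STREAM-INF tag.
        if index + 1 >= len(lines) or lines[index + 1].startswith("#"):
            raise ValueError("EXT-X-STREAM-INF is not followed by a URI")
        variants.append((int(bandwidth.group(1)), lines[index + 1]))
    if not variants:
        raise ValueError("manifest lists no variants")
    return variants


if __name__ == "__main__":
    sample = "\n".join([
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        "#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360",
        "360p/playlist.m3u8",
        "#EXT-X-STREAM-INF:BANDWIDTH=2800000,RESOLUTION=1280x720",
        "720p/playlist.m3u8",
    ])
    print(parse_master(sample))
```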
Version History & Compatibility
| Technology | Status | Notes |
|---|---|---|
| H.264/AVC + AAC | Universal standard | Maximum device compatibility; baseline for all platforms |
| H.265/HEVC | Widely supported (2020+) | 30-40% smaller files; Apple ecosystem strong; licensing fees |
| AV1 | Growing adoption (2023+) | 50-60% smaller than H.264; royalty-free; encoding 50-100x slower |
| HLS | Industry standard | Apple-developed; works everywhere with wide player support |
| DASH | Industry standard | MPEG-developed; primary for non-Apple; not native on iOS |
| CMAF | Emerging standard | Unifies HLS + DASH segments; can roughly halve storage versus keeping separate HLS and DASH copies |
| Widevine L1/L3 | Current DRM | Google-backed; Chrome, Android, smart TVs, Chromecast |
| FairPlay Streaming | Current DRM | Apple-only; required for Safari and iOS DRM playback |
| PlayReady | Current DRM | Microsoft-backed; Edge, Xbox, Windows, some smart TVs |
When to Use / When Not to Use
| Use When | Don't Use When | Use Instead |
|---|---|---|
| Building a video-on-demand (VOD) platform at any scale | Building real-time video conferencing (Zoom/Teams) | WebRTC-based SFU/MCU architecture |
| Content is pre-recorded and can be transcoded ahead of time | Ultra-low latency live streaming required (<3s) | LL-HLS, WebRTC, or RTMP-based live streaming |
| Adaptive bitrate playback across diverse devices and networks | Streaming short clips (<30s) where quality switching has no time to help | Progressive download (single MP4 file) |
| Licensed content requires DRM protection | Internal/private video sharing within a small team | Simple S3 presigned URLs or Cloudflare Stream |
| >10K concurrent viewers expected | Hosting a few dozen training videos for a small org | Managed platforms (Vimeo, Wistia, YouTube unlisted) |
Important Caveats
- AV1 encoding is extremely CPU-intensive (50-100x slower than H.264) — use hardware encoders (NVIDIA NVENC, Intel QSV) or cloud encoding services for production; CPU-only AV1 is only viable for batch processing with long deadlines
- CDN egress costs dominate the bill at scale — Netflix spends ~$0.01/GB via Open Connect peering vs $0.05-0.09/GB on commercial CDNs; at 1PB/day this difference is millions per month
- DRM adds significant implementation complexity and ongoing licensing costs — evaluate whether your content actually needs DRM before committing; user-generated content platforms typically do not need DRM
- Per-title encoding optimization delivers 20-30% bandwidth savings but requires a sophisticated analysis pipeline — start with a fixed ladder and optimize later
- Video transcoding is one of the most compute-intensive workloads in software — a 1-hour 4K video can take 4-8 hours to encode on a single CPU instance across all renditions
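The "millions per month" claim in the CDN egress caveat follows directly from the per-GB prices quoted there. The sketch below reproduces the arithmetic, assuming decimal units (1 PB = 1,000,000 GB) and a 30-day month. The function names are illustrative.

```python
GB_PER_PB = 1_000_000   # decimal units


def monthly_delivery_cost(pb_per_day: float, price_per_gb: float,
                          days: int = 30) -> float:
    """Delivery cost for a month at a flat per-GB price."""
    return pb_per_day * GB_PER_PB * price_per_gb * days


def monthly_gap(pb_per_day: float, own_price: float,
                commercial_price: float, days: int = 30) -> float:
    """Extra monthly spend on a commercial CDN compared with own peering."""
    commercial = monthly_delivery_cost(pb_per_day, commercial_price, days)
    own = monthly_delivery_cost(pb_per_day, own_price, days)
    return commercial - own


if __name__ == "__main__":
    # 1 PB/day, $0.01/GB own delivery vs $0.05 and $0.09 commercial pricing:
    # roughly $1.2 million and $2.4 million per month respectively.
    print(round(monthly_gap(1, 0.01, 0.05)))
    print(round(monthly_gap(1, 0.01, 0.09)))
```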