Video Streaming Platform System Design

Type: Software Reference · Confidence: 0.91 · Sources: 7 · Verified: 2026-02-23 · Freshness: quarterly

TL;DR

Constraints

Quick Reference

| Component | Role | Technology Options | Scaling Strategy |
|---|---|---|---|
| Upload Service | Chunked upload, resumable uploads, deduplication | tus protocol, S3 multipart upload, GCS resumable | Horizontal — stateless workers behind load balancer |
| Message Queue | Decouples upload from transcoding, ensures at-least-once processing | Apache Kafka, AWS SQS, Google Pub/Sub, RabbitMQ | Partition by video ID; consumer groups per pipeline stage |
| Transcoding Pipeline | Encodes raw video into multiple bitrate/resolution renditions | FFmpeg, AWS MediaConvert, GCP Transcoder API, Handbrake | Horizontal — spin up workers per job; GPU instances for H.265/AV1 |
| Object Storage | Stores raw uploads and transcoded segments/manifests | AWS S3, Google Cloud Storage, Azure Blob, MinIO | Virtually unlimited; lifecycle policies for raw cleanup |
| CDN / Edge Cache | Serves video segments from edge PoPs closest to viewers | Netflix Open Connect, CloudFront, Cloudflare, Akamai, Fastly | 95%+ cache hit ratio; fill from origin on miss |
| Metadata Service | Video catalog, search, recommendations, user profiles | PostgreSQL + Elasticsearch, DynamoDB, Cassandra | Read replicas + caching (Redis); shard by user/region |
| Thumbnail Service | Generates preview thumbnails and trick-play images | FFmpeg scene detection, sprite sheets | Batch processing during transcode; cached on CDN |
| DRM License Server | Issues decryption keys for encrypted content | Widevine, FairPlay, PlayReady, BuyDRM, PallyCon | Multi-DRM per device; license server scales horizontally |
| Adaptive Streaming | Client-side bitrate adaptation based on bandwidth | HLS (Apple), DASH (MPEG), CMAF (unified) | Client-driven — player switches renditions automatically |
| API Gateway | Routes requests, rate limiting, auth, request aggregation | Kong, AWS API Gateway, nginx, Envoy | Horizontal behind L7 load balancer; edge-deployed |
| Recommendation Engine | Personalized content suggestions | Collaborative filtering, deep learning, two-phase ranking | Offline batch training + real-time feature store (Redis) |
| Analytics Pipeline | Playback quality metrics, engagement, A/B testing | Kafka + Spark/Flink, ClickHouse, BigQuery | Stream processing for real-time; batch for daily aggregates |

Decision Tree

START
|-- Expected concurrent viewers?
|   |-- <1K users
|   |   --> Simple stack: S3 + CloudFront + FFmpeg on EC2/Lambda
|   |-- 1K-100K users
|   |   --> Standard architecture: message queue + worker pool transcoding
|   |   --> Managed CDN (CloudFront/Cloudflare), multi-AZ deployment
|   |-- 100K-1M users
|   |   --> Full microservices: dedicated services per component
|   |   --> Multi-region deployment, custom encoding ladder per title
|   |-- >1M users (Netflix/YouTube scale)
|       --> Custom CDN (Open Connect model): ISP-embedded cache appliances
|       --> Per-title encoding optimization, global service mesh
|
|-- Content type?
|   |-- User-generated (YouTube-style)
|   |   --> Heavy moderation pipeline (ML content scanning)
|   |   --> Aggressive transcode queue, no DRM for most content
|   |-- Licensed/premium (Netflix-style)
|   |   --> DRM mandatory (Widevine + FairPlay + PlayReady)
|   |   --> Higher encoding quality, geo-restrictions
|   |-- Both
|       --> Separate pipelines: fast for UGC, quality-optimized for premium
|
|-- Build vs buy?
    |-- Managed services --> AWS MediaConvert + CloudFront + DynamoDB
    |-- Build from scratch --> FFmpeg workers + Kafka + S3 + custom CDN
    |-- Hybrid (recommended) --> Custom upload/metadata + managed transcode/CDN
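The scale branch of the tree above can be expressed as a small selector. This is an illustrative sketch: the thresholds and tier descriptions come from the tree, while the function name and return strings are invented for the example.

```python
def recommend_architecture(concurrent_viewers: int) -> str:
    """Map expected concurrent viewers to an architecture tier,
    using the thresholds from the decision tree above."""
    if concurrent_viewers < 1_000:
        return "simple: S3 + CloudFront + FFmpeg on EC2/Lambda"
    if concurrent_viewers < 100_000:
        return "standard: message queue + worker pool, managed CDN, multi-AZ"
    if concurrent_viewers < 1_000_000:
        return "microservices: multi-region, per-title encoding ladder"
    return "hyperscale: custom CDN (Open Connect model), global service mesh"
```

For example, `recommend_architecture(50_000)` lands in the standard tier, while anything past a million concurrent viewers points at the custom-CDN model.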

Step-by-Step Guide

1. Design the upload and ingest pipeline

Handle large file uploads (often multi-GB) with chunked, resumable uploads. Upload directly to object storage to avoid proxying through your application servers. [src2]

# Upload service — accepts chunked uploads, stores in S3, publishes job to Kafka
import boto3
from kafka import KafkaProducer
import json, uuid

s3 = boto3.client("s3", region_name="us-east-1")
producer = KafkaProducer(
    bootstrap_servers=["kafka:9092"],
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def handle_upload_complete(file_path, user_id, title):
    video_id = str(uuid.uuid4())
    s3_key = f"raw/{video_id}/{file_path.split('/')[-1]}"
    s3.upload_file(file_path, "raw-video-uploads", s3_key)
    producer.send("transcode-jobs", value={
        "video_id": video_id, "s3_key": s3_key,
        "user_id": user_id, "title": title,
        "renditions": ["360p", "480p", "720p", "1080p"],
    })
    producer.flush()  # ensure the job is durably sent before returning
    return video_id

Verify: aws s3 ls s3://raw-video-uploads/raw/{video_id}/ → raw file listed

2. Build the transcoding pipeline

Consume jobs from the message queue and transcode each video into multiple renditions using FFmpeg. Each rendition produces HLS segments (.ts files) and a playlist (.m3u8). [src3] [src7]

# FFmpeg command to transcode to 720p HLS with 6-second segments
ffmpeg -i input.mp4 \
  -vf scale=1280:720 \
  -c:v libx264 -preset medium -b:v 2800k \
  -c:a aac -b:a 128k \
  -hls_time 6 -hls_playlist_type vod \
  -hls_segment_filename "720p/seg_%04d.ts" \
  720p/playlist.m3u8

Verify: ffprobe 720p/playlist.m3u8 → shows stream info with correct resolution
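The consumer side of this step can be sketched in Python: a helper that turns a rendition name into the same FFmpeg argument list shown above, ready to hand to a subprocess. The rendition table values mirror the encoding ladder used elsewhere in this document; the helper name is illustrative.

```python
# Rendition specs mirroring the encoding ladder used in this document.
RENDITIONS = {
    "360p":  {"scale": "640:360",   "v_bitrate": "800k",  "a_bitrate": "96k"},
    "480p":  {"scale": "854:480",   "v_bitrate": "1400k", "a_bitrate": "128k"},
    "720p":  {"scale": "1280:720",  "v_bitrate": "2800k", "a_bitrate": "128k"},
    "1080p": {"scale": "1920:1080", "v_bitrate": "5000k", "a_bitrate": "192k"},
}

def build_hls_args(input_path: str, out_dir: str, name: str) -> list[str]:
    """Build the FFmpeg argument list for one HLS rendition,
    matching the command shown above (6-second VOD segments)."""
    r = RENDITIONS[name]
    return [
        "ffmpeg", "-i", input_path,
        "-vf", f"scale={r['scale']}",
        "-c:v", "libx264", "-preset", "medium", "-b:v", r["v_bitrate"],
        "-c:a", "aac", "-b:a", r["a_bitrate"],
        "-hls_time", "6", "-hls_playlist_type", "vod",
        "-hls_segment_filename", f"{out_dir}/seg_%04d.ts",
        f"{out_dir}/playlist.m3u8",
    ]
```

A worker would loop over a job's `renditions` list and pass each result to `subprocess.run(...)`, retrying or dead-lettering the Kafka message on failure.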

3. Generate the master HLS manifest

Create a master playlist that references all rendition playlists, enabling adaptive bitrate switching in the player. [src7]

#EXTM3U
#EXT-X-VERSION:3
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
360p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1400000,RESOLUTION=854x480
480p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2800000,RESOLUTION=1280x720
720p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
1080p/playlist.m3u8

Verify: ffplay master.m3u8 → plays video with adaptive quality switching

4. Configure CDN and edge caching

Place your CDN in front of object storage. Video segments are immutable and should be cached aggressively (1 year). Manifests need shorter TTLs (30-60s) to allow content updates. [src1]

# nginx edge config: aggressive segment caching, short manifest TTL
location ~* \.ts$ {
    proxy_cache video_cache;
    proxy_cache_valid 200 365d;
    add_header Cache-Control "public, max-age=31536000, immutable";
}
location ~* \.m3u8$ {
    proxy_cache video_cache;
    proxy_cache_valid 200 60s;
    add_header Cache-Control "public, max-age=60";
}

Verify: curl -sI https://cdn.example.com/{video_id}/720p/seg_0001.ts | grep -i x-cache → HIT on second request
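To quantify the cache hit ratio rather than spot-check it, tally the cache-status token across edge access-log lines. This sketch assumes an nginx `log_format` that ends each line with `$upstream_cache_status` (HIT, MISS, EXPIRED, and so on); adjust the field position for your log format.

```python
def cache_hit_ratio(log_lines: list[str]) -> float:
    """Return the fraction of logged requests whose cache status is HIT.
    Assumes the cache status token is the last field on each line."""
    statuses = [line.rsplit(None, 1)[-1] for line in log_lines if line.strip()]
    if not statuses:
        return 0.0
    return statuses.count("HIT") / len(statuses)
```

A healthy video CDN should report 0.95 or better here; a lower number usually means segment URLs are not stable (query-string churn defeating the cache key) or TTLs are too short.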

5. Implement the playback API and metadata service

Build the API that serves video metadata and playback URLs. Use signed URLs so only authenticated users can access content. [src2] [src6]

// Playback API — returns signed CDN URL for the master manifest
app.get("/api/v1/videos/:videoId/playback", async (req, res) => {
  const video = await db.query("SELECT * FROM videos WHERE id = $1 AND status = 'ready'", [req.params.videoId]);
  if (!video.rows.length) return res.status(404).json({ error: "Not found" });
  // CloudFront-style signing also needs a key pair; the env var
  // names here are illustrative
  const manifestUrl = getSignedUrl({
    url: `https://cdn.example.com/${req.params.videoId}/master.m3u8`,
    dateLessThan: new Date(Date.now() + 4 * 3600 * 1000).toISOString(),
    keyPairId: process.env.CF_KEY_PAIR_ID,
    privateKey: process.env.CF_PRIVATE_KEY,
  });
  res.json({ id: video.rows[0].id, manifest_url: manifestUrl });
});

Verify: curl http://localhost:3000/api/v1/videos/{id}/playback → JSON with signed manifest_url

6. Add DRM encryption for premium content

For licensed content, encrypt segments during transcoding and set up a multi-DRM license server (Widevine for Chrome/Android, FairPlay for Safari/iOS, PlayReady for Edge/Windows). [src5]

# Encrypt HLS segments with AES-128 (basic; use Widevine/FairPlay for production)
openssl rand 16 > enc.key
echo "https://keys.example.com/${VIDEO_ID}/enc.key" > enc_keyinfo.txt
echo "enc.key" >> enc_keyinfo.txt
ffmpeg -i input.mp4 -c:v libx264 -b:v 3000k -c:a aac -b:a 128k \
  -hls_time 6 -hls_playlist_type vod \
  -hls_key_info_file enc_keyinfo.txt output_encrypted.m3u8

Verify: grep EXT-X-KEY output_encrypted.m3u8 → shows METHOD=AES-128,URI=...
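The key and keyinfo files from step 6 can also be produced from Python instead of openssl. This stdlib sketch writes the three-line keyinfo file FFmpeg's `-hls_key_info_file` option expects (key URI, local key path, optional IV in hex); the key-server URL shape is illustrative.

```python
import secrets
from pathlib import Path

def write_hls_key_files(video_id: str, out_dir: str,
                        key_server: str = "https://keys.example.com") -> Path:
    """Write enc.key (16 random bytes) plus the keyinfo file FFmpeg
    reads: line 1 = key URI, line 2 = local key path, line 3 = IV."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    key_path = out / "enc.key"
    key_path.write_bytes(secrets.token_bytes(16))  # AES-128 key
    iv = secrets.token_hex(16)  # 16 random bytes as 32 hex chars
    keyinfo = out / "enc_keyinfo.txt"
    keyinfo.write_text(f"{key_server}/{video_id}/enc.key\n{key_path}\n{iv}\n")
    return keyinfo
```

Serve `enc.key` only over authenticated HTTPS; with plain AES-128 the key URI in the playlist is the entire access-control story, which is why production premium content moves to Widevine/FairPlay.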

Code Examples

Python: Video Upload Pipeline with S3 and Kafka

# Input:  Raw video file path, user metadata
# Output: video_id, transcode job published to Kafka

import boto3, json, uuid, hashlib
from kafka import KafkaProducer

class VideoUploadPipeline:
    def __init__(self, s3_bucket, kafka_brokers):
        self.s3 = boto3.client("s3")
        self.bucket = s3_bucket
        self.producer = KafkaProducer(
            bootstrap_servers=kafka_brokers,
            value_serializer=lambda v: json.dumps(v).encode(),
        )

    def upload(self, filepath, user_id, title):
        video_id = str(uuid.uuid4())
        s3_key = f"raw/{video_id}/{filepath.rsplit('/', 1)[-1]}"
        self.s3.upload_file(filepath, self.bucket, s3_key)
        job = {"video_id": video_id, "s3_key": s3_key,
               "renditions": ["360p", "480p", "720p", "1080p"]}
        self.producer.send("transcode-jobs", value=job)
        self.producer.flush()
        return job

JavaScript: HLS Master Manifest Generator

// Input:  video ID, list of completed renditions with metadata
// Output: M3U8 master manifest string

function generateMasterManifest(videoId, renditions) {
  const lines = ["#EXTM3U", "#EXT-X-VERSION:3", ""];
  const sorted = [...renditions].sort((a, b) => a.bandwidth - b.bandwidth);
  for (const r of sorted) {
    lines.push(`#EXT-X-STREAM-INF:BANDWIDTH=${r.bandwidth},RESOLUTION=${r.resolution},CODECS="${r.codecs}"`);
    lines.push(`${r.name}/playlist.m3u8`);
    lines.push("");
  }
  return lines.join("\n");
}

Go: Transcoding Worker with Job Queue

// Input:  Transcode job from Kafka (video_id, s3_key, renditions)
// Output: Transcoded HLS segments uploaded to S3

type Rendition struct {
    Width, Height              int
    VideoBitrate, AudioBitrate string
}

var ladder = map[string]Rendition{
    "360p":  {640, 360, "800k", "96k"},
    "480p":  {854, 480, "1400k", "128k"},
    "720p":  {1280, 720, "2800k", "128k"},
    "1080p": {1920, 1080, "5000k", "192k"},
}

func transcodeRendition(inputPath, videoID, name string, r Rendition) error {
    outDir := filepath.Join("/tmp/transcode", videoID, name)
    if err := os.MkdirAll(outDir, 0755); err != nil {
        return err
    }
    cmd := exec.Command("ffmpeg", "-i", inputPath,
        "-vf", fmt.Sprintf("scale=%d:%d", r.Width, r.Height),
        "-c:v", "libx264", "-preset", "medium", "-b:v", r.VideoBitrate,
        "-c:a", "aac", "-b:a", r.AudioBitrate,
        "-hls_time", "6", "-hls_playlist_type", "vod",
        "-hls_segment_filename", filepath.Join(outDir, "seg_%04d.ts"),
        filepath.Join(outDir, "playlist.m3u8"), "-y")
    return cmd.Run()
}

Anti-Patterns

Wrong: Synchronous transcoding during upload

# BAD — blocks the upload response until transcoding completes (minutes to hours)
@app.post("/upload")
def upload_video(file):
    save_to_disk(file)
    transcode_all_renditions(file.filename)    # Blocks for 5-30 minutes!
    return {"status": "ready"}                 # User waits forever, request times out

Correct: Async transcoding via message queue

# GOOD — upload returns immediately, transcoding happens asynchronously
@app.post("/upload")
def upload_video(file):
    video_id = save_to_s3(file)
    publish_to_queue("transcode-jobs", {"video_id": video_id})
    return {"status": "processing", "video_id": video_id}   # Returns in <1s

Wrong: Single bitrate streaming

# BAD — serves one resolution regardless of user bandwidth
def get_video_url(video_id):
    return f"https://cdn.example.com/{video_id}/video_1080p.mp4"
    # Mobile users on 3G get constant buffering

Correct: Adaptive bitrate with HLS/DASH encoding ladder

# GOOD — serves master manifest; player picks the right rendition
def get_video_url(video_id):
    return f"https://cdn.example.com/{video_id}/master.m3u8"
    # Player adapts quality based on available bandwidth

Wrong: Storing video blobs in a database

# BAD — stores video binary in PostgreSQL BYTEA column
def store_video(video_id, video_bytes):
    db.execute("INSERT INTO videos (id, data) VALUES (%s, %s)", (video_id, video_bytes))
    # Database bloats to TB, backups take hours, queries slow to a crawl

Correct: Object storage for blobs, database for metadata only

# GOOD — S3 for video files, PostgreSQL for metadata only
def store_video(video_id, filepath, title, user_id):
    s3.upload_file(filepath, "video-bucket", f"raw/{video_id}/{os.path.basename(filepath)}")
    db.execute("INSERT INTO videos (id, title, user_id, status) VALUES (%s,%s,%s,'uploaded')",
               (video_id, title, user_id))

Wrong: No CDN — serving directly from origin

# BAD — all viewers worldwide hit the single origin region
def get_stream_url(video_id):
    return f"https://us-east-1.s3.amazonaws.com/videos/{video_id}/master.m3u8"
    # 10K viewers x 5 Mbps = 50 Gbps origin bandwidth (~22,500 GB/hour, roughly $2,000/hour at $0.09/GB S3 egress)

Correct: CDN edge delivery with origin shielding

# GOOD — CDN serves from 200+ edge PoPs, origin handles <5% of requests
def get_stream_url(video_id):
    return f"https://cdn.example.com/{video_id}/master.m3u8"
    # 95%+ cache hit ratio, $0.01-0.02/GB vs $0.09/GB from S3 direct
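The arithmetic behind this anti-pattern is worth making explicit. A quick calculator, using the illustrative per-GB prices quoted above:

```python
def hourly_egress_cost(viewers: int, mbps_per_viewer: float,
                       price_per_gb: float) -> tuple[float, float]:
    """Return (GB transferred per hour, cost per hour) for a given
    concurrent audience at a flat per-viewer bitrate."""
    gbps = viewers * mbps_per_viewer / 1000   # aggregate Gbps
    gb_per_hour = gbps / 8 * 3600             # Gbps -> GB/hour
    return gb_per_hour, gb_per_hour * price_per_gb
```

At 10K viewers and 5 Mbps each, that is 50 Gbps, about 22,500 GB/hour: roughly $2,000/hour at S3's $0.09/GB versus a few hundred dollars/hour at typical CDN rates of $0.01-0.02/GB.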

Common Pitfalls

Diagnostic Commands

# Validate HLS manifest structure
ffprobe -v quiet -print_format json -show_streams master.m3u8

# Check video codec, bitrate, and resolution
ffprobe -v error -show_entries stream=codec_name,width,height,bit_rate -of csv input.mp4

# Measure CDN cache hit ratio
curl -sI https://cdn.example.com/video_id/720p/seg_0001.ts | grep -i x-cache

# Monitor Kafka transcoding queue depth
kafka-consumer-groups --bootstrap-server kafka:9092 --group transcode-workers --describe

# Check S3 bucket size for cost estimation
aws s3 ls s3://transcoded-video-segments --recursive --summarize | tail -2

# Test HLS playback
ffplay https://cdn.example.com/video_id/master.m3u8

# Validate DRM encryption on segments
ffprobe -v error -show_entries format_tags -of json encrypted_segment.ts

Version History & Compatibility

| Technology | Status | Notes |
|---|---|---|
| H.264/AVC + AAC | Universal standard | Maximum device compatibility; baseline for all platforms |
| H.265/HEVC | Widely supported (2020+) | 30-40% smaller files; Apple ecosystem strong; licensing fees |
| AV1 | Growing adoption (2023+) | 50-60% smaller than H.264; royalty-free; software encoding 50-100x slower |
| HLS | Industry standard | Apple-developed; works everywhere with wide player support |
| DASH | Industry standard | MPEG-developed; primary for non-Apple; not native on iOS |
| CMAF | Emerging standard | One fMP4 segment set serves both HLS and DASH, roughly halving segment storage |
| Widevine L1/L3 | Current DRM | Google-backed; Chrome, Android, smart TVs, Chromecast |
| FairPlay Streaming | Current DRM | Apple-only; required for Safari and iOS DRM playback |
| PlayReady | Current DRM | Microsoft-backed; Edge, Xbox, Windows, some smart TVs |
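The compatibility notes above imply a selection order when deciding which codec to serve a given client: prefer the most efficient codec the device can decode, and fall back to H.264 as the universal baseline. A hedged sketch; the capability flags are illustrative stand-ins for whatever device-detection signal your player reports.

```python
def choose_codec(supports_av1: bool, supports_hevc: bool) -> str:
    """Pick the most efficient codec the client decodes, falling
    back to H.264 for universal compatibility."""
    if supports_av1:
        return "av1"   # smallest files, royalty-free
    if supports_hevc:
        return "hevc"  # 30-40% smaller than H.264; licensing fees
    return "h264"      # universal baseline
```

In practice this decision is often made per rendition in the master manifest via the CODECS attribute, letting the player itself skip streams it cannot decode.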

When to Use / When Not to Use

| Use When | Don't Use When | Use Instead |
|---|---|---|
| Building a video-on-demand (VOD) platform at any scale | Building real-time video conferencing (Zoom/Teams) | WebRTC-based SFU/MCU architecture |
| Content is pre-recorded and can be transcoded ahead of time | Ultra-low latency live streaming required (<3s) | LL-HLS, WebRTC, or RTMP-based live streaming |
| Adaptive bitrate playback across diverse devices and networks | Streaming short clips (<30s) where quality switching has no time to help | Progressive download (single MP4 file) |
| Licensed content requires DRM protection | Internal/private video sharing within a small team | Simple S3 presigned URLs or Cloudflare Stream |
| >10K concurrent viewers expected | Hosting a few dozen training videos for a small org | Managed platforms (Vimeo, Wistia, YouTube unlisted) |

Important Caveats

Related Units