How to Design a Content Delivery Network (CDN)

How do I design a content delivery network (CDN)?

TL;DR

Bottom line: A CDN is a geographically distributed network of reverse-proxy servers that cache content at edge locations close to users, reducing latency from hundreds of milliseconds to single-digit milliseconds for cached content.
Key tool/command: Cache-Control: public, max-age=31536000, immutable for versioned static assets; Surrogate-Key headers for tag-based cache invalidation.
Watch out for: Cache stampede (thundering herd) when popular content expires simultaneously across all edge nodes — use request coalescing or stale-while-revalidate.
Works with: Any HTTP/HTTPS origin (cloud, on-prem, object storage); all major providers (Cloudflare, CloudFront, Akamai, Fastly, Google Cloud CDN).

Constraints

Cache invalidation must be explicit and bounded — never rely solely on TTL expiry for time-sensitive content; always implement a purge mechanism
Origin servers must handle thundering herd when popular content expires simultaneously — use request coalescing or stale-while-revalidate
TLS termination at the edge requires distributing private keys or using keyless SSL — never send private keys to third-party edge nodes without encryption at rest
Anycast routes by BGP path selection, not by geography or latency, so the PoP a user reaches is not always the closest one; measure real-user latency, not just geographic distance
Cache key design must account for Vary headers (Accept-Encoding, Accept-Language) — ignoring Vary serves wrong content to users with different capabilities
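
To make the Vary constraint concrete, here is a minimal sketch of a Vary-aware cache key. The function names and the key format are illustrative assumptions, not any CDN's actual scheme, and the Accept-Encoding handling is deliberately simplified (it ignores q-values such as br;q=0).

```python
from typing import Mapping


def normalize_accept_encoding(value: str) -> str:
    """Collapse an Accept-Encoding header into one of three buckets.

    Simplification (assumption): q-values are ignored, so "br;q=0" still counts as br.
    """
    tokens = {part.split(";")[0].strip().lower() for part in value.split(",")}
    if "br" in tokens:
        return "br"
    if "gzip" in tokens:
        return "gzip"
    return "identity"


def vary_cache_key(url: str, vary: str, request_headers: Mapping[str, str]) -> str:
    """Build a cache key from the URL plus every request header named in Vary."""
    headers = {name.lower(): value for name, value in request_headers.items()}
    parts = [url]
    for name in sorted(h.strip().lower() for h in vary.split(",") if h.strip()):
        value = headers.get(name, "")
        if name == "accept-encoding":
            # Bucket the header so equivalent clients share one cache entry.
            value = normalize_accept_encoding(value)
        parts.append(f"{name}={value}")
    return "|".join(parts)
```

Bucketing Accept-Encoding keeps the number of variants small while still separating clients that cannot decode Brotli from those that can.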

Quick Reference

Component	Role	Technology Options	Scaling Strategy
DNS Resolution	Routes users to nearest PoP via geographic or latency-based routing	Anycast DNS, GeoDNS, Route 53, Cloudflare DNS	Anycast distributes automatically; add more PoPs for coverage
Edge Server (PoP)	Caches content and serves requests from the network edge	Nginx, Varnish, HAProxy, Envoy, custom	Horizontal: add servers per PoP; add PoPs globally
Origin Shield / Mid-Tier Cache	Intermediate cache layer between edge and origin; reduces origin load	Cloudflare Tiered Cache, CloudFront Regional Edge Cache, Varnish	Single shield per region; multiple for multi-region
Origin Server	Serves authoritative content when cache misses occur	Any HTTP server, S3, GCS, Azure Blob Storage	Vertical + horizontal; auto-scaling groups
Cache Key Generator	Determines cache entry identity from URL, headers, cookies, query params	Custom logic per CDN; Vary header processing	Configure per content type; normalize query strings
TLS Termination	Decrypts HTTPS at the edge to enable caching and inspection	Let's Encrypt, ACM, Keyless SSL, custom certs	Automated cert provisioning; SNI for multi-tenant
Cache Invalidation Service	Purges stale content across all edge nodes	Surrogate-Key (Fastly), Cache-Tag (Cloudflare), API purge (CloudFront)	Event-driven purge via webhooks; tag-based for surgical invalidation
Load Balancer	Distributes traffic across origin servers and handles failover	AWS ALB/NLB, Cloudflare Load Balancing, HAProxy, Envoy	Health checks + weighted routing; active-passive for DR
Edge Compute Runtime	Executes custom logic at the edge (auth, A/B testing, personalization)	Cloudflare Workers, Lambda@Edge, Fastly Compute, Akamai EdgeWorkers	Stateless functions; scale with request volume; KV for state
Observability Stack	Monitors cache hit ratio, latency, error rates, origin health	Prometheus + Grafana, Datadog, Cloudflare Analytics, CloudWatch	Aggregate per-PoP metrics; alert on hit ratio drops
Request Coalescing	Collapses duplicate cache-miss requests to origin into a single fetch	Varnish (built-in coalescing; grace mode for stale serving), Nginx proxy_cache_lock, Cloudflare built-in	Automatic per-PoP; tune lock timeout for origin response time
Content Compression	Reduces transfer size for text-based assets at the edge	Brotli, gzip, zstd; on-the-fly or pre-compress	Compress at edge for dynamic; pre-compress at origin for static
WAF / DDoS Protection	Filters malicious traffic before it reaches origin	Cloudflare WAF, AWS Shield, Akamai Kona, Fastly Signal Sciences	Anycast absorbs volumetric; WAF rules for app-layer

Decision Tree

START
├── Primarily serving static assets (images, CSS, JS, fonts)?
│   ├── YES → Pull-based CDN with long TTLs + versioned URLs
│   │   ├── < 1K req/s → Single-provider CDN (Cloudflare free, CloudFront)
│   │   └── > 1K req/s → Add origin shield + tiered caching
│   └── NO ↓
├── Serving video or large file downloads?
│   ├── YES → Specialized media CDN with range-request support
│   │   ├── Live streaming → HLS/DASH with short-segment caching (2-6s TTL)
│   │   └── VOD → Long TTLs + byte-range caching + pre-warming
│   └── NO ↓
├── Need dynamic/personalized content at the edge?
│   ├── YES → Edge compute (Workers, Lambda@Edge, Fastly Compute)
│   │   ├── < 10 personalization dimensions → Cache variants with custom keys
│   │   └── Highly dynamic → ESI or edge compute assembly
│   └── NO ↓
├── Need multi-cloud or avoid vendor lock-in?
│   ├── YES → Multi-CDN with DNS-based traffic steering
│   └── NO ↓
└── DEFAULT → Pull-based CDN + origin shield + tag-based invalidation + stale-while-revalidate
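
The tree above can also be read as a chain of early returns. The sketch below is one possible encoding; the Workload fields and the recommend function are hypothetical names, and it assumes "highly dynamic" means ten or more personalization dimensions and that exactly 1K req/s falls in the lower tier, since the tree leaves both cases open.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a site's traffic, mirroring the tree's questions."""
    static_assets: bool = False
    media: bool = False
    live_streaming: bool = False
    edge_personalization: bool = False
    personalization_dimensions: int = 0
    multi_cdn: bool = False
    requests_per_second: float = 0.0


def recommend(w: Workload) -> str:
    """Walk the decision tree top to bottom and return the first matching branch."""
    if w.static_assets:
        plan = "Pull-based CDN with long TTLs + versioned URLs"
        # Assumption: exactly 1K req/s is treated as the lower tier.
        if w.requests_per_second > 1000:
            return plan + "; add origin shield + tiered caching"
        return plan + "; single-provider CDN"
    if w.media:
        if w.live_streaming:
            return "Media CDN; HLS/DASH with short-segment caching (2-6s TTL)"
        return "Media CDN; long TTLs + byte-range caching + pre-warming"
    if w.edge_personalization:
        # Assumption: 10 or more dimensions counts as "highly dynamic".
        if w.personalization_dimensions < 10:
            return "Edge compute; cache variants with custom keys"
        return "Edge compute; ESI or edge compute assembly"
    if w.multi_cdn:
        return "Multi-CDN with DNS-based traffic steering"
    return "Pull-based CDN + origin shield + tag-based invalidation + stale-while-revalidate"
```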

Step-by-Step Guide

1. Define your content taxonomy and caching strategy

Classify all content into caching tiers based on mutability and personalization. This determines TTLs, cache keys, and invalidation strategies for each content type. [src1]

content_tiers:
  immutable_assets:
    pattern: "*.{js,css,woff2,png,jpg,webp}"
    cache_control: "public, max-age=31536000, immutable"
    invalidation: "Deploy new filename (never purge)"
  semi_static:
    pattern: "*.html, /api/catalog"
    cache_control: "public, max-age=300, stale-while-revalidate=86400"
    invalidation: "Tag-based purge on publish"
  personalized:
    pattern: "/dashboard/*, /api/user/*"
    cache_control: "private, no-store"
  api_responses:
    pattern: "/api/v1/*"
    cache_control: "public, max-age=60, stale-if-error=300"
    invalidation: "Short TTL + event-driven purge"

Verify: curl -sI https://origin.example.com/assets/app.js | grep -i cache-control → expected: Cache-Control: public, max-age=31536000, immutable
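
To see how the directives in these tiers interact, the following sketch classifies a cached response by its age. It is a simplified model under stated assumptions: only max-age, stale-while-revalidate, stale-if-error, private, and no-store are considered (s-maxage, no-cache, and must-revalidate are ignored), and the function names are hypothetical.

```python
def parse_cache_control(header: str) -> dict:
    """Parse a Cache-Control header into {directive: int seconds or True}."""
    directives = {}
    for part in header.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, sep, value = part.partition("=")
        raw = value.strip('"')
        directives[name] = int(raw) if sep and raw.isdigit() else True
    return directives


def _seconds(directives: dict, name: str) -> int:
    """Return a directive's numeric value, or 0 if it is absent or has no number."""
    value = directives.get(name, 0)
    return value if isinstance(value, int) and not isinstance(value, bool) else 0


def cache_state(header: str, age_seconds: int, origin_failed: bool = False) -> str:
    """Classify how a shared cache may treat a stored response of the given age."""
    d = parse_cache_control(header)
    if d.get("no-store") or d.get("private"):
        return "bypass"  # a shared cache must not store or reuse this response
    max_age = _seconds(d, "max-age")
    if age_seconds < max_age:
        return "fresh"
    if age_seconds < max_age + _seconds(d, "stale-while-revalidate"):
        return "stale-while-revalidate"  # serve stale now, refresh in the background
    if origin_failed and age_seconds < max_age + _seconds(d, "stale-if-error"):
        return "stale-if-error"  # serve stale only because the origin is failing
    return "expired"
```

With the semi_static tier's header, a 10-minute-old page falls in the stale-while-revalidate window, so users get an immediate response while the edge refreshes in the background.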

2. Design the edge network topology

Choose between single-tier (edge-only) and multi-tier (edge + origin shield) based on your origin's capacity. For most production systems, a two-tier architecture is recommended. [src1] [src3]

         Client
           │ DNS (Anycast)
     Edge PoP (L1 Cache — hot content, small)
           │ Cache MISS
     Origin Shield (L2 Cache — warm content, large, collapses duplicate misses)
           │ Cache MISS (collapsed)
     Origin Server (authoritative source)

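Verify: Check response headers for CF-Cache-Status: HIT or X-Cache: Hit from cloudfront to confirm the edge is serving from cache. These headers show that a cache hit occurred, not which tier served it; to confirm the shield is working, compare origin request volume before and after enabling it.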
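
The lookup path in the diagram can be modeled in a few lines. This is a toy in-memory simulation, not a production cache: the class names, the LRU eviction policy, the capacities, and the PoP identifiers are all assumptions chosen for illustration, and TTLs are left out entirely.

```python
from collections import OrderedDict
from typing import Callable, Iterable, Optional, Tuple


class LRUCache:
    """Small least-recently-used cache standing in for one cache tier."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, key: str) -> Optional[bytes]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: bytes) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry


class TieredCDN:
    """Edge PoPs (small L1 caches) in front of one shared origin shield (large L2)."""

    def __init__(self, fetch_origin: Callable[[str], bytes], pops: Iterable[str],
                 edge_capacity: int = 2, shield_capacity: int = 100) -> None:
        self.fetch_origin = fetch_origin
        self.edges = {pop: LRUCache(edge_capacity) for pop in pops}
        self.shield = LRUCache(shield_capacity)
        self.origin_fetches = 0

    def get(self, pop: str, path: str) -> Tuple[bytes, str]:
        edge = self.edges[pop]
        body = edge.get(path)
        if body is not None:
            return body, "edge-hit"
        body = self.shield.get(path)
        if body is not None:
            edge.put(path, body)  # fill the edge on the way back
            return body, "shield-hit"
        body = self.fetch_origin(path)
        self.origin_fetches += 1
        self.shield.put(path, body)
        edge.put(path, body)
        return body, "origin-fetch"
```

In this model a second PoP requesting the same path is served by the shield, so the origin sees one fetch regardless of how many PoPs miss.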

3. Implement cache key normalization

Poor cache key design is the #1 cause of low hit ratios. Normalize query parameters and strip tracking parameters. [src2]

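# Nginx: strip marketing query params, normalize cache key
# Limitation: one map pass removes a single tracking parameter (the last match,
# because the leading group is greedy), can leave a leading "&" when that
# parameter comes first, and does not sort the remaining parameters.
map $args $normalized_args {
    default $args;
    "~*^(.*)(?:&|^)(utm_[^&]*|fbclid[^&]*|gclid[^&]*)(.*)$" "$1$3";
}
proxy_cache_key "$scheme$request_method$host$uri$normalized_args";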

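Verify: ?a=1&utm_source=news and ?a=1 should produce the same cache key. The map above does not reorder parameters, so ?b=2&a=1 and ?a=1&b=2 still produce different keys unless arguments are sorted elsewhere (for example in edge code).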
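
Full normalization (dropping every tracking parameter and sorting what remains) is easier to express in general-purpose code than in an Nginx map. The sketch below shows the idea in Python; normalize_url and the two tracking-parameter constants are illustrative names, and the tracking list simply mirrors the utm_*, fbclid, and gclid patterns used above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TRACKING_PREFIXES = ("utm_",)        # assumption: mirrors the Nginx pattern above
TRACKING_NAMES = {"fbclid", "gclid"}


def normalize_url(url: str) -> str:
    """Return a canonical URL for cache-key purposes.

    Drops tracking parameters, sorts the remaining ones, lowercases the host,
    and discards the fragment.
    """
    parts = urlsplit(url)
    params = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in TRACKING_NAMES
        and not name.lower().startswith(TRACKING_PREFIXES)
    ]
    params.sort()
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, urlencode(params), ""))
```

Sorting also reorders repeated parameters (a=2&a=1 becomes a=1&a=2), so skip the sort for origins where parameter order is significant.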

4. Set up cache invalidation with surrogate keys

Tag-based invalidation lets you surgically purge related content. Attach surrogate keys to every response from your origin. [src4] [src7]

# FastAPI: Attach surrogate keys for tag-based purging
@app.get("/products/{product_id}")
async def get_product(product_id: str, response: Response):
    product = await fetch_product(product_id)
    response.headers["Surrogate-Key"] = (
        f"product-{product_id} category-{product.category} all-products"
    )
    response.headers["Cache-Control"] = "public, max-age=3600, stale-while-revalidate=86400"
    return product

Verify: curl -sI https://cdn.example.com/products/123 | grep -i surrogate-key
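
On the cache side, tag-based purging amounts to keeping a reverse index from tag to cached URLs. The following is a minimal in-memory sketch of that idea; TaggedCache and its methods are hypothetical names, and real CDNs implement this differently and at far larger scale.

```python
from collections import defaultdict
from typing import Dict, Optional, Set


class TaggedCache:
    """Toy cache that indexes stored objects by their Surrogate-Key tags."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}
        self._urls_by_tag: Dict[str, Set[str]] = defaultdict(set)

    def store(self, url: str, body: bytes, surrogate_key_header: str) -> None:
        """Cache a response and record each space-separated tag against its URL."""
        self._objects[url] = body
        for tag in surrogate_key_header.split():
            self._urls_by_tag[tag].add(url)

    def get(self, url: str) -> Optional[bytes]:
        return self._objects.get(url)

    def purge_tag(self, tag: str) -> int:
        """Remove every cached object carrying the tag; return how many were removed."""
        purged = 0
        for url in self._urls_by_tag.pop(tag, set()):
            # The URL may already be gone if another tag purged it earlier.
            if self._objects.pop(url, None) is not None:
                purged += 1
        return purged
```

With the header from the FastAPI example, purging category-shoes would drop every product page in that category while leaving other categories cached.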

5. Implement request coalescing to prevent cache stampede

When a popular cache entry expires, hundreds of requests can overwhelm the origin. Configure request coalescing so only one request goes to origin. [src1]

# Nginx: Request coalescing with proxy_cache_lock
proxy_cache_lock on;              # Only 1 request to origin per cache key
proxy_cache_lock_timeout 5s;
proxy_cache_use_stale error timeout updating http_500 http_502 http_503;
proxy_cache_background_update on; # Refresh cache in background

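Verify: Simulate concurrent requests against an uncached URL with ab -n 100 -c 50 and check origin logs; expect a single origin request, provided the origin responds within proxy_cache_lock_timeout (requests that wait longer are passed through to origin).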

6. Configure edge compute for dynamic logic

Edge compute lets you run auth, A/B testing, and header manipulation without round-tripping to origin. [src5] [src6]

// Cloudflare Workers: Edge A/B testing with sticky assignment
export default {
  async fetch(request, env) {
    const cookies = request.headers.get("Cookie") || "";
    const match = cookies.match(/ab_variant=([AB])/);
    let variant = match ? match[1] : (Math.random() < 0.5 ? "A" : "B");
    const origins = { A: "https://origin-a.example.com", B: "https://origin-b.example.com" };
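    const url = new URL(request.url);
    // Forward the query string as well as the path so origin sees the full URL
    const response = await fetch(origins[variant] + url.pathname + url.search);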
    const newResponse = new Response(response.body, response);
    if (!match) newResponse.headers.append("Set-Cookie", `ab_variant=${variant}; Path=/; Max-Age=86400; Secure; HttpOnly`);
    return newResponse;
  },
};

Verify: curl -v https://cdn.example.com/test-page 2>&1 | grep -i 'set-cookie.*ab_variant'

Code Examples

Python: Programmatic cache purge via Cloudflare API

# Input:  Zone ID, API token, and a list of cache tags to purge
# Output: Purge confirmation from Cloudflare API
import httpx  # httpx==0.27.0

async def purge_by_tags(zone_id: str, api_token: str, tags: list[str]) -> dict:
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            f"https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache",
            headers={"Authorization": f"Bearer {api_token}"},
            json={"tags": tags},  # Up to 30 tags per request
        )
        resp.raise_for_status()
        return resp.json()
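
If a deploy touches more tags than one request allows, the list needs to be split before calling the purge function. The helper below is a small sketch of that step; batch_tags is a hypothetical name, and the default of 30 simply takes the per-request limit noted in the comment above at face value, so check the current provider limits before relying on it.

```python
from typing import Iterator, List


def batch_tags(tags: List[str], batch_size: int = 30) -> Iterator[List[str]]:
    """Yield de-duplicated tags in order, in chunks of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    unique = list(dict.fromkeys(tags))  # drop duplicates, keep first-seen order
    for start in range(0, len(unique), batch_size):
        yield unique[start:start + batch_size]
```

Each yielded batch can then be passed to purge_by_tags in turn, ideally with a short pause or retry policy to respect API rate limits.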

TypeScript: CloudFront invalidation via AWS SDK

// Input:  Array of path patterns to invalidate
// Output: Invalidation ID for tracking
import { CloudFrontClient, CreateInvalidationCommand } from "@aws-sdk/client-cloudfront";

const client = new CloudFrontClient({ region: "us-east-1" });
async function invalidate(distId: string, paths: string[]): Promise<string> {
  const cmd = new CreateInvalidationCommand({
    DistributionId: distId,
    InvalidationBatch: {
      CallerReference: `inv-${Date.now()}`,
      Paths: { Quantity: paths.length, Items: paths },
    },
  });
  const resp = await client.send(cmd);
  return resp.Invalidation?.Id ?? "unknown";
}

Go: Request coalescing with singleflight

// Input:  Concurrent requests for the same cache key
// Output: Single origin fetch, result shared across all waiters
import (
    "io"
    "net/http"

    "golang.org/x/sync/singleflight" // v0.6.0
)

var group singleflight.Group

func handler(w http.ResponseWriter, r *http.Request) {
    // Key on the full request URI: keying on the path alone would hand one
    // response to requests that differ only in their query string.
    key := r.URL.RequestURI()
    result, err, shared := group.Do(key, func() (interface{}, error) {
        resp, err := http.Get("http://origin:8080" + key)
        if err != nil { return nil, err }
        defer resp.Body.Close()
        return io.ReadAll(resp.Body)
    })
    if err != nil { http.Error(w, "Origin unavailable", http.StatusBadGateway); return }
    if shared { w.Header().Set("X-Coalesced", "true") }
    w.Write(result.([]byte))
}

Anti-Patterns

Wrong: Purging entire cache on every deploy

# ❌ BAD — nuclear purge destroys all cached content globally
curl -X POST ".../purge_cache" -d '{"purge_everything": true}'

Correct: Versioned filenames for assets, tag-based purge for HTML

# ✅ GOOD — only purge mutable content; assets use content-hash filenames
curl -X POST ".../purge_cache" -d '{"tags": ["page-html", "api-catalog"]}'

Wrong: Same TTL for all content types

# ❌ BAD — one TTL for everything
location / { add_header Cache-Control "public, max-age=3600"; }

Correct: Tiered TTLs based on content mutability

# ✅ GOOD — match TTL to content lifecycle
location ~* \.(js|css|woff2|png)$ { add_header Cache-Control "public, max-age=31536000, immutable"; }
location ~* \.html$              { add_header Cache-Control "public, max-age=300, stale-while-revalidate=86400"; }
location /api/                   { add_header Cache-Control "public, max-age=60, stale-if-error=300"; }

Wrong: Caching responses with Set-Cookie

# ❌ BAD — every user gets the same session cookie (session hijacking)
response.headers["Cache-Control"] = "public, max-age=3600"
response.set_cookie("session_id", generate_session())

Correct: Private/no-store for responses with cookies

# ✅ GOOD — user-specific responses must not be cached publicly
response.headers["Cache-Control"] = "private, no-store"
response.set_cookie("session_id", generate_session())

Wrong: Query string cache busting without normalization

<!-- ❌ BAD: version in the query string; every extra parameter (e.g. utm_*) creates another cache variant of the same file -->
<link rel="stylesheet" href="/style.css?v=1">

Correct: Content-hash filenames + strip tracking params

<!-- ✅ GOOD — content hash in filename; no query string needed -->
<link rel="stylesheet" href="/style.d4e5f6.css">

Common Pitfalls

Low cache hit ratio from excessive Vary headers: Vary: Cookie creates a separate entry per unique cookie value, and Vary: * never matches a stored response, so both effectively disable caching. Fix: Strip unnecessary Vary headers; vary only on headers that actually change the response, typically Accept-Encoding. [src1]
Cache stampede after TTL expiry: Popular resource expires across all PoPs simultaneously, hundreds of requests hit origin. Fix: stale-while-revalidate + proxy_cache_lock. [src2]
Mixed content after CDN adoption: Origin serves HTTP links while CDN terminates TLS. Fix: Force HTTPS at origin; set Content-Security-Policy: upgrade-insecure-requests. [src2]
Cache poisoning via unkeyed headers: Cache key ignores a header that affects the response (e.g., X-Forwarded-Host). Fix: Include all response-affecting headers in cache key or normalize at edge. [src1]
Stale content after origin update: CDN still serves old version because TTL has not expired. Fix: Active cache invalidation via CI/CD pipeline; do not rely solely on TTL. [src4]
CORS failures on CDN-served assets: CDN strips or does not forward CORS headers. Fix: Configure CDN to forward Access-Control-Allow-Origin from origin. [src2]
CloudFront invalidation cost surprise: $0.005/path after first 1,000 free/month. Fix: Versioned filenames for assets; batch HTML invalidations with wildcards. [src3]
Edge compute cold starts: Lambda@Edge adds 100-500ms on cold starts. Fix: Use isolate-based compute (Workers, Fastly Compute) for latency-sensitive paths. [src6]
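
The CloudFront invalidation pitfall is easy to quantify. The sketch below uses the figures quoted above (1,000 free paths per month, $0.005 per path afterwards) as defaults; the function name is hypothetical and current pricing should be checked before budgeting.

```python
def cloudfront_invalidation_cost(paths_this_month: int,
                                 free_paths: int = 1000,
                                 price_per_path: float = 0.005) -> float:
    """Estimate the monthly invalidation charge in USD for a number of paths."""
    if paths_this_month < 0:
        raise ValueError("paths_this_month must be non-negative")
    billable = max(0, paths_this_month - free_paths)
    return round(billable * price_per_path, 2)
```

For example, a pipeline that invalidates 500 individual paths on each of 50 deploys submits 25,000 paths in a month, of which 24,000 are billable.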

Diagnostic Commands

# Check cache status of a specific URL
curl -sI https://cdn.example.com/page | grep -iE 'cache-control|x-cache|cf-cache-status|age|vary'

# Test origin response without the CDN (request the origin directly; many CDNs ignore client no-cache headers)
curl -sI https://origin.example.com/page

# Verify surrogate keys on origin response
curl -sI https://origin.example.com/products/123 | grep -i surrogate-key

# Check compression is active
curl -sI https://cdn.example.com/style.css -H "Accept-Encoding: br,gzip" | grep -i content-encoding

# Trace CDN routing (which PoP served the request)
curl -sI https://cdn.example.com/ | grep -iE 'cf-ray|x-served-by|x-amz-cf-pop|server-timing'

When to Use / When Not to Use

Use When	Don't Use When	Use Instead
Serving static assets to a global audience	All users in same datacenter as origin	Local reverse proxy (Nginx/Varnish)
Reducing origin load during traffic spikes	Content is 100% personalized	Application-level caching (Redis) + SSR
Improving TTFB for geographically distributed users	Data sovereignty requires content in specific country	Regional deployment with geo-fencing
Protecting origin from DDoS via anycast	Real-time websocket or long-polling connections	Direct origin connection
Offloading TLS termination and cert management	Content changes on every request (no cacheability)	API gateway with rate limiting
Running lightweight edge logic (auth, redirects, A/B)	Complex SSR with database access at the edge	Edge-native DBs (D1, Turso) + full-stack edge framework

Important Caveats

For static-heavy sites, a CDN cache hit ratio below 80% usually indicates a configuration problem; the most common causes are excessive Vary headers, unstripped query parameters, and Set-Cookie on cacheable responses
Multi-CDN setups require careful DNS TTL management — if DNS TTL exceeds failover detection time, users continue hitting the failed CDN
Edge compute pricing varies dramatically: Cloudflare Workers is priced per request (included in plans), Lambda@Edge per request plus duration and memory, and Fastly Compute per request plus compute time
Cache invalidation propagation is not instant — Cloudflare ~30s globally, CloudFront 5-15 minutes, some CDNs only guarantee best effort
Origin shield adds latency to cache misses (extra hop) but dramatically reduces origin load — enable when origin protection matters more than miss latency
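
The trade-off between hit ratio, origin shield, and origin load comes down to simple arithmetic. The sketch below is a back-of-the-envelope model under the assumption that shield hits are independent of edge misses; the function name and the traffic figures in the usage note are hypothetical.

```python
def origin_rps(total_rps: float, edge_hit_ratio: float, shield_hit_ratio: float = 0.0) -> float:
    """Estimate requests per second reaching the origin.

    Edge misses are total_rps * (1 - edge_hit_ratio); the shield then absorbs
    shield_hit_ratio of those misses before they reach the origin.
    """
    for ratio in (edge_hit_ratio, shield_hit_ratio):
        if not 0.0 <= ratio <= 1.0:
            raise ValueError("hit ratios must be between 0 and 1")
    edge_misses = total_rps * (1.0 - edge_hit_ratio)
    return edge_misses * (1.0 - shield_hit_ratio)
```

At a hypothetical 10,000 req/s, an 80% edge hit ratio leaves roughly 2,000 req/s for the origin; a shield that absorbs 90% of those misses cuts that to roughly 200 req/s, at the cost of one extra hop on each miss.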