How to Design a Content Delivery Network (CDN)

Type: Software Reference Confidence: 0.92 Sources: 7 Verified: 2026-02-23 Freshness: quarterly

TL;DR

Constraints

Quick Reference

| Component | Role | Technology Options | Scaling Strategy |
|---|---|---|---|
| DNS Resolution | Routes users to nearest PoP via geographic or latency-based routing | Anycast DNS, GeoDNS, Route 53, Cloudflare DNS | Anycast distributes automatically; add more PoPs for coverage |
| Edge Server (PoP) | Caches content and serves requests from the network edge | Nginx, Varnish, HAProxy, Envoy, custom | Horizontal: add servers per PoP; add PoPs globally |
| Origin Shield / Mid-Tier Cache | Intermediate cache layer between edge and origin; reduces origin load | Cloudflare Tiered Cache, CloudFront Regional Edge Cache, Varnish | Single shield per region; multiple for multi-region |
| Origin Server | Serves authoritative content when cache misses occur | Any HTTP server, S3, GCS, Azure Blob Storage | Vertical + horizontal; auto-scaling groups |
| Cache Key Generator | Determines cache entry identity from URL, headers, cookies, query params | Custom logic per CDN; Vary header processing | Configure per content type; normalize query strings |
| TLS Termination | Decrypts HTTPS at the edge to enable caching and inspection | Let's Encrypt, ACM, Keyless SSL, custom certs | Automated cert provisioning; SNI for multi-tenant |
| Cache Invalidation Service | Purges stale content across all edge nodes | Surrogate-Key (Fastly), Cache-Tag (Cloudflare), API purge (CloudFront) | Event-driven purge via webhooks; tag-based for surgical invalidation |
| Load Balancer | Distributes traffic across origin servers and handles failover | AWS ALB/NLB, Cloudflare Load Balancing, HAProxy, Envoy | Health checks + weighted routing; active-passive for DR |
| Edge Compute Runtime | Executes custom logic at the edge (auth, A/B testing, personalization) | Cloudflare Workers, Lambda@Edge, Fastly Compute, Akamai EdgeWorkers | Stateless functions; scale with request volume; KV for state |
| Observability Stack | Monitors cache hit ratio, latency, error rates, origin health | Prometheus + Grafana, Datadog, Cloudflare Analytics, CloudWatch | Aggregate per-PoP metrics; alert on hit ratio drops |
| Request Coalescing | Collapses duplicate cache-miss requests to origin into single fetch | Varnish grace mode, Nginx proxy_cache_lock, Cloudflare built-in | Automatic per-PoP; tune lock timeout for origin response time |
| Content Compression | Reduces transfer size for text-based assets at the edge | Brotli, gzip, zstd; on-the-fly or pre-compress | Compress at edge for dynamic; pre-compress at origin for static |
| WAF / DDoS Protection | Filters malicious traffic before it reaches origin | Cloudflare WAF, AWS Shield, Akamai Kona, Fastly Signal Sciences | Anycast absorbs volumetric; WAF rules for app-layer |

Decision Tree

START
├── Primarily serving static assets (images, CSS, JS, fonts)?
│   ├── YES → Pull-based CDN with long TTLs + versioned URLs
│   │   ├── < 1K req/s → Single-provider CDN (Cloudflare free, CloudFront)
│   │   └── > 1K req/s → Add origin shield + tiered caching
│   └── NO ↓
├── Serving video or large file downloads?
│   ├── YES → Specialized media CDN with range-request support
│   │   ├── Live streaming → HLS/DASH with short-segment caching (2-6s TTL)
│   │   └── VOD → Long TTLs + byte-range caching + pre-warming
│   └── NO ↓
├── Need dynamic/personalized content at the edge?
│   ├── YES → Edge compute (Workers, Lambda@Edge, Fastly Compute)
│   │   ├── < 10 personalization dimensions → Cache variants with custom keys
│   │   └── Highly dynamic → ESI or edge compute assembly
│   └── NO ↓
├── Need multi-cloud or avoid vendor lock-in?
│   ├── YES → Multi-CDN with DNS-based traffic steering
│   └── NO ↓
└── DEFAULT → Pull-based CDN + origin shield + tag-based invalidation + stale-while-revalidate

Step-by-Step Guide

1. Define your content taxonomy and caching strategy

Classify all content into caching tiers based on mutability and personalization. This determines TTLs, cache keys, and invalidation strategies for each content type. [src1]

content_tiers:
  immutable_assets:
    pattern: "*.{js,css,woff2,png,jpg,webp}"
    cache_control: "public, max-age=31536000, immutable"
    invalidation: "Deploy new filename (never purge)"
  semi_static:
    pattern: "*.html, /api/catalog"
    cache_control: "public, max-age=300, stale-while-revalidate=86400"
    invalidation: "Tag-based purge on publish"
  personalized:
    pattern: "/dashboard/*, /api/user/*"
    cache_control: "private, no-store"
  api_responses:
    pattern: "/api/v1/*"
    cache_control: "public, max-age=60, stale-if-error=300"
    invalidation: "Short TTL + event-driven purge"

Verify: curl -sI https://origin.example.com/assets/app.js | grep -i cache-control → expected: Cache-Control: public, max-age=31536000, immutable
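The tier table above can double as an executable policy. A minimal Python sketch, mirroring the YAML config (the pattern subset and the safe default are illustrative assumptions), that returns the expected Cache-Control for a given path so a CI job can diff it against live headers:

```python
# Map request paths to the expected Cache-Control value from the taxonomy above.
# Patterns mirror the YAML config; order matters (personalized tiers first).
import fnmatch

TIER_POLICIES = [
    ("/dashboard/*", "private, no-store"),
    ("/api/user/*", "private, no-store"),
    ("/api/v1/*", "public, max-age=60, stale-if-error=300"),
    ("*.html", "public, max-age=300, stale-while-revalidate=86400"),
    ("*.js", "public, max-age=31536000, immutable"),
    ("*.css", "public, max-age=31536000, immutable"),
]

def expected_cache_control(path: str) -> str:
    """Return the Cache-Control value the taxonomy prescribes for a path."""
    for pattern, policy in TIER_POLICIES:
        if fnmatch.fnmatch(path, pattern):
            return policy
    return "private, no-store"  # safe default for unclassified content
```

Running this against `curl -sI` output in CI catches regressions where a deploy silently drops the immutable directive.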

2. Design the edge network topology

Choose between single-tier (edge-only) and multi-tier (edge + origin shield) based on your origin's capacity. For most production systems, a two-tier architecture is recommended. [src1] [src3]

         Client
           │ DNS (Anycast)
     Edge PoP (L1 Cache — hot content, small)
           │ Cache MISS
     Origin Shield (L2 Cache — warm content, large, collapses duplicate misses)
           │ Cache MISS (collapsed)
     Origin Server (authoritative source)

Verify: Check response headers for CF-Cache-Status: HIT or X-Cache: Hit from cloudfront to confirm multi-tier caching is active.
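Beyond spot-checking single responses, it helps to sample many and compute the edge hit ratio. A small sketch assuming Cloudflare's CF-Cache-Status values (HIT, STALE, MISS, and so on); the sample structure is illustrative:

```python
# Compute the edge cache hit ratio from sampled response headers.
# STALE counts as a hit: the edge answered without a synchronous origin fetch.
from collections import Counter

def hit_ratio(samples: list[dict]) -> float:
    """Fraction of sampled responses served from cache (HIT or STALE)."""
    statuses = Counter(
        s.get("CF-Cache-Status", "UNKNOWN").upper() for s in samples
    )
    hits = statuses["HIT"] + statuses["STALE"]
    total = sum(statuses.values())
    return hits / total if total else 0.0
```

A sustained drop in this ratio after a deploy usually points at a cache-key or TTL regression rather than a traffic change.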

3. Implement cache key normalization

Poor cache key design is the #1 cause of low hit ratios. Normalize query parameters and strip tracking parameters. [src2]

# Nginx: strip marketing query params from the cache key
# Note: this map strips one tracking param per evaluation and can leave a
# stray "&"; with multiple tracking params, chain maps or normalize at the origin.
map $args $normalized_args {
    default $args;
    "~*^(.*)(?:&|^)(utm_[^&]*|fbclid[^&]*|gclid[^&]*)(.*)$" "$1$3";
}
proxy_cache_key "$scheme$request_method$host$uri$normalized_args";

Verify: /page?a=1&utm_source=x and /page?a=1 should produce the same cache key. Note this map does not reorder parameters, so ?b=2&a=1 and ?a=1&b=2 still cache separately; sorting requires logic outside plain Nginx config.
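Normalization can also live at the application or build layer, where it is easy to both strip tracking parameters and sort the remainder (something plain Nginx config cannot do). A Python sketch; the parameter lists are illustrative:

```python
# Normalize a query string for cache-key purposes: drop tracking params,
# then sort the rest so parameter order never splits the cache.
from urllib.parse import parse_qsl, urlencode

TRACKING_PREFIXES = ("utm_",)
TRACKING_PARAMS = {"fbclid", "gclid"}

def normalize_query(qs: str) -> str:
    """Return a canonical query string safe to use in a cache key."""
    kept = [
        (k, v)
        for k, v in parse_qsl(qs, keep_blank_values=True)
        if k not in TRACKING_PARAMS and not k.startswith(TRACKING_PREFIXES)
    ]
    return urlencode(sorted(kept))
```

With this, ?b=2&a=1 and ?a=1&b=2 canonicalize to the same key, and utm_/fbclid/gclid variants collapse entirely.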

4. Set up cache invalidation with surrogate keys

Tag-based invalidation lets you surgically purge related content. Attach surrogate keys to every response from your origin. [src4] [src7]

# FastAPI: Attach surrogate keys for tag-based purging
@app.get("/products/{product_id}")
async def get_product(product_id: str, response: Response):
    product = await fetch_product(product_id)
    response.headers["Surrogate-Key"] = (
        f"product-{product_id} category-{product.category} all-products"
    )
    response.headers["Cache-Control"] = "public, max-age=3600, stale-while-revalidate=86400"
    return product

Verify: curl -sI https://cdn.example.com/products/123 | grep -i surrogate-key

5. Implement request coalescing to prevent cache stampede

When a popular cache entry expires, hundreds of requests can overwhelm the origin. Configure request coalescing so only one request goes to origin. [src1]

# Nginx: Request coalescing with proxy_cache_lock
proxy_cache_lock on;              # Only 1 request to origin per cache key
proxy_cache_lock_timeout 5s;
proxy_cache_use_stale error timeout updating http_500 http_502 http_503;
proxy_cache_background_update on; # Refresh cache in background

Verify: Simulate concurrent requests with ab -n 100 -c 50 and check origin logs — expect a single origin request per cache key (a few extras are possible if proxy_cache_lock_timeout expires before the origin responds).

6. Configure edge compute for dynamic logic

Edge compute lets you run auth, A/B testing, and header manipulation without round-tripping to origin. [src5] [src6]

// Cloudflare Workers: Edge A/B testing with sticky assignment
export default {
  async fetch(request, env) {
    const cookies = request.headers.get("Cookie") || "";
    const match = cookies.match(/ab_variant=([AB])/);
    const variant = match ? match[1] : (Math.random() < 0.5 ? "A" : "B");
    const origins = { A: "https://origin-a.example.com", B: "https://origin-b.example.com" };
    const url = new URL(request.url);
    // Preserve both path and query string when proxying to the variant origin
    const response = await fetch(origins[variant] + url.pathname + url.search);
    const newResponse = new Response(response.body, response);
    if (!match) newResponse.headers.append("Set-Cookie", `ab_variant=${variant}; Path=/; Max-Age=86400; Secure; HttpOnly`);
    return newResponse;
  },
};

Verify: curl -v https://cdn.example.com/test-page 2>&1 | grep -i 'set-cookie.*ab_variant'
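Cookie-based assignment as above leaves the very first request to a coin flip; when a stable identifier is available, hashing it gives deterministic, sticky bucketing with no cookie state at all. A Python sketch (assign_variant and its parameters are illustrative):

```python
# Deterministic A/B bucketing: the same (experiment, user) pair always maps
# to the same variant, and a uniform hash keeps buckets balanced.
import hashlib

def assign_variant(user_id: str, experiment: str, variants=("A", "B")) -> str:
    """Hash a stable identifier into one of the variant buckets."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]
```

Keying on the experiment name as well means re-running an experiment under a new name reshuffles users, avoiding carry-over bias between tests.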

Code Examples

Python: Programmatic cache purge via Cloudflare API

# Input:  List of URLs or cache tags to purge
# Output: Purge confirmation from Cloudflare API
import httpx  # httpx==0.27.0

async def purge_by_tags(zone_id: str, api_token: str, tags: list[str]) -> dict:
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            f"https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache",
            headers={"Authorization": f"Bearer {api_token}"},
            json={"tags": tags},  # Up to 30 tags per request
        )
        resp.raise_for_status()
        return resp.json()

TypeScript: CloudFront invalidation via AWS SDK

// Input:  Array of path patterns to invalidate
// Output: Invalidation ID for tracking
import { CloudFrontClient, CreateInvalidationCommand } from "@aws-sdk/client-cloudfront";

const client = new CloudFrontClient({ region: "us-east-1" });
async function invalidate(distId: string, paths: string[]): Promise<string> {
  const cmd = new CreateInvalidationCommand({
    DistributionId: distId,
    InvalidationBatch: {
      CallerReference: `inv-${Date.now()}`,
      Paths: { Quantity: paths.length, Items: paths },
    },
  });
  const resp = await client.send(cmd);
  return resp.Invalidation?.Id ?? "unknown";
}

Go: Request coalescing with singleflight

// Input:  Concurrent requests for the same cache key
// Output: Single origin fetch, result shared across all waiters
import (
    "io"
    "net/http"

    "golang.org/x/sync/singleflight" // v0.6.0
)

var group singleflight.Group

func handler(w http.ResponseWriter, r *http.Request) {
    result, err, shared := group.Do(r.URL.Path, func() (interface{}, error) {
        resp, err := http.Get("http://origin:8080" + r.URL.RequestURI())
        if err != nil { return nil, err }
        defer resp.Body.Close()
        return io.ReadAll(resp.Body)
    })
    if err != nil { http.Error(w, "Origin unavailable", 502); return }
    if shared { w.Header().Set("X-Coalesced", "true") }
    w.Write(result.([]byte))
}

Anti-Patterns

Wrong: Purging entire cache on every deploy

# ❌ BAD — nuclear purge destroys all cached content globally
curl -X POST ".../purge_cache" -d '{"purge_everything": true}'

Correct: Versioned filenames for assets, tag-based purge for HTML

# ✅ GOOD — only purge mutable content; assets use content-hash filenames
curl -X POST ".../purge_cache" -d '{"tags": ["page-html", "api-catalog"]}'

Wrong: Same TTL for all content types

# ❌ BAD — one TTL for everything
location / { add_header Cache-Control "public, max-age=3600"; }

Correct: Tiered TTLs based on content mutability

# ✅ GOOD — match TTL to content lifecycle
location ~* \.(js|css|woff2|png)$ { add_header Cache-Control "public, max-age=31536000, immutable"; }
location ~* \.html$              { add_header Cache-Control "public, max-age=300, stale-while-revalidate=86400"; }
location /api/                   { add_header Cache-Control "public, max-age=60, stale-if-error=300"; }

Wrong: Caching responses with Set-Cookie

# ❌ BAD — every user gets the same session cookie (session hijacking)
response.headers["Cache-Control"] = "public, max-age=3600"
response.set_cookie("session_id", generate_session())

Correct: Private/no-store for responses with cookies

# ✅ GOOD — user-specific responses must not be cached publicly
response.headers["Cache-Control"] = "private, no-store"
response.set_cookie("session_id", generate_session())

Wrong: Query string cache busting without normalization

<!-- ❌ BAD — utm params create infinite cache variants -->
<link rel="stylesheet" href="/style.css?v=1">

Correct: Content-hash filenames + strip tracking params

<!-- ✅ GOOD — content hash in filename; no query string needed -->
<link rel="stylesheet" href="/style.d4e5f6.css">

Common Pitfalls

Diagnostic Commands

# Check cache status of a specific URL
curl -sI https://cdn.example.com/page | grep -iE 'cache-control|x-cache|cf-cache-status|age|vary'

# Request revalidation (note: most CDNs ignore client Cache-Control headers; to truly bypass the cache, hit the origin directly or append a unique query param)
curl -sI https://cdn.example.com/page -H "Cache-Control: no-cache" -H "Pragma: no-cache"

# Verify surrogate keys on origin response
curl -sI https://origin.example.com/products/123 | grep -i surrogate-key

# Check compression is active
curl -sI https://cdn.example.com/style.css -H "Accept-Encoding: br,gzip" | grep -i content-encoding

# Trace CDN routing (which PoP served the request)
curl -sI https://cdn.example.com/ | grep -iE 'cf-ray|x-served-by|x-amz-cf-pop|server-timing'

When to Use / When Not to Use

| Use When | Don't Use When | Use Instead |
|---|---|---|
| Serving static assets to a global audience | All users in same datacenter as origin | Local reverse proxy (Nginx/Varnish) |
| Reducing origin load during traffic spikes | Content is 100% personalized | Application-level caching (Redis) + SSR |
| Improving TTFB for geographically distributed users | Data sovereignty requires content in specific country | Regional deployment with geo-fencing |
| Protecting origin from DDoS via anycast | Real-time websocket or long-polling connections | Direct origin connection |
| Offloading TLS termination and cert management | Content changes on every request (no cacheability) | API gateway with rate limiting |
| Running lightweight edge logic (auth, redirects, A/B) | Complex SSR with database access at the edge | Edge-native DBs (D1, Turso) + full-stack edge framework |

Important Caveats

Related Units