- Cache-Control: public, max-age=31536000, immutable for versioned static assets.
- Surrogate-Key headers for tag-based cache invalidation.
- Vary headers (Accept-Encoding, Accept-Language): ignoring Vary serves the wrong content to users with different capabilities.

| Component | Role | Technology Options | Scaling Strategy |
|---|---|---|---|
| DNS Resolution | Routes users to nearest PoP via geographic or latency-based routing | Anycast DNS, GeoDNS, Route 53, Cloudflare DNS | Anycast distributes automatically; add more PoPs for coverage |
| Edge Server (PoP) | Caches content and serves requests from the network edge | Nginx, Varnish, HAProxy, Envoy, custom | Horizontal: add servers per PoP; add PoPs globally |
| Origin Shield / Mid-Tier Cache | Intermediate cache layer between edge and origin; reduces origin load | Cloudflare Tiered Cache, CloudFront Regional Edge Cache, Varnish | Single shield per region; multiple for multi-region |
| Origin Server | Serves authoritative content when cache misses occur | Any HTTP server, S3, GCS, Azure Blob Storage | Vertical + horizontal; auto-scaling groups |
| Cache Key Generator | Determines cache entry identity from URL, headers, cookies, query params | Custom logic per CDN; Vary header processing | Configure per content type; normalize query strings |
| TLS Termination | Decrypts HTTPS at the edge to enable caching and inspection | Let's Encrypt, ACM, Keyless SSL, custom certs | Automated cert provisioning; SNI for multi-tenant |
| Cache Invalidation Service | Purges stale content across all edge nodes | Surrogate-Key (Fastly), Cache-Tag (Cloudflare), API purge (CloudFront) | Event-driven purge via webhooks; tag-based for surgical invalidation |
| Load Balancer | Distributes traffic across origin servers and handles failover | AWS ALB/NLB, Cloudflare Load Balancing, HAProxy, Envoy | Health checks + weighted routing; active-passive for DR |
| Edge Compute Runtime | Executes custom logic at the edge (auth, A/B testing, personalization) | Cloudflare Workers, Lambda@Edge, Fastly Compute, Akamai EdgeWorkers | Stateless functions; scale with request volume; KV for state |
| Observability Stack | Monitors cache hit ratio, latency, error rates, origin health | Prometheus + Grafana, Datadog, Cloudflare Analytics, CloudWatch | Aggregate per-PoP metrics; alert on hit ratio drops |
| Request Coalescing | Collapses duplicate cache-miss requests to origin into single fetch | Varnish grace mode, Nginx proxy_cache_lock, Cloudflare built-in | Automatic per-PoP; tune lock timeout for origin response time |
| Content Compression | Reduces transfer size for text-based assets at the edge | Brotli, gzip, zstd; on-the-fly or pre-compress | Compress at edge for dynamic; pre-compress at origin for static |
| WAF / DDoS Protection | Filters malicious traffic before it reaches origin | Cloudflare WAF, AWS Shield, Akamai Kona, Fastly Signal Sciences | Anycast absorbs volumetric; WAF rules for app-layer |
START
├── Primarily serving static assets (images, CSS, JS, fonts)?
│ ├── YES → Pull-based CDN with long TTLs + versioned URLs
│ │ ├── < 1K req/s → Single-provider CDN (Cloudflare free, CloudFront)
│ │ └── > 1K req/s → Add origin shield + tiered caching
│ └── NO ↓
├── Serving video or large file downloads?
│ ├── YES → Specialized media CDN with range-request support
│ │ ├── Live streaming → HLS/DASH with short-segment caching (2-6s TTL)
│ │ └── VOD → Long TTLs + byte-range caching + pre-warming
│ └── NO ↓
├── Need dynamic/personalized content at the edge?
│ ├── YES → Edge compute (Workers, Lambda@Edge, Fastly Compute)
│ │ ├── < 10 personalization dimensions → Cache variants with custom keys
│ │ └── Highly dynamic → ESI or edge compute assembly
│ └── NO ↓
├── Need multi-cloud or avoid vendor lock-in?
│ ├── YES → Multi-CDN with DNS-based traffic steering
│ └── NO ↓
└── DEFAULT → Pull-based CDN + origin shield + tag-based invalidation + stale-while-revalidate
Classify all content into caching tiers based on mutability and personalization. This determines TTLs, cache keys, and invalidation strategies for each content type. [src1]
content_tiers:
  immutable_assets:
    pattern: "*.{js,css,woff2,png,jpg,webp}"
    cache_control: "public, max-age=31536000, immutable"
    invalidation: "Deploy new filename (never purge)"
  semi_static:
    pattern: "*.html, /api/catalog"
    cache_control: "public, max-age=300, stale-while-revalidate=86400"
    invalidation: "Tag-based purge on publish"
  personalized:
    pattern: "/dashboard/*, /api/user/*"
    cache_control: "private, no-store"
  api_responses:
    pattern: "/api/v1/*"
    cache_control: "public, max-age=60, stale-if-error=300"
    invalidation: "Short TTL + event-driven purge"
Verify: curl -sI https://origin.example.com/assets/app.js | grep -i cache-control → expected: Cache-Control: public, max-age=31536000, immutable
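The tier table above can also be enforced in application code. Below is a minimal sketch (pattern strings and the `cache_control_for` helper are illustrative, not any framework's API) that maps request paths to the Cache-Control values defined in the tiers:

```python
# Minimal sketch: map request paths to the Cache-Control values from the
# content-tier table. Patterns and helper names are illustrative.
from fnmatch import fnmatch

# Ordered: first match wins, so the specific tiers precede the catch-alls.
CONTENT_TIERS = [
    ("personalized", ["/dashboard/*", "/api/user/*"], "private, no-store"),
    ("api_responses", ["/api/v1/*"], "public, max-age=60, stale-if-error=300"),
    ("immutable_assets",
     ["*.js", "*.css", "*.woff2", "*.png", "*.jpg", "*.webp"],
     "public, max-age=31536000, immutable"),
    ("semi_static", ["*.html", "/api/catalog"],
     "public, max-age=300, stale-while-revalidate=86400"),
]

def cache_control_for(path: str) -> str:
    """Return the Cache-Control value for a request path; default to no-store."""
    for _tier, patterns, header in CONTENT_TIERS:
        if any(fnmatch(path, p) for p in patterns):
            return header
    return "no-store"
```

The list is ordered so that a personalized path like /dashboard/app.js is never caught by the asset pattern.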
Choose between single-tier (edge-only) and multi-tier (edge + origin shield) based on your origin's capacity. For most production systems, a two-tier architecture is recommended. [src1] [src3]
Client
│ DNS (Anycast)
Edge PoP (L1 Cache — hot content, small)
│ Cache MISS
Origin Shield (L2 Cache — warm content, large, collapses duplicate misses)
│ Cache MISS (collapsed)
Origin Server (authoritative source)
Verify: Check response headers for CF-Cache-Status: HIT or X-Cache: Hit from cloudfront to confirm multi-tier caching is active.
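For scripted checks, a small helper can classify responses from the vendor headers mentioned above (CF-Cache-Status, X-Cache, Age). A sketch; the simplification that any non-HIT Cloudflare status counts as a miss is an assumption of this example:

```python
# Minimal sketch: classify a response's cache status from common CDN headers.
def cache_status(headers: dict[str, str]) -> str:
    """Return 'HIT', 'MISS', or 'UNKNOWN' from vendor cache headers."""
    h = {k.lower(): v for k, v in headers.items()}
    if "cf-cache-status" in h:  # Cloudflare (EXPIRED/BYPASS etc. treated as MISS here)
        return "HIT" if h["cf-cache-status"] == "HIT" else "MISS"
    if "x-cache" in h:          # CloudFront / Fastly, e.g. "Hit from cloudfront"
        return "HIT" if "hit" in h["x-cache"].lower() else "MISS"
    if "age" in h:              # Generic: a nonzero Age implies a cached copy
        return "HIT" if int(h["age"]) > 0 else "MISS"
    return "UNKNOWN"
```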
Poor cache key design is the #1 cause of low hit ratios. Normalize query parameters and strip tracking parameters. [src2]
# Nginx: strip marketing query params, normalize cache key
# Note: this one-pass regex removes a single tracking parameter per request;
# chain additional maps (or use Lua/njs) to strip multiples.
map $args $normalized_args {
    default $args;
    "~*^(.*)(?:&|^)(utm_[^&]*|fbclid[^&]*|gclid[^&]*)(.*)$" "$1$3";
}
proxy_cache_key "$scheme$request_method$host$uri$normalized_args";
Verify: ?a=1&utm_source=x and ?a=1 should produce the same cache key. Note that this map does not reorder parameters, so ?b=2&a=1 and ?a=1&b=2 remain distinct keys; sorting requires Lua/njs or edge compute.
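Full normalization, stripping tracking parameters and sorting the rest, is easier in application or edge code than in a one-pass Nginx regex. A minimal sketch (the helper name and prefix list are illustrative):

```python
# Minimal sketch of full cache-key normalization: strip tracking params
# AND sort the remainder so parameter order cannot fragment the cache.
from urllib.parse import parse_qsl, urlencode, urlsplit

TRACKING_PREFIXES = ("utm_", "fbclid", "gclid")

def normalized_cache_key(url: str) -> str:
    """Return path?query with tracking params removed and params sorted."""
    parts = urlsplit(url)
    params = [
        (k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if not k.startswith(TRACKING_PREFIXES)
    ]
    query = urlencode(sorted(params))
    return parts.path + ("?" + query if query else "")
```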
Tag-based invalidation lets you surgically purge related content. Attach surrogate keys to every response from your origin. [src4] [src7]
# FastAPI: Attach surrogate keys for tag-based purging
from fastapi import FastAPI, Response

app = FastAPI()

@app.get("/products/{product_id}")
async def get_product(product_id: str, response: Response):
    product = await fetch_product(product_id)
    response.headers["Surrogate-Key"] = (
        f"product-{product_id} category-{product.category} all-products"
    )
    response.headers["Cache-Control"] = "public, max-age=3600, stale-while-revalidate=86400"
    return product
Verify: curl -sI https://cdn.example.com/products/123 | grep -i surrogate-key
When a popular cache entry expires, hundreds of requests can overwhelm the origin. Configure request coalescing so only one request goes to origin. [src1]
# Nginx: Request coalescing with proxy_cache_lock
proxy_cache_lock on; # Only 1 request to origin per cache key
proxy_cache_lock_timeout 5s;
proxy_cache_use_stale error timeout updating http_500 http_502 http_503;
proxy_cache_background_update on; # Refresh cache in background
Verify: Simulate concurrent requests with ab -n 100 -c 50 and check origin logs — should see only 1 request.
Edge compute lets you run auth, A/B testing, and header manipulation without round-tripping to origin. [src5] [src6]
// Cloudflare Workers: Edge A/B testing with sticky assignment
export default {
  async fetch(request, env) {
    const cookies = request.headers.get("Cookie") || "";
    const match = cookies.match(/ab_variant=([AB])/);
    const variant = match ? match[1] : (Math.random() < 0.5 ? "A" : "B");
    const origins = { A: "https://origin-a.example.com", B: "https://origin-b.example.com" };
    const url = new URL(request.url);
    // Preserve the query string when proxying to the assigned origin
    const response = await fetch(origins[variant] + url.pathname + url.search);
    const newResponse = new Response(response.body, response);
    if (!match) {
      newResponse.headers.append(
        "Set-Cookie",
        `ab_variant=${variant}; Path=/; Max-Age=86400; Secure; HttpOnly`
      );
    }
    return newResponse;
  },
};
Verify: curl -v https://cdn.example.com/test-page 2>&1 | grep -i 'set-cookie.*ab_variant'
# Input: List of URLs or cache tags to purge
# Output: Purge confirmation from Cloudflare API
import httpx  # httpx==0.27.0

async def purge_by_tags(zone_id: str, api_token: str, tags: list[str]) -> dict:
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            f"https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache",
            headers={"Authorization": f"Bearer {api_token}"},
            json={"tags": tags},  # Up to 30 tags per request
        )
        resp.raise_for_status()
        return resp.json()
// Input: Array of path patterns to invalidate
// Output: Invalidation ID for tracking
import { CloudFrontClient, CreateInvalidationCommand } from "@aws-sdk/client-cloudfront";

const client = new CloudFrontClient({ region: "us-east-1" });

async function invalidate(distId: string, paths: string[]): Promise<string> {
  const cmd = new CreateInvalidationCommand({
    DistributionId: distId,
    InvalidationBatch: {
      CallerReference: `inv-${Date.now()}`,
      Paths: { Quantity: paths.length, Items: paths },
    },
  });
  const resp = await client.send(cmd);
  return resp.Invalidation?.Id ?? "unknown";
}
// Input: Concurrent requests for the same cache key
// Output: Single origin fetch, result shared across all waiters
package main

import (
    "io"
    "net/http"

    "golang.org/x/sync/singleflight" // v0.6.0
)

var group singleflight.Group

func handler(w http.ResponseWriter, r *http.Request) {
    result, err, shared := group.Do(r.URL.Path, func() (interface{}, error) {
        resp, err := http.Get("http://origin:8080" + r.URL.RequestURI())
        if err != nil {
            return nil, err
        }
        defer resp.Body.Close()
        return io.ReadAll(resp.Body)
    })
    if err != nil {
        http.Error(w, "Origin unavailable", http.StatusBadGateway)
        return
    }
    if shared {
        w.Header().Set("X-Coalesced", "true")
    }
    w.Write(result.([]byte))
}
# ❌ BAD — nuclear purge destroys all cached content globally
curl -X POST ".../purge_cache" -d '{"purge_everything": true}'
# ✅ GOOD — only purge mutable content; assets use content-hash filenames
curl -X POST ".../purge_cache" -d '{"tags": ["page-html", "api-catalog"]}'
# ❌ BAD — one TTL for everything
location / { add_header Cache-Control "public, max-age=3600"; }
# ✅ GOOD — match TTL to content lifecycle
location ~* \.(js|css|woff2|png)$ { add_header Cache-Control "public, max-age=31536000, immutable"; }
location ~* \.html$ { add_header Cache-Control "public, max-age=300, stale-while-revalidate=86400"; }
location /api/ { add_header Cache-Control "public, max-age=60, stale-if-error=300"; }
# ❌ BAD — every user gets the same session cookie (session hijacking)
response.headers["Cache-Control"] = "public, max-age=3600"
response.set_cookie("session_id", generate_session())
# ✅ GOOD — user-specific responses must not be cached publicly
response.headers["Cache-Control"] = "private, no-store"
response.set_cookie("session_id", generate_session())
<!-- ❌ BAD — query-string versioning: some caches ignore query strings, and stray params (utm_*) multiply cache variants -->
<link rel="stylesheet" href="/style.css?v=1">
<!-- ✅ GOOD — content hash in filename; no query string needed -->
<link rel="stylesheet" href="/style.d4e5f6.css">
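The content-hash filename in the good example can be generated at build time. A minimal sketch (function name and 6-character truncation are arbitrary choices, not any bundler's API):

```python
# Minimal sketch of content-hash asset naming at build time,
# producing names like style.d4e5f6.css from the file contents.
import hashlib
from pathlib import Path

def hashed_name(filename: str, content: bytes, length: int = 6) -> str:
    """Insert a truncated content hash between the stem and suffix."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    p = Path(filename)
    return f"{p.stem}.{digest}{p.suffix}"
```

Because the name changes whenever the bytes change, these assets can be served with max-age=31536000, immutable and never purged.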
| Pitfall | Fix |
|---|---|
| Vary: * or Vary: Cookie creates a separate entry per unique value, effectively disabling caching | Strip unnecessary Vary headers; only vary on Accept-Encoding [src1] |
| Origin overwhelmed when a hot entry expires | stale-while-revalidate + proxy_cache_lock [src2] |
| Mixed-content errors behind the CDN | Force HTTPS at origin; set Content-Security-Policy: upgrade-insecure-requests [src2] |
| Wrong variant served to some users | Include all response-affecting headers in cache key or normalize at edge [src1] |
| Stale content after deploys | Active cache invalidation via CI/CD pipeline; do not rely solely on TTL [src4] |
| CORS failures through the CDN | Configure CDN to forward Access-Control-Allow-Origin from origin [src2] |
| Slow invalidation propagation | Versioned filenames for assets; batch HTML invalidations with wildcards [src3] |
| Cold-start latency in edge functions | Use isolate-based compute (Workers, Fastly Compute) for latency-sensitive paths [src6] |

# Check cache status of a specific URL
curl -sI https://cdn.example.com/page | grep -iE 'cache-control|x-cache|cf-cache-status|age|vary'
# Test origin response without CDN (bypass cache)
curl -sI https://cdn.example.com/page -H "Cache-Control: no-cache" -H "Pragma: no-cache"
# Verify surrogate keys on origin response
curl -sI https://origin.example.com/products/123 | grep -i surrogate-key
# Check compression is active
curl -sI https://cdn.example.com/style.css -H "Accept-Encoding: br,gzip" | grep -i content-encoding
# Trace CDN routing (which PoP served the request)
curl -sI https://cdn.example.com/ | grep -iE 'cf-ray|x-served-by|x-amz-cf-pop|server-timing'
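Statuses sampled with the commands above can feed the "alert on hit ratio drops" strategy from the component table. A minimal sketch (the helper name is illustrative; input is a list of CF-Cache-Status / X-Cache values):

```python
# Minimal sketch: compute cache hit ratio from sampled cache-status values,
# e.g. CF-Cache-Status or X-Cache headers collected with curl.
def hit_ratio(statuses: list[str]) -> float:
    """Fraction of samples whose status contains 'hit' (case-insensitive)."""
    if not statuses:
        return 0.0
    hits = sum(1 for s in statuses if "hit" in s.lower())
    return hits / len(statuses)
```

A sustained drop in this ratio usually points at cache-key fragmentation or an overly aggressive purge.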
| Use When | Don't Use When | Use Instead |
|---|---|---|
| Serving static assets to a global audience | All users in same datacenter as origin | Local reverse proxy (Nginx/Varnish) |
| Reducing origin load during traffic spikes | Content is 100% personalized | Application-level caching (Redis) + SSR |
| Improving TTFB for geographically distributed users | Data sovereignty requires content in specific country | Regional deployment with geo-fencing |
| Protecting origin from DDoS via anycast | Real-time websocket or long-polling connections | Direct origin connection |
| Offloading TLS termination and cert management | Content changes on every request (no cacheability) | API gateway with rate limiting |
| Running lightweight edge logic (auth, redirects, A/B) | Complex SSR with database access at the edge | Edge-native DBs (D1, Turso) + full-stack edge framework |