URL Shortener System Design at Scale

Type: Software Reference Confidence: 0.93 Sources: 7 Verified: 2026-02-23 Freshness: 2026-02-23

TL;DR

Constraints

Quick Reference

| Component | Role | Technology Options | Scaling Strategy |
|---|---|---|---|
| Load Balancer | Distributes traffic across app servers | AWS ALB, Nginx, HAProxy | Geo-DNS routing + multiple LBs per region |
| API Gateway | Rate limiting, auth, request routing | Kong, AWS API Gateway | Horizontal scaling behind LB |
| URL Creation Service | Accepts long URLs, generates short codes | Go, Java, Node.js | Stateless; scale horizontally |
| Redirect Service | Resolves short code to long URL, returns 302 | Go, Rust (low latency) | Separate from write path; scale independently |
| ID Generator | Produces globally unique IDs for base62 encoding | Snowflake, Zookeeper counter ranges | Pre-allocate ID ranges per node to avoid coordination |
| Cache Layer | Stores hot short-to-long URL mappings | Redis Cluster, Memcached | LRU eviction; 20% of URLs serve 80% of traffic |
| Primary Database | Persistent URL mapping storage | Cassandra, DynamoDB, PostgreSQL | Hash-based sharding on short_code |
| CDN / Edge Cache | Caches redirects at edge locations | Cloudflare, CloudFront | Cache 302 responses with short TTL (5-60 min) |
| Analytics Pipeline | Captures click events asynchronously | Kafka + Flink/Spark, ClickHouse | Decouple from redirect path; process in batches |
| Cleanup Service | Removes expired URLs | Cron job / background worker | Runs during low-traffic windows; lazy deletion on access |
| Abuse Prevention | Rate limiting, malware scanning | Redis (token bucket), Google Safe Browsing | Per-IP and per-API-key rate limits |
| Monitoring | System health, latency tracking | Prometheus + Grafana, Datadog | Alert on p99 redirect latency > 100ms |
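The abuse-prevention component above names a token bucket; the core algorithm can be sketched in-process like this (a real deployment would typically keep the bucket state in Redis, keyed per IP or per API key, so all app servers share limits — that wiring is omitted here):

```python
import time

class TokenBucket:
    """Allows `rate` requests/sec on average, with bursts up to `capacity`."""
    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=10, capacity=20)  # illustrative limits: 10 req/s, burst 20
allowed = [bucket.allow() for _ in range(25)]
```

In a back-pressure path, a rejected request would map to HTTP 429 at the API gateway.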

Decision Tree

START
|-- Expected write volume?
|   |-- <1K URLs/day (hobby/internal)?
|   |   |-- Use single PostgreSQL instance + in-memory cache
|   |   |-- Auto-increment ID + base62 encode
|   |   +-- Single server is sufficient
|   |-- 1K-100K URLs/day (startup)?
|   |   |-- Use PostgreSQL with read replicas + Redis cache
|   |   |-- Snowflake-style ID generator (single node)
|   |   +-- 2-4 app server instances behind load balancer
|   |-- 100K-10M URLs/day (mid-scale)?
|   |   |-- Use Cassandra/DynamoDB + Redis Cluster
|   |   |-- Distributed ID generator (Zookeeper counter ranges)
|   |   |-- Separate read/write services
|   |   +-- CDN for edge caching of popular redirects
|   +-- >10M URLs/day (Bitly-scale)?
|       |-- Use Cassandra with multi-DC replication + Redis Cluster
|       |-- Pre-generated key pools (KGS) with range allocation
|       |-- Geo-distributed deployment
|       |-- Kafka analytics pipeline with ClickHouse
|       +-- Dedicated abuse prevention and rate limiting layer
|
|-- Need analytics?
|   |-- YES -> Use HTTP 302 + async Kafka pipeline
|   +-- NO  -> Use HTTP 301 for better client-side caching
|
+-- Need custom aliases?
    |-- YES -> Separate alias table; validate uniqueness + reserved words
    +-- NO  -> Auto-generated base62 codes only

Step-by-Step Guide

1. Define capacity requirements

Estimate write and read QPS, storage, and bandwidth based on expected traffic. [src1]

Given: 100M URLs created per month
Write QPS: 100M / (30 * 24 * 3600) = ~40 writes/sec
Read QPS:  40 * 100 = ~4,000 reads/sec (100:1 ratio)
Peak:      10x average = 400 writes/sec, 40,000 reads/sec

Storage per URL: ~500 bytes (short_code + long_url + metadata)
Annual storage:  100M * 12 * 500 bytes = ~600 GB/year
5-year storage:  ~3 TB (before replication)

Cache (80-20 rule): 20% of daily reads * avg URL size
  = 0.2 * (4000 * 86400) * 500 bytes = ~34 GB

Verify: Confirm your peak QPS estimate accounts for viral spikes (10-50x normal traffic).
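The envelope math above is easy to script so the estimates update when the assumptions change; a sketch using the same inputs (100M URLs/month, 100:1 read ratio, 500 bytes/URL, 10x peak):

```python
urls_per_month = 100_000_000
read_ratio = 100          # reads per write
bytes_per_url = 500
peak_multiplier = 10

seconds_per_month = 30 * 24 * 3600
write_qps = urls_per_month / seconds_per_month
read_qps = write_qps * read_ratio

annual_storage_gb = urls_per_month * 12 * bytes_per_url / 1e9
# 80-20 rule: cache the hot 20% of a day's reads
cache_gb = 0.2 * read_qps * 86_400 * bytes_per_url / 1e9

print(f"write QPS ~{write_qps:.0f}, read QPS ~{read_qps:.0f}")
print(f"peak: {write_qps * peak_multiplier:.0f} w/s, {read_qps * peak_multiplier:.0f} r/s")
print(f"storage ~{annual_storage_gb:.0f} GB/yr, cache ~{cache_gb:.0f} GB")
```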

2. Design the database schema

Choose a key-value-friendly schema optimized for short_code lookups. [src2]

-- Primary URL mappings (or use NoSQL equivalent)
CREATE TABLE url_mappings (
    short_code  VARCHAR(7) PRIMARY KEY,
    long_url    TEXT NOT NULL,
    created_at  TIMESTAMP DEFAULT NOW(),
    expires_at  TIMESTAMP,
    user_id     BIGINT,
    click_count BIGINT DEFAULT 0
);

-- Index for deduplication (optional). Hash the URL rather than indexing
-- the raw TEXT: very long values can exceed btree index entry limits.
CREATE INDEX idx_long_url_md5 ON url_mappings (md5(long_url));

-- Analytics events (or use Kafka -> ClickHouse)
CREATE TABLE click_events (
    event_id    BIGINT PRIMARY KEY,
    short_code  VARCHAR(7),
    clicked_at  TIMESTAMP,
    referrer    TEXT,
    user_agent  TEXT,
    country     VARCHAR(2),
    ip_hash     VARCHAR(64)
);

Verify: Ensure short_code is the primary key / partition key for O(1) lookups.
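The expires_at column pairs with the lazy-deletion strategy from the Quick Reference: check expiry at lookup time and purge on access, so the background cleanup job only sweeps stragglers. A sketch with a dict standing in for the url_mappings table (the store and helper names are illustrative):

```python
from datetime import datetime, timedelta, timezone

# In-memory stand-in for url_mappings: code -> (long_url, expires_at)
store = {}

def insert(code, long_url, ttl_days=None):
    expires = (datetime.now(timezone.utc) + timedelta(days=ttl_days)
               if ttl_days is not None else None)
    store[code] = (long_url, expires)

def lookup(code):
    """Return the long URL, lazily deleting the row if it has expired."""
    row = store.get(code)
    if row is None:
        return None
    long_url, expires = row
    if expires is not None and expires <= datetime.now(timezone.utc):
        del store[code]   # lazy deletion on access
        return None
    return long_url

insert("abc123", "https://example.com/page", ttl_days=30)
insert("old42", "https://example.com/gone", ttl_days=-1)  # already expired
```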

3. Implement short code generation

Use a unique ID generator + base62 encoding for collision-free codes. [src7]

import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

def base62_encode(num: int) -> str:
    if num == 0:
        return ALPHABET[0]
    result = []
    while num > 0:
        num, remainder = divmod(num, 62)
        result.append(ALPHABET[remainder])
    return ''.join(reversed(result))

# Example: unique_id=123456789 -> short_code="8m0Kx"
print(base62_encode(123456789))

Verify: base62_decode(base62_encode(n)) == n for any positive integer n.
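The verify step references base62_decode, the inverse of the encoder above; a sketch using the same alphabet:

```python
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}

def base62_decode(code: str) -> int:
    """Inverse of base62_encode: accumulate digits most-significant first."""
    num = 0
    for ch in code:
        num = num * 62 + INDEX[ch]
    return num

print(base62_decode("8m0Kx"))  # 123456789
```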

4. Set up caching with Redis

Cache frequently accessed URL mappings to serve redirects from memory. [src5]

import redis

r = redis.Redis(host='redis-cluster', port=6379, decode_responses=True)
CACHE_TTL = 3600  # 1 hour

def get_long_url(short_code: str) -> str | None:
    cached = r.get(f"url:{short_code}")
    if cached:
        return cached
    long_url = db_lookup(short_code)
    if long_url:
        r.setex(f"url:{short_code}", CACHE_TTL, long_url)
    return long_url

Verify: Monitor cache hit ratio — target >90%. Compute it as keyspace_hits / (keyspace_hits + keyspace_misses) from redis-cli INFO stats.
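Redis reports raw hit/miss counters rather than a ratio, so the ratio has to be derived; a sketch that parses INFO stats output (the counter field names match real Redis; the sample text is fabricated for illustration):

```python
def hit_ratio(info_stats: str) -> float:
    """Compute cache hit ratio from `redis-cli INFO stats` text output."""
    fields = dict(
        line.split(":", 1)
        for line in info_stats.splitlines()
        if ":" in line and not line.startswith("#")
    )
    hits = int(fields["keyspace_hits"])
    misses = int(fields["keyspace_misses"])
    return hits / (hits + misses) if hits + misses else 0.0

sample = "# Stats\nkeyspace_hits:940210\nkeyspace_misses:48123"
print(f"{hit_ratio(sample):.1%}")  # 95.1%
```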

5. Build the redirect service

Handle incoming short URL requests and issue HTTP 302 redirects. [src1]

from fastapi import FastAPI, HTTPException, Request
from fastapi.responses import RedirectResponse

app = FastAPI()

@app.get("/{short_code}")
async def redirect(short_code: str, request: Request):
    long_url = get_long_url(short_code)
    if not long_url:
        raise HTTPException(status_code=404, detail="Not found")
    publish_click_event(short_code, request)
    return RedirectResponse(url=long_url, status_code=302)

Verify: curl -I https://short.url/abc123 returns HTTP/1.1 302 with Location: header.

6. Add analytics pipeline

Decouple analytics from the redirect path using a message queue. [src1]

from datetime import datetime
from kafka import KafkaProducer
import json

producer = KafkaProducer(
    bootstrap_servers='kafka:9092',
    value_serializer=lambda v: json.dumps(v).encode('utf-8')
)

def publish_click_event(short_code, request):
    event = {
        "short_code": short_code,
        "timestamp": datetime.utcnow().isoformat(),
        "referrer": request.headers.get("referer", ""),
        "user_agent": request.headers.get("user-agent", ""),
        "ip_country": geoip_lookup(request.client.host),
    }
    producer.send("click-events", value=event)

Verify: Check Kafka consumer lag: kafka-consumer-groups.sh --describe --group analytics-consumer
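Downstream of Kafka, the consumer typically rolls events up in batches before writing to ClickHouse; a sketch of the aggregation step alone (the Kafka consumption loop and the ClickHouse INSERT are omitted, and the event shapes are illustrative):

```python
from collections import Counter

def aggregate_batch(events: list[dict]) -> dict[str, int]:
    """Roll a batch of click events up to per-short-code counts."""
    counts = Counter(e["short_code"] for e in events)
    # A real consumer would INSERT these rows, then commit Kafka offsets
    return dict(counts)

batch = [
    {"short_code": "abc123", "referrer": "twitter.com"},
    {"short_code": "abc123", "referrer": ""},
    {"short_code": "xyz789", "referrer": "news.ycombinator.com"},
]
print(aggregate_batch(batch))  # {'abc123': 2, 'xyz789': 1}
```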

Code Examples

Python: Base62 Encoding with Snowflake ID Generation

# Input:  None (generates unique IDs internally)
# Output: Collision-free 6-7 character short codes

import string, time, threading

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

def base62_encode(num: int) -> str:
    if num == 0:
        return ALPHABET[0]
    result = []
    while num > 0:
        num, remainder = divmod(num, 62)
        result.append(ALPHABET[remainder])
    return ''.join(reversed(result))

class SnowflakeIDGenerator:
    """64-bit IDs: timestamp(41) + node_id(10) + sequence(12)"""
    def __init__(self, node_id: int):
        self.node_id = node_id & 0x3FF
        self.sequence = 0
        self.last_ts = 0
        self.lock = threading.Lock()
        self.epoch = 1577836800000  # 2020-01-01

    def next_id(self) -> int:
        with self.lock:
            ts = int(time.time() * 1000) - self.epoch
            if ts == self.last_ts:
                self.sequence = (self.sequence + 1) & 0xFFF
                if self.sequence == 0:
                    while ts <= self.last_ts:
                        ts = int(time.time() * 1000) - self.epoch
            else:
                self.sequence = 0
            self.last_ts = ts
            return (ts << 22) | (self.node_id << 12) | self.sequence

gen = SnowflakeIDGenerator(node_id=1)
short_code = base62_encode(gen.next_id())

Node.js: Redirect Service with Redis Caching

// Input:  HTTP GET request with short code in URL path
// Output: HTTP 302 redirect to original long URL

const express = require('express');
const Redis = require('ioredis');
const { Kafka } = require('kafkajs');

const app = express();
const redis = new Redis({ host: 'redis-cluster', port: 6379 });
const kafka = new Kafka({ brokers: ['kafka:9092'] });
const producer = kafka.producer();
const CACHE_TTL = 3600;

app.get('/:shortCode', async (req, res) => {
    const { shortCode } = req.params;
    let longUrl = await redis.get(`url:${shortCode}`);
    if (!longUrl) {
        // db: a pg Pool (setup not shown); query() resolves to { rows: [...] }
        const { rows } = await db.query(
            'SELECT long_url FROM url_mappings WHERE short_code = $1',
            [shortCode]
        );
        if (rows.length === 0) return res.status(404).json({ error: 'Not found' });
        longUrl = rows[0].long_url;
        await redis.setex(`url:${shortCode}`, CACHE_TTL, longUrl);
    }
    producer.send({
        topic: 'click-events',
        messages: [{ value: JSON.stringify({
            short_code: shortCode,
            timestamp: new Date().toISOString(),
            referrer: req.headers.referer || '',
            user_agent: req.headers['user-agent'] || '',
        })}],
    }).catch(err => console.error('Kafka error:', err));
    return res.redirect(302, longUrl);
});

producer.connect().then(() => app.listen(3000));

Anti-Patterns

Wrong: Using MD5/SHA hash of the URL and truncating

# BAD -- truncated hashes cause collisions at scale
import hashlib, base64

def shorten(long_url):
    md5 = hashlib.md5(long_url.encode()).digest()
    return base64.urlsafe_b64encode(md5[:6]).decode()[:7]
    # Truncating to 7 base64 chars leaves ~42 bits; by the birthday
    # bound, a collision is near-certain well before ~10M URLs

Correct: Use a unique ID generator + base62

# GOOD -- unique IDs guarantee zero collisions
def shorten(long_url, id_generator, db):
    unique_id = id_generator.next_id()
    short_code = base62_encode(unique_id)
    db.insert(short_code, long_url)
    return short_code

Wrong: Checking the database on every redirect

# BAD -- every redirect hits the database
@app.get("/{code}")
async def redirect(code: str):
    row = await db.query("SELECT long_url FROM urls WHERE code = $1", [code])
    return RedirectResponse(url=row.long_url, status_code=302)
    # At 40K reads/sec, the database becomes the bottleneck

Correct: Cache-aside pattern with Redis

# GOOD -- cache handles >90% of reads
@app.get("/{code}")
async def redirect(code: str):
    long_url = await redis.get(f"url:{code}")
    if not long_url:
        row = await db.query("...", [code])
        if not row:
            raise HTTPException(status_code=404, detail="Not found")
        long_url = row.long_url
        await redis.setex(f"url:{code}", 3600, long_url)
    return RedirectResponse(url=long_url, status_code=302)

Wrong: Using HTTP 301 when you need analytics

# BAD -- 301 means browser caches the redirect permanently
return RedirectResponse(url=long_url, status_code=301)
# Browser goes directly to long URL on subsequent clicks
# Your analytics data is incomplete and misleading

Correct: Use HTTP 302 for trackable redirects

# GOOD -- 302 means browser asks your server every time
return RedirectResponse(url=long_url, status_code=302)
# Every click passes through your server for full analytics
# Offset higher load with CDN caching (short TTL)
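To let a CDN absorb the extra 302 traffic, attach a short-TTL Cache-Control header to the redirect response. A framework-agnostic sketch of the headers involved (the helper name and the 300-second TTL are illustrative assumptions; 300s sits at the low end of the 5-60 min range in the Quick Reference):

```python
def redirect_headers(long_url: str, ttl_seconds: int = 300) -> tuple[int, dict]:
    """Build a 302 response that a CDN may cache briefly.

    Edge caching trims origin load while keeping the redirect trackable:
    once the edge entry expires, clicks flow through the origin again.
    """
    return 302, {
        "Location": long_url,
        "Cache-Control": f"public, max-age={ttl_seconds}",
    }

status, headers = redirect_headers("https://example.com/page")
print(status, headers["Cache-Control"])  # 302 public, max-age=300
```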

Wrong: Synchronous analytics in the redirect path

# BAD -- analytics write blocks the redirect response
@app.get("/{code}")
async def redirect(code: str):
    long_url = get_long_url(code)
    await db.execute("INSERT INTO clicks ...", [code, datetime.now()])
    await db.execute("UPDATE urls SET click_count = click_count + 1 ...", [code])
    return RedirectResponse(url=long_url, status_code=302)

Correct: Async analytics via message queue

# GOOD -- fire-and-forget to Kafka, redirect returns immediately
@app.get("/{code}")
async def redirect(code: str):
    long_url = get_long_url(code)
    kafka_producer.send("click-events", {"code": code, "ts": time.time()})
    return RedirectResponse(url=long_url, status_code=302)

Common Pitfalls

Diagnostic Commands

# Check Redis cache hit rate
redis-cli INFO stats | grep -E "keyspace_hits|keyspace_misses"

# Spot-check redirect latency (track p50/p95/p99 in your monitoring stack)
curl -w "time_total: %{time_total}s\n" -o /dev/null -s https://short.url/abc123

# Check Kafka consumer lag for analytics pipeline
kafka-consumer-groups.sh --bootstrap-server kafka:9092 \
  --describe --group analytics-consumer

# Verify database connection pool usage
SELECT count(*) FROM pg_stat_activity WHERE state = 'active';

# Check URL mapping count and storage size
SELECT count(*) AS total_urls,
       pg_size_pretty(pg_total_relation_size('url_mappings')) AS storage
FROM url_mappings;

When to Use / When Not to Use

| Use When | Don't Use When | Use Instead |
|---|---|---|
| Sharing long URLs in space-constrained contexts (SMS, tweets, QR codes) | URLs are already short or internal-only | Direct links with no shortening |
| You need click analytics (geo, referrer, device, time) | Privacy regulations prohibit redirect tracking | Direct links with server-side analytics |
| Marketing campaigns requiring branded short domains | You need content hosting/sharing (not just redirects) | Pastebin or object storage service |
| QR code generation (shorter URLs = simpler QR patterns) | URL mappings change frequently (write-heavy) | Feature flag service or API gateway routing |
| A/B testing via dynamic redirect targets | You only need vanity domains without analytics | DNS CNAME record or reverse proxy |

Important Caveats

Related Units