URL Shortener System Design at Scale

Type: Software Reference Confidence: 0.93 Sources: 7 Verified: 2026-02-23 Freshness: 2026-02-23

TL;DR

Constraints

Quick Reference

| Component | Role | Technology Options | Scaling Strategy |
|---|---|---|---|
| Load Balancer | Distributes traffic across app servers | AWS ALB, Nginx, HAProxy | Geo-DNS routing + multiple LBs per region |
| API Gateway | Rate limiting, auth, request routing | Kong, AWS API Gateway | Horizontal scaling behind LB |
| URL Creation Service | Accepts long URLs, generates short codes | Go, Java, Node.js | Stateless; scale horizontally |
| Redirect Service | Resolves short code to long URL, returns 302 | Go, Rust (low latency) | Separate from write path; scale independently |
| ID Generator | Produces globally unique IDs for base62 encoding | Snowflake, Zookeeper counter ranges | Pre-allocate ID ranges per node to avoid coordination |
| Cache Layer | Stores hot short-to-long URL mappings | Redis Cluster, Memcached | LRU eviction; 20% of URLs serve 80% of traffic |
| Primary Database | Persistent URL mapping storage | Cassandra, DynamoDB, PostgreSQL | Hash-based sharding on short_code |
| CDN / Edge Cache | Caches redirects at edge locations | Cloudflare, CloudFront | Cache 302 responses with short TTL (5-60 min) |
| Analytics Pipeline | Captures click events asynchronously | Kafka + Flink/Spark, ClickHouse | Decouple from redirect path; process in batches |
| Cleanup Service | Removes expired URLs | Cron job / background worker | Runs during low-traffic windows; lazy deletion on access |
| Abuse Prevention | Rate limiting, malware scanning | Redis (token bucket), Google Safe Browsing | Per-IP and per-API-key rate limits |
| Monitoring | System health, latency tracking | Prometheus + Grafana, Datadog | Alert on p99 redirect latency > 100ms |
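The abuse-prevention component above names a token bucket; the core algorithm can be sketched in-process like this (a real deployment would typically keep the bucket state in Redis, keyed per IP or per API key, so all app servers share limits — that wiring is omitted here):

```python
import time

class TokenBucket:
    """Allows `rate` requests/sec on average, with bursts up to `capacity`."""
    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=10, capacity=20)  # illustrative limits: 10 req/s, burst 20
allowed = [bucket.allow() for _ in range(25)]
```

In a back-pressure path, a rejected request would map to HTTP 429 at the API gateway.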

Decision Tree

START
|-- Expected write volume?
|   |-- <1K URLs/day (hobby/internal)?
|   |   |-- Use single PostgreSQL instance + in-memory cache
|   |   |-- Auto-increment ID + base62 encode
|   |   +-- Single server is sufficient
|   |-- 1K-100K URLs/day (startup)?
|   |   |-- Use PostgreSQL with read replicas + Redis cache
|   |   |-- Snowflake-style ID generator (single node)
|   |   +-- 2-4 app server instances behind load balancer
|   |-- 100K-10M URLs/day (mid-scale)?
|   |   |-- Use Cassandra/DynamoDB + Redis Cluster
|   |   |-- Distributed ID generator (Zookeeper counter ranges)
|   |   |-- Separate read/write services
|   |   +-- CDN for edge caching of popular redirects
|   +-- >10M URLs/day (Bitly-scale)?
|       |-- Use Cassandra with multi-DC replication + Redis Cluster
|       |-- Pre-generated key pools (KGS) with range allocation
|       |-- Geo-distributed deployment
|       |-- Kafka analytics pipeline with ClickHouse
|       +-- Dedicated abuse prevention and rate limiting layer
|
|-- Need analytics?
|   |-- YES -> Use HTTP 302 + async Kafka pipeline
|   +-- NO  -> Use HTTP 301 for better client-side caching
|
+-- Need custom aliases?
    |-- YES -> Separate alias table; validate uniqueness + reserved words
    +-- NO  -> Auto-generated base62 codes only

Step-by-Step Guide

1. Define capacity requirements

Estimate write and read QPS, storage, and bandwidth based on expected traffic. [src1]

Given: 100M URLs created per month
Write QPS: 100M / (30 * 24 * 3600) = ~40 writes/sec
Read QPS:  40 * 100 = ~4,000 reads/sec (100:1 ratio)
Peak:      10x average = 400 writes/sec, 40,000 reads/sec

Storage per URL: ~500 bytes (short_code + long_url + metadata)
Annual storage:  100M * 12 * 500 bytes = ~600 GB/year
5-year storage:  ~3 TB (before replication)

Cache (80-20 rule): 20% of daily reads * avg URL size
  = 0.2 * (4000 * 86400) * 500 bytes = ~34 GB

Verify: Confirm your peak QPS estimate accounts for viral spikes (10-50x normal traffic).
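The envelope math above is easy to script so the estimates update when the assumptions change; a sketch using the same inputs (100M URLs/month, 100:1 read ratio, 500 bytes/URL, 10x peak):

```python
urls_per_month = 100_000_000
read_ratio = 100          # reads per write
bytes_per_url = 500
peak_multiplier = 10

seconds_per_month = 30 * 24 * 3600
write_qps = urls_per_month / seconds_per_month
read_qps = write_qps * read_ratio

annual_storage_gb = urls_per_month * 12 * bytes_per_url / 1e9
# 80-20 rule: cache the hot 20% of a day's reads
cache_gb = 0.2 * read_qps * 86_400 * bytes_per_url / 1e9

print(f"write QPS ~{write_qps:.0f}, read QPS ~{read_qps:.0f}")
print(f"peak: {write_qps * peak_multiplier:.0f} w/s, {read_qps * peak_multiplier:.0f} r/s")
print(f"storage ~{annual_storage_gb:.0f} GB/yr, cache ~{cache_gb:.0f} GB")
```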

2. Design the database schema

Choose a key-value-friendly schema optimized for short_code lookups. [src2]

-- Primary URL mappings (or use NoSQL equivalent)
CREATE TABLE url_mappings (
    short_code  VARCHAR(7) PRIMARY KEY,
    long_url    TEXT NOT NULL,
    created_at  TIMESTAMP DEFAULT NOW(),
    expires_at  TIMESTAMP,
    user_id     BIGINT,
    click_count BIGINT DEFAULT 0
);

-- Index for deduplication (optional). Hash the URL rather than indexing
-- the raw TEXT: very long values can exceed btree index entry limits.
CREATE INDEX idx_long_url_md5 ON url_mappings (md5(long_url));

-- Analytics events (or use Kafka -> ClickHouse)
CREATE TABLE click_events (
    event_id    BIGINT PRIMARY KEY,
    short_code  VARCHAR(7),
    clicked_at  TIMESTAMP,
    referrer    TEXT,
    user_agent  TEXT,
    country     VARCHAR(2),
    ip_hash     VARCHAR(64)
);

Verify: Ensure short_code is the primary key / partition key for O(1) lookups.
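The expires_at column pairs with the lazy-deletion strategy from the Quick Reference: check expiry at lookup time and purge on access, so the background cleanup job only sweeps stragglers. A sketch with a dict standing in for the url_mappings table (the store and helper names are illustrative):

```python
from datetime import datetime, timedelta, timezone

# In-memory stand-in for url_mappings: code -> (long_url, expires_at)
store = {}

def insert(code, long_url, ttl_days=None):
    expires = (datetime.now(timezone.utc) + timedelta(days=ttl_days)
               if ttl_days is not None else None)
    store[code] = (long_url, expires)

def lookup(code):
    """Return the long URL, lazily deleting the row if it has expired."""
    row = store.get(code)
    if row is None:
        return None
    long_url, expires = row
    if expires is not None and expires <= datetime.now(timezone.utc):
        del store[code]   # lazy deletion on access
        return None
    return long_url

insert("abc123", "https://example.com/page", ttl_days=30)
insert("old42", "https://example.com/gone", ttl_days=-1)  # already expired
```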

3. Implement short code generation

Use a unique ID generator + base62 encoding for collision-free codes. [src7]

import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

def base62_encode(num: int) -> str:
    if num == 0:
        return ALPHABET[0]
    result = []
    while num > 0:
        num, remainder = divmod(num, 62)
        result.append(ALPHABET[remainder])
    return ''.join(reversed(result))

# Example: unique_id=123456789 -> short_code="8m0Kx"
print(base62_encode(123456789))

Verify: base62_decode(base62_encode(n)) == n for any positive integer n.
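The verify step references base62_decode, the inverse of the encoder above; a sketch using the same alphabet:

```python
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}

def base62_decode(code: str) -> int:
    """Inverse of base62_encode: accumulate digits most-significant first."""
    num = 0
    for ch in code:
        num = num * 62 + INDEX[ch]
    return num

print(base62_decode("8m0Kx"))  # 123456789
```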

4. Set up caching with Redis

Cache frequently accessed URL mappings to serve redirects from memory. [src5]

import redis

r = redis.Redis(host='redis-cluster', port=6379, decode_responses=True)
CACHE_TTL = 3600  # 1 hour

def get_long_url(short_code: str) -> str | None:
    cached = r.get(f"url:{short_code}")
    if cached:
        return cached
    long_url = db_lookup(short_code)
    if long_url:
        r.setex(f"url:{short_code}", CACHE_TTL, long_url)
    return long_url

Verify: Monitor cache hit ratio — target >90%. Compute it as keyspace_hits / (keyspace_hits + keyspace_misses) from redis-cli INFO stats.
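Redis reports raw hit/miss counters rather than a ratio, so the ratio has to be derived; a sketch that parses INFO stats output (the counter field names match real Redis; the sample text is fabricated for illustration):

```python
def hit_ratio(info_stats: str) -> float:
    """Compute cache hit ratio from `redis-cli INFO stats` text output."""
    fields = dict(
        line.split(":", 1)
        for line in info_stats.splitlines()
        if ":" in line and not line.startswith("#")
    )
    hits = int(fields["keyspace_hits"])
    misses = int(fields["keyspace_misses"])
    return hits / (hits + misses) if hits + misses else 0.0

sample = "# Stats\nkeyspace_hits:940210\nkeyspace_misses:48123"
print(f"{hit_ratio(sample):.1%}")  # 95.1%
```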

5. Build the redirect service

Handle incoming short URL requests and issue HTTP 302 redirects. [src1]

from fastapi import FastAPI, HTTPException, Request
from fastapi.responses import RedirectResponse

app = FastAPI()

@app.get("/{short_code}")
async def redirect(short_code: str, request: Request):
    long_url = get_long_url(short_code)
    if not long_url:
        raise HTTPException(status_code=404, detail="Not found")
    publish_click_event(short_code, request)
    return RedirectResponse(url=long_url, status_code=302)

Verify: curl -I https://short.url/abc123 returns HTTP/1.1 302 with Location: header.

6. Add analytics pipeline

Decouple analytics from the redirect path using a message queue. [src1]

from datetime import datetime
from kafka import KafkaProducer
import json

producer = KafkaProducer(
    bootstrap_servers='kafka:9092',
    value_serializer=lambda v: json.dumps(v).encode('utf-8')
)

def publish_click_event(short_code, request):
    event = {
        "short_code": short_code,
        "timestamp": datetime.utcnow().isoformat(),
        "referrer": request.headers.get("referer", ""),
        "user_agent": request.headers.get("user-agent", ""),
        "ip_country": geoip_lookup(request.client.host),
    }
    producer.send("click-events", value=event)

Verify: Check Kafka consumer lag: kafka-consumer-groups.sh --describe --group analytics-consumer
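Downstream of Kafka, the consumer typically rolls events up in batches before writing to ClickHouse; a sketch of the aggregation step alone (the Kafka consumption loop and the ClickHouse INSERT are omitted, and the event shapes are illustrative):

```python
from collections import Counter

def aggregate_batch(events: list[dict]) -> dict[str, int]:
    """Roll a batch of click events up to per-short-code counts."""
    counts = Counter(e["short_code"] for e in events)
    # A real consumer would INSERT these rows, then commit Kafka offsets
    return dict(counts)

batch = [
    {"short_code": "abc123", "referrer": "twitter.com"},
    {"short_code": "abc123", "referrer": ""},
    {"short_code": "xyz789", "referrer": "news.ycombinator.com"},
]
print(aggregate_batch(batch))  # {'abc123': 2, 'xyz789': 1}
```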

Code Examples

Python: Base62 Encoding with Snowflake ID Generation

# Input:  None (generates unique IDs internally)
# Output: Collision-free 6-7 character short codes

import string, time, threading

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

def base62_encode(num: int) -> str:
    if num == 0:
        return ALPHABET[0]
    result = []
    while num > 0:
        num, remainder = divmod(num, 62)
        result.append(ALPHABET[remainder])
    return ''.join(reversed(result))

class SnowflakeIDGenerator:
    """64-bit IDs: timestamp(41) + node_id(10) + sequence(12)"""
    def __init__(self, node_id: int):
        self.node_id = node_id & 0x3FF
        self.sequence = 0
        self.last_ts = 0
        self.lock = threading.Lock()
        self.epoch = 1577836800000  # 2020-01-01

    def next_id(self) -> int:
        with self.lock:
            ts = int(time.time() * 1000) - self.epoch
            if ts == self.last_ts:
                self.sequence = (self.sequence + 1) & 0xFFF
                if self.sequence == 0:
                    while ts <= self.last_ts:
                        ts = int(time.time() * 1000) - self.epoch
            else:
                self.sequence = 0
            self.last_ts = ts
            return (ts << 22) | (self.node_id << 12) | self.sequence

gen = SnowflakeIDGenerator(node_id=1)
short_code = base62_encode(gen.next_id())

Node.js: Redirect Service with Redis Caching

// Input:  HTTP GET request with short code in URL path
// Output: HTTP 302 redirect to original long URL

const express = require('express');
const Redis = require('ioredis');
const { Kafka } = require('kafkajs');

const app = express();
const redis = new Redis({ host: 'redis-cluster', port: 6379 });
const kafka = new Kafka({ brokers: ['kafka:9092'] });
const producer = kafka.producer();
const CACHE_TTL = 3600;

app.get('/:shortCode', async (req, res) => {
    const { shortCode } = req.params;
    let longUrl = await redis.get(`url:${shortCode}`);
    if (!longUrl) {
        // db: a pg Pool (setup not shown); query() resolves to { rows: [...] }
        const { rows } = await db.query(
            'SELECT long_url FROM url_mappings WHERE short_code = $1',
            [shortCode]
        );
        if (rows.length === 0) return res.status(404).json({ error: 'Not found' });
        longUrl = rows[0].long_url;
        await redis.setex(`url:${shortCode}`, CACHE_TTL, longUrl);
    }
    producer.send({
        topic: 'click-events',
        messages: [{ value: JSON.stringify({
            short_code: shortCode,
            timestamp: new Date().toISOString(),
            referrer: req.headers.referer || '',
            user_agent: req.headers['user-agent'] || '',
        })}],
    }).catch(err => console.error('Kafka error:', err));
    return res.redirect(302, longUrl);
});

producer.connect().then(() => app.listen(3000));

Anti-Patterns

Wrong: Using MD5/SHA hash of the URL and truncating

# BAD -- truncated hashes cause collisions at scale
import hashlib, base64

def shorten(long_url):
    md5 = hashlib.md5(long_url.encode()).digest()
    return base64.urlsafe_b64encode(md5[:6]).decode()[:7]
    # Truncating to 7 base64 chars leaves ~42 bits; by the birthday
    # bound, a collision is near-certain well before ~10M URLs

Correct: Use a unique ID generator + base62

# GOOD -- unique IDs guarantee zero collisions
def shorten(long_url, id_generator, db):
    unique_id = id_generator.next_id()
    short_code = base62_encode(unique_id)
    db.insert(short_code, long_url)
    return short_code

Wrong: Checking the database on every redirect

# BAD -- every redirect hits the database
@app.get("/{code}")
async def redirect(code: str):
    row = await db.query("SELECT long_url FROM urls WHERE code = $1", [code])
    return RedirectResponse(url=row.long_url, status_code=302)
    # At 40K reads/sec, the database becomes the bottleneck

Correct: Cache-aside pattern with Redis

# GOOD -- cache handles >90% of reads
@app.get("/{code}")
async def redirect(code: str):
    long_url = await redis.get(f"url:{code}")
    if not long_url:
        row = await db.query("...", [code])
        if not row:
            raise HTTPException(status_code=404, detail="Not found")
        long_url = row.long_url
        await redis.setex(f"url:{code}", 3600, long_url)
    return RedirectResponse(url=long_url, status_code=302)

Wrong: Using HTTP 301 when you need analytics

# BAD -- 301 means browser caches the redirect permanently
return RedirectResponse(url=long_url, status_code=301)
# Browser goes directly to long URL on subsequent clicks
# Your analytics data is incomplete and misleading

Correct: Use HTTP 302 for trackable redirects

# GOOD -- 302 means browser asks your server every time
return RedirectResponse(url=long_url, status_code=302)
# Every click passes through your server for full analytics
# Offset higher load with CDN caching (short TTL)
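To let a CDN absorb the extra 302 traffic, attach a short-TTL Cache-Control header to the redirect response. A framework-agnostic sketch of the headers involved (the helper name and the 300-second TTL are illustrative assumptions; 300s sits at the low end of the 5-60 min range in the Quick Reference):

```python
def redirect_headers(long_url: str, ttl_seconds: int = 300) -> tuple[int, dict]:
    """Build a 302 response that a CDN may cache briefly.

    Edge caching trims origin load while keeping the redirect trackable:
    once the edge entry expires, clicks flow through the origin again.
    """
    return 302, {
        "Location": long_url,
        "Cache-Control": f"public, max-age={ttl_seconds}",
    }

status, headers = redirect_headers("https://example.com/page")
print(status, headers["Cache-Control"])  # 302 public, max-age=300
```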

Wrong: Synchronous analytics in the redirect path

# BAD -- analytics write blocks the redirect response
@app.get("/{code}")
async def redirect(code: str):
    long_url = get_long_url(code)
    await db.execute("INSERT INTO clicks ...", [code, datetime.now()])
    await db.execute("UPDATE urls SET click_count = click_count + 1 ...", [code])
    return RedirectResponse(url=long_url, status_code=302)

Correct: Async analytics via message queue

# GOOD -- fire-and-forget to Kafka, redirect returns immediately
@app.get("/{code}")
async def redirect(code: str):
    long_url = get_long_url(code)
    kafka_producer.send("click-events", {"code": code, "ts": time.time()})
    return RedirectResponse(url=long_url, status_code=302)

Common Pitfalls

Diagnostic Commands

# Check Redis cache hit rate
redis-cli INFO stats | grep -E "keyspace_hits|keyspace_misses"

# Spot-check redirect latency (track p50/p95/p99 in your monitoring stack)
curl -w "time_total: %{time_total}s\n" -o /dev/null -s https://short.url/abc123

# Check Kafka consumer lag for analytics pipeline
kafka-consumer-groups.sh --bootstrap-server kafka:9092 \
  --describe --group analytics-consumer

# Verify database connection pool usage
SELECT count(*) FROM pg_stat_activity WHERE state = 'active';

# Check URL mapping count and storage size
SELECT count(*) AS total_urls,
       pg_size_pretty(pg_total_relation_size('url_mappings')) AS storage
FROM url_mappings;

When to Use / When Not to Use

| Use When | Don't Use When | Use Instead |
|---|---|---|
| Sharing long URLs in space-constrained contexts (SMS, tweets, QR codes) | URLs are already short or internal-only | Direct links with no shortening |
| You need click analytics (geo, referrer, device, time) | Privacy regulations prohibit redirect tracking | Direct links with server-side analytics |
| Marketing campaigns requiring branded short domains | You need content hosting/sharing (not just redirects) | Pastebin or object storage service |
| QR code generation (shorter URLs = simpler QR patterns) | URL mappings change frequently (write-heavy) | Feature flag service or API gateway routing |
| A/B testing via dynamic redirect targets | You only need vanity domains without analytics | DNS CNAME record or reverse proxy |

Important Caveats

Related Units