How to Design an API Gateway

Type: Software Reference | Confidence: 0.93 | Sources: 7 | Verified: 2026-02-23 | Freshness: quarterly

Quick Reference

| Component | Role | Technology Options | Scaling Strategy |
|---|---|---|---|
| Edge Load Balancer | Distributes traffic across gateway instances | AWS ALB/NLB, HAProxy, Cloudflare | Horizontal: add LB nodes or use managed |
| API Gateway | Routes requests, enforces policies | Kong, NGINX, Envoy, AWS API Gateway, Apigee | Horizontal: stateless instances behind LB |
| Auth Layer | Validates JWT/OAuth2 tokens, API keys | Gateway plugin (Kong JWT, NGINX auth_request), Keycloak, Auth0 | Offload to external IdP; cache tokens at gateway |
| Rate Limiter | Throttles requests per consumer/IP | Gateway plugin (Kong rate-limiting), Redis-backed sliding window | Shared Redis cluster for distributed counting |
| Circuit Breaker | Prevents cascading failures to unhealthy services | Envoy outlier detection, Kong circuit-breaker, Hystrix | Per-service thresholds; half-open probes |
| Request Router | Maps URL paths/headers to backend services | Path-based routing, header-based, weighted canary | Config reload (NGINX) or dynamic API (Kong) |
| Response Cache | Caches repeated GET responses | Gateway cache plugin, Varnish, CDN edge cache | TTL-based eviction; cache invalidation on writes |
| Protocol Translator | Converts between REST, gRPC, WebSocket | Envoy gRPC-JSON transcoding, Kong gRPC-gateway | Stateless: scales with gateway instances |
| Observability Collector | Aggregates logs, metrics, traces | OpenTelemetry, Prometheus, Jaeger, Datadog | Sidecar or gateway plugin; async export |
| Service Registry | Discovers backend service addresses | Consul, Kubernetes DNS, AWS Cloud Map, etcd | Eventual consistency; health check intervals |
| Config Store | Stores gateway routing rules and policies | Kong DB (Postgres), etcd, Kubernetes CRDs, S3 | DB replication or declarative GitOps |
| TLS Terminator | Handles HTTPS certificates and handshakes | Let's Encrypt + certbot, AWS ACM, NGINX ssl module | Offload to LB or gateway edge; auto-renewal |

Decision Tree

START
├── Need serverless / minimal ops?
│   ├── YES → Managed gateway (AWS API Gateway, GCP Apigee, Azure API Management)
│   └── NO ↓
├── Running Kubernetes?
│   ├── YES → Kubernetes-native gateway ↓
│   │   ├── Need service mesh integration (Istio)?
│   │   │   ├── YES → Envoy Gateway / Istio Ingress Gateway
│   │   │   └── NO → Kong Ingress Controller or NGINX Ingress
│   └── NO ↓
├── Need plugin ecosystem + GUI admin?
│   ├── YES → Kong Gateway (open-source or Enterprise)
│   └── NO ↓
├── Need max raw performance + minimal footprint?
│   ├── YES → NGINX (config-file based, battle-tested)
│   └── NO ↓
├── Need gRPC-native + advanced observability?
│   ├── YES → Envoy Proxy (xDS API for dynamic config)
│   └── NO ↓
└── DEFAULT → Kong Gateway (best balance of features, ecosystem, and community support)

Managed vs Self-Hosted Comparison

| Feature | Managed (AWS/GCP/Azure) | Self-Hosted (Kong/NGINX/Envoy) |
|---|---|---|
| Setup time | Minutes | Hours to days |
| Operational overhead | None (vendor-managed) | High (patching, scaling, monitoring) |
| Customization | Limited to vendor features | Full control over plugins and config |
| Latency | Adds ~5-20 ms (extra network hop) | ~1-5 ms (co-located) |
| Cost at low volume | Pay-per-request ($3.50/M calls on AWS) | Fixed infra cost (potentially cheaper) |
| Cost at high volume | Expensive (>$1K/month beyond ~300M requests at $3.50/M) | Cheaper at scale (commodity infra) |
| Vendor lock-in | High | None |
| Multi-cloud | Vendor-specific | Portable across any infrastructure |
| gRPC/WebSocket support | Varies by vendor | Full native support |
| Plugin ecosystem | Vendor extensions only | Kong: 100+ plugins; Envoy: filters; NGINX: modules |
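
To make the cost rows concrete, here is a quick break-even sketch. The $3.50/M figure is the AWS rate quoted in the table; the $500/month self-hosted cost is a hypothetical placeholder, so substitute your own infra numbers.

```javascript
// Break-even sketch: managed pay-per-request vs. self-hosted fixed cost.
// $3.50/M requests is the AWS figure from the table above; the $500/month
// self-hosted infra cost is an assumed placeholder.
const MANAGED_COST_PER_MILLION = 3.5; // USD per 1M requests (AWS API Gateway)
const SELF_HOSTED_MONTHLY = 500;      // USD/month: instances + LB + ops (assumed)

function managedMonthlyCost(requestsPerMonth) {
  return (requestsPerMonth / 1e6) * MANAGED_COST_PER_MILLION;
}

// Requests/month at which self-hosting becomes cheaper than managed.
function breakEvenRequests() {
  return (SELF_HOSTED_MONTHLY / MANAGED_COST_PER_MILLION) * 1e6;
}

console.log(breakEvenRequests());               // ~142.9M requests/month
console.log(managedMonthlyCost(1_000_000_000)); // 3500 (USD at 1B requests/month)
```

Below the break-even point, managed pay-per-request pricing wins; above it, the fixed self-hosted cost amortizes in your favor.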

Step-by-Step Guide

1. Define your service topology

Map all backend services, their protocols, and traffic patterns. Identify which cross-cutting concerns (auth, rate limiting, logging) are currently duplicated across services. [src1]

# service-topology.yaml — document before building
services:
  - name: user-service
    protocol: REST
    port: 8001
    health_check: /health
    avg_rps: 500
    auth: JWT
  - name: order-service
    protocol: REST
    port: 8002
    health_check: /health
    avg_rps: 2000
    auth: JWT
  - name: payment-service
    protocol: gRPC
    port: 50051
    health_check: grpc_health_v1
    avg_rps: 300
    auth: mTLS

Verify: Count total services and protocols. With 3 or more services or more than one protocol, an API gateway is justified; below that, a simple reverse proxy is usually enough.
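
The verify step can be scripted. A minimal sketch that line-scans the topology file rather than fully parsing YAML, which is adequate for the flat structure shown above (the embedded sample mirrors that file):

```javascript
// Quick topology audit: count services and distinct protocols.
// Line-based scan, not a full YAML parser.
const topology = `
services:
  - name: user-service
    protocol: REST
  - name: order-service
    protocol: REST
  - name: payment-service
    protocol: gRPC
`;

function auditTopology(yamlText) {
  const services = (yamlText.match(/^\s*-\s*name:/gm) || []).length;
  const protocols = new Set(
    [...yamlText.matchAll(/^\s*protocol:\s*(\S+)/gm)].map(m => m[1])
  );
  return { services, protocols: [...protocols] };
}

const { services, protocols } = auditTopology(topology);
console.log(`${services} services, protocols: ${protocols.join(", ")}`);
// → "3 services, protocols: REST, gRPC"
if (services >= 3 || protocols.length > 1) {
  console.log("A gateway is justified.");
}
```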

2. Choose your gateway technology

Use the decision tree above to select managed vs self-hosted, then pick a specific implementation. [src2]

# For Kong (Docker quickstart)
docker network create kong-net
docker run -d --name kong-database \
  --network=kong-net \
  -e "POSTGRES_USER=kong" \
  -e "POSTGRES_DB=kong" \
  -e "POSTGRES_PASSWORD=kongpass" \
  postgres:16-alpine

docker run --rm --network=kong-net \
  -e "KONG_DATABASE=postgres" \
  -e "KONG_PG_HOST=kong-database" \
  -e "KONG_PG_PASSWORD=kongpass" \
  kong/kong-gateway:3.9 kong migrations bootstrap

docker run -d --name kong-gateway \
  --network=kong-net \
  -e "KONG_DATABASE=postgres" \
  -e "KONG_PG_HOST=kong-database" \
  -e "KONG_PG_PASSWORD=kongpass" \
  -e "KONG_PROXY_ACCESS_LOG=/dev/stdout" \
  -e "KONG_ADMIN_ACCESS_LOG=/dev/stdout" \
  -e "KONG_PROXY_ERROR_LOG=/dev/stderr" \
  -e "KONG_ADMIN_ERROR_LOG=/dev/stderr" \
  -e "KONG_ADMIN_LISTEN=0.0.0.0:8001" \
  -p 8000:8000 \
  -p 8001:8001 \
  kong/kong-gateway:3.9

Verify: curl -s http://localhost:8001/ | jq '.version' → expected: "3.9.x"

3. Configure service routing

Register backend services and define routes that map incoming request paths to upstream services. [src4]

# Kong Admin API — register services and routes
curl -i -X POST http://localhost:8001/services/ \
  --data name=user-service \
  --data url=http://user-service:8001

curl -i -X POST http://localhost:8001/services/user-service/routes \
  --data 'paths[]=/api/v1/users' \
  --data name=user-route

Verify: curl -s http://localhost:8001/routes | jq '.data[].name' → expected: "user-route"
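
Conceptually, the route table maps path prefixes to upstream services with longest-prefix-wins semantics. A sketch of that matching logic, using the service names registered above (illustrative only, not Kong's actual implementation):

```javascript
// Path-prefix routing sketch: pick the longest registered prefix
// that matches the incoming request path.
const routes = [
  { prefix: "/api/v1/users",  upstream: "http://user-service:8001" },
  { prefix: "/api/v1/orders", upstream: "http://order-service:8002" },
];

function matchRoute(path) {
  const candidates = routes.filter(r => path.startsWith(r.prefix));
  // Longest prefix wins, so a broad prefix can't shadow a more specific route.
  candidates.sort((a, b) => b.prefix.length - a.prefix.length);
  return candidates[0]?.upstream ?? null;
}

console.log(matchRoute("/api/v1/users/42")); // http://user-service:8001
console.log(matchRoute("/api/v1/unknown"));  // null → gateway returns 404
```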

4. Add authentication

Enable JWT or OAuth2 validation at the gateway so backend services receive pre-validated tokens. [src4]

# Enable JWT plugin globally on Kong
curl -i -X POST http://localhost:8001/plugins/ \
  --data name=jwt

# Create a consumer and JWT credential
curl -i -X POST http://localhost:8001/consumers/ \
  --data username=mobile-app

curl -i -X POST http://localhost:8001/consumers/mobile-app/jwt \
  --data algorithm=HS256 \
  --data secret=your-256-bit-secret

Verify: curl -s http://localhost:8000/api/v1/users → expected: 401 Unauthorized

5. Add rate limiting

Configure per-consumer and global rate limits to prevent abuse and ensure fair resource allocation. [src4]

# Kong rate-limiting plugin — per consumer, 100 req/min
curl -i -X POST http://localhost:8001/plugins/ \
  --data name=rate-limiting \
  --data config.minute=100 \
  --data config.policy=redis \
  --data config.redis.host=redis \
  --data config.redis.port=6379 \
  --data config.limit_by=consumer

Verify: Send 101 requests in 60 seconds → expected: 101st request returns 429 Too Many Requests
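
The sliding-window idea behind the Redis policy can be sketched in a few lines. This in-memory version only works on a single gateway instance; the Redis-backed plugin shares the same counters across all instances:

```javascript
// Sliding-window-log rate limiter sketch. A Map stands in for Redis,
// so counts are per-process only -- illustrative, not production code.
class SlidingWindowLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.log = new Map(); // consumer -> array of request timestamps
  }
  allow(consumer, now = Date.now()) {
    const cutoff = now - this.windowMs;
    const hits = (this.log.get(consumer) || []).filter(t => t > cutoff);
    if (hits.length >= this.limit) { this.log.set(consumer, hits); return false; }
    hits.push(now);
    this.log.set(consumer, hits);
    return true;
  }
}

const limiter = new SlidingWindowLimiter(100, 60_000);
let allowed = 0;
for (let i = 0; i < 101; i++) if (limiter.allow("mobile-app", 0)) allowed++;
console.log(allowed); // 100 -- the 101st request is rejected (HTTP 429)
```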

6. Enable observability

Configure logging, metrics, and distributed tracing to monitor gateway health and debug issues. [src3]

# Kong — enable Prometheus metrics plugin
curl -i -X POST http://localhost:8001/plugins/ \
  --data name=prometheus \
  --data config.per_consumer=true

# Kong — enable correlation ID for distributed tracing
curl -i -X POST http://localhost:8001/plugins/ \
  --data name=correlation-id \
  --data config.header_name=X-Request-ID \
  --data config.generator=uuid

Verify: curl -s http://localhost:8001/metrics → expected: Prometheus-format metrics including kong_http_requests_total

7. Deploy for high availability

Run multiple gateway instances behind a load balancer with health checks. [src7]

# docker-compose.yml — HA gateway setup (excerpt)
services:
  kong-1:
    image: kong/kong-gateway:3.9
    environment:
      KONG_DATABASE: "off"
      KONG_DECLARATIVE_CONFIG: /etc/kong/kong.yml
    healthcheck:
      test: ["CMD", "kong", "health"]
      interval: 10s
  kong-2:
    image: kong/kong-gateway:3.9
    environment:
      KONG_DATABASE: "off"
      KONG_DECLARATIVE_CONFIG: /etc/kong/kong.yml
    healthcheck:
      test: ["CMD", "kong", "health"]
      interval: 10s
  load-balancer:
    image: nginx:1.27-alpine
    ports:
      - "80:80"
    depends_on:
      kong-1: { condition: service_healthy }
      kong-2: { condition: service_healthy }

Verify: Stop one Kong instance → requests continue flowing through the other instance with no client-visible errors.
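
The compose excerpt references an NGINX load balancer but omits its configuration. A minimal sketch: the upstream hostnames match the compose service names above, while the mount path and tuning values are assumptions to adapt.

```nginx
# nginx-lb.conf — minimal LB config for the compose file above
# (mount over /etc/nginx/conf.d/default.conf in the load-balancer service)
upstream kong_cluster {
    server kong-1:8000 max_fails=3 fail_timeout=10s;
    server kong-2:8000 max_fails=3 fail_timeout=10s;
    keepalive 16;
}
server {
    listen 80;
    location / {
        proxy_pass http://kong_cluster;
        proxy_set_header Host $host;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}
```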

Code Examples

Kong: Declarative Configuration (DB-less mode)

# kong.yml — declarative gateway config
_format_version: "3.0"
services:
  - name: user-service
    url: http://user-service:8001
    routes:
      - name: user-route
        paths: [/api/v1/users]
        strip_path: false
    plugins:
      - name: rate-limiting
        config: { minute: 100, policy: local }
plugins:
  - name: jwt
    config: { claims_to_verify: [exp] }
  - name: correlation-id
    config: { header_name: X-Request-ID, generator: uuid }

NGINX: Reverse Proxy API Gateway

# nginx-gateway.conf
upstream user_service {
    server user-service-1:8001 weight=5;
    server user-service-2:8001 weight=5;
    keepalive 32;
}
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=100r/m;
server {
    listen 443 ssl;
    http2 on;  # "listen ... http2" is deprecated since NGINX 1.25
    ssl_certificate /etc/ssl/certs/api.crt;
    ssl_certificate_key /etc/ssl/private/api.key;
    limit_req zone=api_limit burst=20 nodelay;
    location /api/v1/users {
        auth_request /auth;
        proxy_pass http://user_service;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
    # auth_request requires an internal location; the auth-service URL is a placeholder
    location = /auth {
        internal;
        proxy_pass http://auth-service:9000/validate;
        proxy_pass_request_body off;
        proxy_set_header Content-Length "";
    }
}

Node.js: Custom API Gateway with Express

Full script: gateway.js (65 lines)

// gateway.js — Custom API gateway with Express
const express = require("express"); // ^4.21.0
const { createProxyMiddleware } = require("http-proxy-middleware"); // ^3.0.0
const rateLimit = require("express-rate-limit"); // ^7.5.0
const jwt = require("jsonwebtoken"); // ^9.0.0
const app = express();

// Global rate limiter: 100 req/min per IP ("limit" replaced "max" in v7)
app.use(rateLimit({ windowMs: 60000, limit: 100, standardHeaders: true }));

// JWT auth middleware
function authenticate(req, res, next) {
  const token = req.headers.authorization?.split(" ")[1];
  if (!token) return res.status(401).json({ error: "Missing token" });
  try { req.user = jwt.verify(token, process.env.JWT_SECRET); next(); }
  catch { return res.status(401).json({ error: "Invalid token" }); }
}

// Route to backend services
app.use("/api/v1/users", authenticate, createProxyMiddleware({
  target: "http://user-service:8001", changeOrigin: true
}));
app.listen(3000);

Anti-Patterns

Wrong: Embedding business logic in the gateway

// BAD — gateway performs order validation and pricing
app.post("/api/v1/orders", authenticate, async (req, res) => {
  const total = req.body.items.reduce((s, i) => s + i.price * i.qty, 0);
  if (total > req.user.creditLimit) return res.status(400).json({ error: "Exceeds limit" });
  const order = await db.orders.create({ items: req.body.items, total });
  res.json(order);
});

Correct: Gateway only routes; services own logic

// GOOD — gateway proxies to order-service which handles all business logic
app.use("/api/v1/orders", authenticate, createProxyMiddleware({
  target: "http://order-service:8002", changeOrigin: true,
}));

Wrong: No circuit breaker — cascading failures

# BAD — single upstream, no failure handling
location /api/v1/payments {
    proxy_pass http://payment-service:8003;
    # No timeout, no retry — requests hang if service is down
}

Correct: Circuit breaker with health checks and timeouts

# GOOD — upstream health checks, timeouts, failover
upstream payment_service {
    server payment-1:8003 max_fails=3 fail_timeout=30s;
    server payment-2:8003 max_fails=3 fail_timeout=30s;
    server payment-fallback:8003 backup;
}
location /api/v1/payments {
    proxy_pass http://payment_service;
    proxy_connect_timeout 5s;
    proxy_read_timeout 10s;
    proxy_next_upstream error timeout http_502 http_503;
}
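
The NGINX max_fails/fail_timeout settings approximate a circuit breaker; at the application layer the same pattern is an explicit state machine (closed → open → half-open). A minimal sketch with illustrative thresholds:

```javascript
// Minimal circuit-breaker state machine: closed -> open -> half-open.
// Thresholds are illustrative; tune per service.
class CircuitBreaker {
  constructor({ failureThreshold = 3, resetMs = 30_000 } = {}) {
    this.failureThreshold = failureThreshold;
    this.resetMs = resetMs;
    this.failures = 0;
    this.state = "closed";
    this.openedAt = 0;
  }
  async call(fn, now = Date.now()) {
    if (this.state === "open") {
      if (now - this.openedAt < this.resetMs) throw new Error("circuit open");
      this.state = "half-open"; // allow one probe request through
    }
    try {
      const result = await fn();
      this.state = "closed"; // probe (or normal call) succeeded
      this.failures = 0;
      return result;
    } catch (err) {
      this.failures++;
      if (this.state === "half-open" || this.failures >= this.failureThreshold) {
        this.state = "open";
        this.openedAt = now;
      }
      throw err;
    }
  }
}
```

Wrap each proxied upstream call in `cb.call(...)` and translate a thrown "circuit open" into an immediate 503, so clients fail fast instead of queueing behind a dead backend.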

Wrong: Inconsistent auth — some services unprotected

// BAD — auth enforced per-route, easy to forget new routes
app.use("/api/v1/users", authenticate, userProxy);
app.use("/api/v1/orders", authenticate, orderProxy);
app.use("/api/v1/reports", reportProxy); // No auth!
app.use("/api/v1/admin", adminProxy);    // No auth!

Correct: Global auth with explicit public exceptions

// GOOD — auth is the default; whitelist public routes
const PUBLIC = ["/health", "/api/v1/public", "/docs"];
app.use((req, res, next) => {
  if (PUBLIC.some(p => req.path.startsWith(p))) return next();
  authenticate(req, res, next);
});
app.use("/api/v1/users", userProxy);     // Automatically protected
app.use("/api/v1/admin", adminProxy);    // Automatically protected

Wrong: Single gateway instance with no redundancy

# BAD — single SPOF for all API traffic
services:
  gateway:
    image: kong/kong-gateway:3.9
    ports: ["80:8000"]
    # No replicas, no health checks, no load balancer

Correct: Multiple instances behind a load balancer

# GOOD — replicated gateway with health checks
services:
  gateway:
    image: kong/kong-gateway:3.9
    deploy:
      replicas: 3
      restart_policy: { condition: on-failure }
    healthcheck:
      test: ["CMD", "kong", "health"]
      interval: 10s

Diagnostic Commands

# Check Kong gateway health and version
curl -s http://localhost:8001/ | jq '{version, hostname, node_id}'

# List all registered services and routes
curl -s http://localhost:8001/services | jq '.data[] | {name, host, port}'
curl -s http://localhost:8001/routes | jq '.data[] | {name, paths, service}'

# Check active plugins
curl -s http://localhost:8001/plugins | jq '.data[] | {name, enabled, service}'

# Test NGINX config syntax
nginx -t

# Reload NGINX config without downtime
nginx -s reload

# Monitor gateway latency (Kong Prometheus)
curl -s http://localhost:8001/metrics | grep kong_latency

# Verify rate limiting headers
curl -v http://localhost:8000/api/v1/users 2>&1 | grep -i ratelimit

# Test circuit breaker behavior
for i in $(seq 1 10); do curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8000/api/v1/payments; done

Version History & Compatibility

| Gateway | Current Version | Status | Key Changes |
|---|---|---|---|
| Kong Gateway | 3.9.x (Nov 2025) | Current | Dynamic app route config, AI gateway plugins |
| Kong Gateway | 3.4.x LTS | LTS until Dec 2026 | Stable; security patches only |
| NGINX | 1.27.x (2025) | Current | HTTP/3 stable, QUIC support |
| Envoy | 1.32.x (2025) | Current | WASM plugin support, ext_authz v3 |
| AWS API Gateway | v2 (HTTP API) | Current | JWT authorizer built-in, 50% cheaper than v1 |
| AWS API Gateway | v1 (REST API) | Maintained | Full feature set, usage plans, API keys |

When to Use / When Not to Use

| Use When | Don't Use When | Use Instead |
|---|---|---|
| 3+ microservices need unified auth and rate limiting | Single monolith with one backend | Simple reverse proxy (NGINX/Caddy) |
| Multiple client types (web, mobile, IoT) need different APIs | All clients consume the same API shape | Direct service calls with shared auth library |
| Need protocol translation (REST to gRPC, GraphQL federation) | All services use the same protocol internally | Service mesh sidecar (Envoy/Linkerd) |
| Regulatory requirement to log all API access centrally | Internal service-to-service (east-west) traffic only | Service mesh (Istio/Linkerd) for internal traffic |
| Need canary deployments and traffic splitting | Simple blue-green deployments | Load balancer with health checks |
| API monetization (usage metering per consumer) | Open internal APIs with no metering needs | Direct exposure behind WAF |
