How to Design an API Gateway

Type: Software Reference | Confidence: 0.93 | Sources: 7 | Verified: 2026-02-23 | Freshness: quarterly

Quick Reference

| Component | Role | Technology Options | Scaling Strategy |
|---|---|---|---|
| Edge Load Balancer | Distributes traffic across gateway instances | AWS ALB/NLB, HAProxy, Cloudflare | Horizontal: add LB nodes or use managed |
| API Gateway | Routes requests, enforces policies | Kong, NGINX, Envoy, AWS API Gateway, Apigee | Horizontal: stateless instances behind LB |
| Auth Layer | Validates JWT/OAuth2 tokens, API keys | Gateway plugin (Kong JWT, NGINX auth_request), Keycloak, Auth0 | Offload to external IdP; cache tokens at gateway |
| Rate Limiter | Throttles requests per consumer/IP | Gateway plugin (Kong rate-limiting), Redis-backed sliding window | Shared Redis cluster for distributed counting |
| Circuit Breaker | Prevents cascading failures to unhealthy services | Envoy outlier detection, Kong circuit-breaker, Hystrix | Per-service thresholds; half-open probes |
| Request Router | Maps URL paths/headers to backend services | Path-based routing, header-based, weighted canary | Config reload (NGINX) or dynamic API (Kong) |
| Response Cache | Caches repeated GET responses | Gateway cache plugin, Varnish, CDN edge cache | TTL-based eviction; cache invalidation on writes |
| Protocol Translator | Converts between REST, gRPC, WebSocket | Envoy gRPC-JSON transcoding, Kong gRPC-gateway | Stateless: scales with gateway instances |
| Observability Collector | Aggregates logs, metrics, traces | OpenTelemetry, Prometheus, Jaeger, Datadog | Sidecar or gateway plugin; async export |
| Service Registry | Discovers backend service addresses | Consul, Kubernetes DNS, AWS Cloud Map, etcd | Eventual consistency; health check intervals |
| Config Store | Stores gateway routing rules and policies | Kong DB (Postgres), etcd, Kubernetes CRDs, S3 | DB replication or declarative GitOps |
| TLS Terminator | Handles HTTPS certificates and handshakes | Let's Encrypt + certbot, AWS ACM, NGINX ssl module | Offload to LB or gateway edge; auto-renewal |

Decision Tree

START
├── Need serverless / minimal ops?
│   ├── YES → Managed gateway (AWS API Gateway, GCP Apigee, Azure API Management)
│   └── NO ↓
├── Running Kubernetes?
│   ├── YES → Kubernetes-native gateway ↓
│   │   ├── Need service mesh integration (Istio)?
│   │   │   ├── YES → Envoy Gateway / Istio Ingress Gateway
│   │   │   └── NO → Kong Ingress Controller or NGINX Ingress
│   └── NO ↓
├── Need plugin ecosystem + GUI admin?
│   ├── YES → Kong Gateway (open-source or Enterprise)
│   └── NO ↓
├── Need max raw performance + minimal footprint?
│   ├── YES → NGINX (config-file based, battle-tested)
│   └── NO ↓
├── Need gRPC-native + advanced observability?
│   ├── YES → Envoy Proxy (xDS API for dynamic config)
│   └── NO ↓
└── DEFAULT → Kong Gateway (best balance of features, ecosystem, and community support)

Managed vs Self-Hosted Comparison

| Feature | Managed (AWS/GCP/Azure) | Self-Hosted (Kong/NGINX/Envoy) |
|---|---|---|
| Setup time | Minutes | Hours to days |
| Operational overhead | None (vendor-managed) | High (patching, scaling, monitoring) |
| Customization | Limited to vendor features | Full control over plugins and config |
| Latency | Adds ~5-20 ms (extra network hop) | ~1-5 ms (co-located) |
| Cost at low volume | Pay-per-request ($3.50/M calls on AWS) | Fixed infra cost (potentially cheaper) |
| Cost at high volume | Expensive (>$1K/month beyond ~300M requests at $3.50/M) | Cheaper at scale (commodity infra) |
| Vendor lock-in | High | None |
| Multi-cloud | Vendor-specific | Portable across any infrastructure |
| gRPC/WebSocket support | Varies by vendor | Full native support |
| Plugin ecosystem | Vendor extensions only | Kong: 100+ plugins; Envoy: filters; NGINX: modules |
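
To make the cost rows concrete, here is a quick break-even sketch. The $3.50/M figure is the AWS rate quoted in the table; the $500/month self-hosted cost is a hypothetical placeholder, so substitute your own infra numbers.

```javascript
// Break-even sketch: managed pay-per-request vs. self-hosted fixed cost.
// $3.50/M requests is the AWS figure from the table above; the $500/month
// self-hosted infra cost is an assumed placeholder.
const MANAGED_COST_PER_MILLION = 3.5; // USD per 1M requests (AWS API Gateway)
const SELF_HOSTED_MONTHLY = 500;      // USD/month: instances + LB + ops (assumed)

function managedMonthlyCost(requestsPerMonth) {
  return (requestsPerMonth / 1e6) * MANAGED_COST_PER_MILLION;
}

// Requests/month at which self-hosting becomes cheaper than managed.
function breakEvenRequests() {
  return (SELF_HOSTED_MONTHLY / MANAGED_COST_PER_MILLION) * 1e6;
}

console.log(breakEvenRequests());               // ~142.9M requests/month
console.log(managedMonthlyCost(1_000_000_000)); // 3500 (USD at 1B requests/month)
```

Below the break-even point, managed pay-per-request pricing wins; above it, the fixed self-hosted cost amortizes in your favor.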

Step-by-Step Guide

1. Define your service topology

Map all backend services, their protocols, and traffic patterns. Identify which cross-cutting concerns (auth, rate limiting, logging) are currently duplicated across services. [src1]

# service-topology.yaml — document before building
services:
  - name: user-service
    protocol: REST
    port: 8001
    health_check: /health
    avg_rps: 500
    auth: JWT
  - name: order-service
    protocol: REST
    port: 8002
    health_check: /health
    avg_rps: 2000
    auth: JWT
  - name: payment-service
    protocol: gRPC
    port: 50051
    health_check: grpc_health_v1
    avg_rps: 300
    auth: mTLS

Verify: Count total services and protocols. With 3 or more services or more than one protocol, an API gateway is justified; below that, a simple reverse proxy is usually enough.
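
The verify step can be scripted. A minimal sketch that line-scans the topology file rather than fully parsing YAML, which is adequate for the flat structure shown above (the embedded sample mirrors that file):

```javascript
// Quick topology audit: count services and distinct protocols.
// Line-based scan, not a full YAML parser.
const topology = `
services:
  - name: user-service
    protocol: REST
  - name: order-service
    protocol: REST
  - name: payment-service
    protocol: gRPC
`;

function auditTopology(yamlText) {
  const services = (yamlText.match(/^\s*-\s*name:/gm) || []).length;
  const protocols = new Set(
    [...yamlText.matchAll(/^\s*protocol:\s*(\S+)/gm)].map(m => m[1])
  );
  return { services, protocols: [...protocols] };
}

const { services, protocols } = auditTopology(topology);
console.log(`${services} services, protocols: ${protocols.join(", ")}`);
// → "3 services, protocols: REST, gRPC"
if (services >= 3 || protocols.length > 1) {
  console.log("A gateway is justified.");
}
```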

2. Choose your gateway technology

Use the decision tree above to select managed vs self-hosted, then pick a specific implementation. [src2]

# For Kong (Docker quickstart)
docker network create kong-net
docker run -d --name kong-database \
  --network=kong-net \
  -e "POSTGRES_USER=kong" \
  -e "POSTGRES_DB=kong" \
  -e "POSTGRES_PASSWORD=kongpass" \
  postgres:16-alpine

docker run --rm --network=kong-net \
  -e "KONG_DATABASE=postgres" \
  -e "KONG_PG_HOST=kong-database" \
  -e "KONG_PG_PASSWORD=kongpass" \
  kong/kong-gateway:3.9 kong migrations bootstrap

docker run -d --name kong-gateway \
  --network=kong-net \
  -e "KONG_DATABASE=postgres" \
  -e "KONG_PG_HOST=kong-database" \
  -e "KONG_PG_PASSWORD=kongpass" \
  -e "KONG_PROXY_ACCESS_LOG=/dev/stdout" \
  -e "KONG_ADMIN_ACCESS_LOG=/dev/stdout" \
  -e "KONG_PROXY_ERROR_LOG=/dev/stderr" \
  -e "KONG_ADMIN_ERROR_LOG=/dev/stderr" \
  -e "KONG_ADMIN_LISTEN=0.0.0.0:8001" \
  -p 8000:8000 \
  -p 8001:8001 \
  kong/kong-gateway:3.9

Verify: curl -s http://localhost:8001/ | jq '.version' → expected: "3.9.x"

3. Configure service routing

Register backend services and define routes that map incoming request paths to upstream services. [src4]

# Kong Admin API — register services and routes
curl -i -X POST http://localhost:8001/services/ \
  --data name=user-service \
  --data url=http://user-service:8001

curl -i -X POST http://localhost:8001/services/user-service/routes \
  --data 'paths[]=/api/v1/users' \
  --data name=user-route

Verify: curl -s http://localhost:8001/routes | jq '.data[].name' → expected: "user-route"
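
Conceptually, the route table maps path prefixes to upstream services with longest-prefix-wins semantics. A sketch of that matching logic, using the service names registered above (illustrative only, not Kong's actual implementation):

```javascript
// Path-prefix routing sketch: pick the longest registered prefix
// that matches the incoming request path.
const routes = [
  { prefix: "/api/v1/users",  upstream: "http://user-service:8001" },
  { prefix: "/api/v1/orders", upstream: "http://order-service:8002" },
];

function matchRoute(path) {
  const candidates = routes.filter(r => path.startsWith(r.prefix));
  // Longest prefix wins, so a broad prefix can't shadow a more specific route.
  candidates.sort((a, b) => b.prefix.length - a.prefix.length);
  return candidates[0]?.upstream ?? null;
}

console.log(matchRoute("/api/v1/users/42")); // http://user-service:8001
console.log(matchRoute("/api/v1/unknown"));  // null → gateway returns 404
```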

4. Add authentication

Enable JWT or OAuth2 validation at the gateway so backend services receive pre-validated tokens. [src4]

# Enable JWT plugin globally on Kong
curl -i -X POST http://localhost:8001/plugins/ \
  --data name=jwt

# Create a consumer and JWT credential
curl -i -X POST http://localhost:8001/consumers/ \
  --data username=mobile-app

curl -i -X POST http://localhost:8001/consumers/mobile-app/jwt \
  --data algorithm=HS256 \
  --data secret=your-256-bit-secret

Verify: curl -s http://localhost:8000/api/v1/users → expected: 401 Unauthorized

5. Add rate limiting

Configure per-consumer and global rate limits to prevent abuse and ensure fair resource allocation. [src4]

# Kong rate-limiting plugin — per consumer, 100 req/min
curl -i -X POST http://localhost:8001/plugins/ \
  --data name=rate-limiting \
  --data config.minute=100 \
  --data config.policy=redis \
  --data config.redis.host=redis \
  --data config.redis.port=6379 \
  --data config.limit_by=consumer

Verify: Send 101 requests in 60 seconds → expected: 101st request returns 429 Too Many Requests
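
The sliding-window idea behind the Redis policy can be sketched in a few lines. This in-memory version only works on a single gateway instance; the Redis-backed plugin shares the same counters across all instances:

```javascript
// Sliding-window-log rate limiter sketch. A Map stands in for Redis,
// so counts are per-process only -- illustrative, not production code.
class SlidingWindowLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.log = new Map(); // consumer -> array of request timestamps
  }
  allow(consumer, now = Date.now()) {
    const cutoff = now - this.windowMs;
    const hits = (this.log.get(consumer) || []).filter(t => t > cutoff);
    if (hits.length >= this.limit) { this.log.set(consumer, hits); return false; }
    hits.push(now);
    this.log.set(consumer, hits);
    return true;
  }
}

const limiter = new SlidingWindowLimiter(100, 60_000);
let allowed = 0;
for (let i = 0; i < 101; i++) if (limiter.allow("mobile-app", 0)) allowed++;
console.log(allowed); // 100 -- the 101st request is rejected (HTTP 429)
```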

6. Enable observability

Configure logging, metrics, and distributed tracing to monitor gateway health and debug issues. [src3]

# Kong — enable Prometheus metrics plugin
curl -i -X POST http://localhost:8001/plugins/ \
  --data name=prometheus \
  --data config.per_consumer=true

# Kong — enable correlation ID for distributed tracing
curl -i -X POST http://localhost:8001/plugins/ \
  --data name=correlation-id \
  --data config.header_name=X-Request-ID \
  --data config.generator=uuid

Verify: curl -s http://localhost:8001/metrics → expected: Prometheus-format metrics including kong_http_requests_total

7. Deploy for high availability

Run multiple gateway instances behind a load balancer with health checks. [src7]

# docker-compose.yml — HA gateway setup (excerpt)
services:
  kong-1:
    image: kong/kong-gateway:3.9
    environment:
      KONG_DATABASE: "off"
      KONG_DECLARATIVE_CONFIG: /etc/kong/kong.yml
    healthcheck:
      test: ["CMD", "kong", "health"]
      interval: 10s
  kong-2:
    image: kong/kong-gateway:3.9
    environment:
      KONG_DATABASE: "off"
      KONG_DECLARATIVE_CONFIG: /etc/kong/kong.yml
    healthcheck:
      test: ["CMD", "kong", "health"]
      interval: 10s
  load-balancer:
    image: nginx:1.27-alpine
    ports:
      - "80:80"
    depends_on:
      kong-1: { condition: service_healthy }
      kong-2: { condition: service_healthy }

Verify: Stop one Kong instance → requests continue flowing through the other instance with no client-visible errors.
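
The compose excerpt references an NGINX load balancer but omits its configuration. A minimal sketch: the upstream hostnames match the compose service names above, while the mount path and tuning values are assumptions to adapt.

```nginx
# nginx-lb.conf — minimal LB config for the compose file above
# (mount over /etc/nginx/conf.d/default.conf in the load-balancer service)
upstream kong_cluster {
    server kong-1:8000 max_fails=3 fail_timeout=10s;
    server kong-2:8000 max_fails=3 fail_timeout=10s;
    keepalive 16;
}
server {
    listen 80;
    location / {
        proxy_pass http://kong_cluster;
        proxy_set_header Host $host;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}
```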

Code Examples

Kong: Declarative Configuration (DB-less mode)

# kong.yml — declarative gateway config
_format_version: "3.0"
services:
  - name: user-service
    url: http://user-service:8001
    routes:
      - name: user-route
        paths: [/api/v1/users]
        strip_path: false
    plugins:
      - name: rate-limiting
        config: { minute: 100, policy: local }
plugins:
  - name: jwt
    config: { claims_to_verify: [exp] }
  - name: correlation-id
    config: { header_name: X-Request-ID, generator: uuid }

NGINX: Reverse Proxy API Gateway

# nginx-gateway.conf
upstream user_service {
    server user-service-1:8001 weight=5;
    server user-service-2:8001 weight=5;
    keepalive 32;
}
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=100r/m;
server {
    listen 443 ssl;
    http2 on;  # "listen ... http2" is deprecated since NGINX 1.25
    ssl_certificate /etc/ssl/certs/api.crt;
    ssl_certificate_key /etc/ssl/private/api.key;
    limit_req zone=api_limit burst=20 nodelay;
    location /api/v1/users {
        auth_request /auth;
        proxy_pass http://user_service;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
    # auth_request requires an internal location; the auth-service URL is a placeholder
    location = /auth {
        internal;
        proxy_pass http://auth-service:9000/validate;
        proxy_pass_request_body off;
        proxy_set_header Content-Length "";
    }
}

Node.js: Custom API Gateway with Express

Full script: gateway.js (65 lines)

// gateway.js — Custom API gateway with Express
const express = require("express"); // ^4.21.0
const { createProxyMiddleware } = require("http-proxy-middleware"); // ^3.0.0
const rateLimit = require("express-rate-limit"); // ^7.5.0
const jwt = require("jsonwebtoken"); // ^9.0.0
const app = express();

// Global rate limiter: 100 req/min per IP ("limit" replaced "max" in v7)
app.use(rateLimit({ windowMs: 60000, limit: 100, standardHeaders: true }));

// JWT auth middleware
function authenticate(req, res, next) {
  const token = req.headers.authorization?.split(" ")[1];
  if (!token) return res.status(401).json({ error: "Missing token" });
  try { req.user = jwt.verify(token, process.env.JWT_SECRET); next(); }
  catch { return res.status(401).json({ error: "Invalid token" }); }
}

// Route to backend services
app.use("/api/v1/users", authenticate, createProxyMiddleware({
  target: "http://user-service:8001", changeOrigin: true
}));
app.listen(3000);

Anti-Patterns

Wrong: Embedding business logic in the gateway

// BAD — gateway performs order validation and pricing
app.post("/api/v1/orders", authenticate, async (req, res) => {
  const total = req.body.items.reduce((s, i) => s + i.price * i.qty, 0);
  if (total > req.user.creditLimit) return res.status(400).json({ error: "Exceeds limit" });
  const order = await db.orders.create({ items: req.body.items, total });
  res.json(order);
});

Correct: Gateway only routes; services own logic

// GOOD — gateway proxies to order-service which handles all business logic
app.use("/api/v1/orders", authenticate, createProxyMiddleware({
  target: "http://order-service:8002", changeOrigin: true,
}));

Wrong: No circuit breaker — cascading failures

# BAD — single upstream, no failure handling
location /api/v1/payments {
    proxy_pass http://payment-service:8003;
    # No timeout, no retry — requests hang if service is down
}

Correct: Circuit breaker with health checks and timeouts

# GOOD — upstream health checks, timeouts, failover
upstream payment_service {
    server payment-1:8003 max_fails=3 fail_timeout=30s;
    server payment-2:8003 max_fails=3 fail_timeout=30s;
    server payment-fallback:8003 backup;
}
location /api/v1/payments {
    proxy_pass http://payment_service;
    proxy_connect_timeout 5s;
    proxy_read_timeout 10s;
    proxy_next_upstream error timeout http_502 http_503;
}
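
The NGINX max_fails/fail_timeout settings approximate a circuit breaker; at the application layer the same pattern is an explicit state machine (closed → open → half-open). A minimal sketch with illustrative thresholds:

```javascript
// Minimal circuit-breaker state machine: closed -> open -> half-open.
// Thresholds are illustrative; tune per service.
class CircuitBreaker {
  constructor({ failureThreshold = 3, resetMs = 30_000 } = {}) {
    this.failureThreshold = failureThreshold;
    this.resetMs = resetMs;
    this.failures = 0;
    this.state = "closed";
    this.openedAt = 0;
  }
  async call(fn, now = Date.now()) {
    if (this.state === "open") {
      if (now - this.openedAt < this.resetMs) throw new Error("circuit open");
      this.state = "half-open"; // allow one probe request through
    }
    try {
      const result = await fn();
      this.state = "closed"; // probe (or normal call) succeeded
      this.failures = 0;
      return result;
    } catch (err) {
      this.failures++;
      if (this.state === "half-open" || this.failures >= this.failureThreshold) {
        this.state = "open";
        this.openedAt = now;
      }
      throw err;
    }
  }
}
```

Wrap each proxied upstream call in `cb.call(...)` and translate a thrown "circuit open" into an immediate 503, so clients fail fast instead of queueing behind a dead backend.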

Wrong: Inconsistent auth — some services unprotected

// BAD — auth enforced per-route, easy to forget new routes
app.use("/api/v1/users", authenticate, userProxy);
app.use("/api/v1/orders", authenticate, orderProxy);
app.use("/api/v1/reports", reportProxy); // No auth!
app.use("/api/v1/admin", adminProxy);    // No auth!

Correct: Global auth with explicit public exceptions

// GOOD — auth is the default; whitelist public routes
const PUBLIC = ["/health", "/api/v1/public", "/docs"];
app.use((req, res, next) => {
  if (PUBLIC.some(p => req.path.startsWith(p))) return next();
  authenticate(req, res, next);
});
app.use("/api/v1/users", userProxy);     // Automatically protected
app.use("/api/v1/admin", adminProxy);    // Automatically protected

Wrong: Single gateway instance with no redundancy

# BAD — single SPOF for all API traffic
services:
  gateway:
    image: kong/kong-gateway:3.9
    ports: ["80:8000"]
    # No replicas, no health checks, no load balancer

Correct: Multiple instances behind a load balancer

# GOOD — replicated gateway with health checks
services:
  gateway:
    image: kong/kong-gateway:3.9
    deploy:
      replicas: 3
      restart_policy: { condition: on-failure }
    healthcheck:
      test: ["CMD", "kong", "health"]
      interval: 10s

Diagnostic Commands

# Check Kong gateway health and version
curl -s http://localhost:8001/ | jq '{version, hostname, node_id}'

# List all registered services and routes
curl -s http://localhost:8001/services | jq '.data[] | {name, host, port}'
curl -s http://localhost:8001/routes | jq '.data[] | {name, paths, service}'

# Check active plugins
curl -s http://localhost:8001/plugins | jq '.data[] | {name, enabled, service}'

# Test NGINX config syntax
nginx -t

# Reload NGINX config without downtime
nginx -s reload

# Monitor gateway latency (Kong Prometheus)
curl -s http://localhost:8001/metrics | grep kong_latency

# Verify rate limiting headers
curl -v http://localhost:8000/api/v1/users 2>&1 | grep -i ratelimit

# Test circuit breaker behavior
for i in $(seq 1 10); do curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8000/api/v1/payments; done

Version History & Compatibility

| Gateway | Current Version | Status | Key Changes |
|---|---|---|---|
| Kong Gateway | 3.9.x (Nov 2025) | Current | Dynamic app route config, AI gateway plugins |
| Kong Gateway | 3.4.x LTS | LTS until Dec 2026 | Stable; security patches only |
| NGINX | 1.27.x (2025) | Current | HTTP/3 stable, QUIC support |
| Envoy | 1.32.x (2025) | Current | WASM plugin support, ext_authz v3 |
| AWS API Gateway | v2 (HTTP API) | Current | JWT authorizer built-in, 50% cheaper than v1 |
| AWS API Gateway | v1 (REST API) | Maintained | Full feature set, usage plans, API keys |

When to Use / When Not to Use

| Use When | Don't Use When | Use Instead |
|---|---|---|
| 3+ microservices need unified auth and rate limiting | Single monolith with one backend | Simple reverse proxy (NGINX/Caddy) |
| Multiple client types (web, mobile, IoT) need different APIs | All clients consume the same API shape | Direct service calls with shared auth library |
| Need protocol translation (REST to gRPC, GraphQL federation) | All services use the same protocol internally | Service mesh sidecar (Envoy/Linkerd) |
| Regulatory requirement to log all API access centrally | Internal service-to-service (east-west) traffic only | Service mesh (Istio/Linkerd) for internal traffic |
| Need canary deployments and traffic splitting | Simple blue-green deployments | Load balancer with health checks |
| API monetization (usage metering per consumer) | Open internal APIs with no metering needs | Direct exposure behind WAF |
