| Component | Role | Technology Options | Scaling Strategy |
|---|---|---|---|
| Edge Load Balancer | Distributes traffic across gateway instances | AWS ALB/NLB, HAProxy, Cloudflare | Horizontal — add LB nodes or use managed |
| API Gateway | Routes requests, enforces policies | Kong, NGINX, Envoy, AWS API Gateway, Apigee | Horizontal — stateless instances behind LB |
| Auth Layer | Validates JWT/OAuth2 tokens, API keys | Gateway plugin (Kong JWT, NGINX auth_request), Keycloak, Auth0 | Offload to external IdP; cache tokens at gateway |
| Rate Limiter | Throttles requests per consumer/IP | Gateway plugin (Kong rate-limiting), Redis-backed sliding window | Shared Redis cluster for distributed counting |
| Circuit Breaker | Prevents cascading failures to unhealthy services | Envoy outlier detection, Kong circuit-breaker, Hystrix | Per-service thresholds; half-open probes |
| Request Router | Maps URL paths/headers to backend services | Path-based routing, header-based, weighted canary | Config reload (NGINX) or dynamic API (Kong) |
| Response Cache | Caches repeated GET responses | Gateway cache plugin, Varnish, CDN edge cache | TTL-based eviction; cache invalidation on writes |
| Protocol Translator | Converts between REST, gRPC, WebSocket | Envoy gRPC-JSON transcoding, Kong gRPC-gateway | Stateless — scales with gateway instances |
| Observability Collector | Aggregates logs, metrics, traces | OpenTelemetry, Prometheus, Jaeger, Datadog | Sidecar or gateway plugin; async export |
| Service Registry | Discovers backend service addresses | Consul, Kubernetes DNS, AWS Cloud Map, etcd | Eventual consistency; health check intervals |
| Config Store | Stores gateway routing rules and policies | Kong DB (Postgres), etcd, Kubernetes CRDs, S3 | DB replication or declarative GitOps |
| TLS Terminator | Handles HTTPS certificates and handshakes | Let's Encrypt + certbot, AWS ACM, NGINX ssl module | Offload to LB or gateway edge; auto-renewal |
START
├── Need serverless / minimal ops?
│   ├── YES → Managed gateway (AWS API Gateway, GCP Apigee, Azure API Management)
│   └── NO ↓
├── Running Kubernetes?
│   ├── YES → Kubernetes-native gateway ↓
│   │   ├── Need service mesh integration (Istio)?
│   │   │   ├── YES → Envoy Gateway / Istio Ingress Gateway
│   │   │   └── NO → Kong Ingress Controller or NGINX Ingress
│   └── NO ↓
├── Need plugin ecosystem + GUI admin?
│   ├── YES → Kong Gateway (open-source or Enterprise)
│   └── NO ↓
├── Need max raw performance + minimal footprint?
│   ├── YES → NGINX (config-file based, battle-tested)
│   └── NO ↓
├── Need gRPC-native + advanced observability?
│   ├── YES → Envoy Proxy (xDS API for dynamic config)
│   └── NO ↓
└── DEFAULT → Kong Gateway (best balance of features, ecosystem, and community support)
| Feature | Managed (AWS/GCP/Azure) | Self-Hosted (Kong/NGINX/Envoy) |
|---|---|---|
| Setup time | Minutes | Hours to days |
| Operational overhead | None (vendor-managed) | High (patching, scaling, monitoring) |
| Customization | Limited to vendor features | Full control over plugins and config |
| Latency | Adds ~5-20ms (network hop) | ~1-5ms (co-located) |
| Cost at low volume | Pay-per-request ($3.50/M calls on AWS) | Fixed infra cost (potentially cheaper) |
| Cost at high volume | Expensive (>$1K/M requests) | Cheaper at scale (commodity infra) |
| Vendor lock-in | High | None |
| Multi-cloud | Vendor-specific | Portable across any infrastructure |
| gRPC/WebSocket support | Varies by vendor | Full native support |
| Plugin ecosystem | Vendor extensions only | Kong: 100+ plugins; Envoy: filters; NGINX: modules |
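The break-even between the two columns can be estimated with illustrative numbers. The $3.50/M figure comes from the table above; the $150/month self-hosted baseline is an assumed cost for a small VM pair plus a load balancer, not a quoted price:

```shell
# Managed vs self-hosted monthly cost sketch. $3.50/M is the AWS figure
# from the table; the $150/mo self-hosted baseline is an ASSUMED cost for
# a small VM pair plus a load balancer, not a quoted price.
awk 'BEGIN {
  for (m = 10; m <= 80; m += 10)
    printf "%d M req/mo: managed $%.0f vs self-hosted $150\n", m, m * 3.50
}'
```

With these assumptions the lines cross near 43M requests/month: below that, pay-per-request is cheaper; above it, fixed infrastructure wins.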
Map all backend services, their protocols, and traffic patterns. Identify which cross-cutting concerns (auth, rate limiting, logging) are currently duplicated across services. [src1]
# service-topology.yaml — document before building
services:
  - name: user-service
    protocol: REST
    port: 8001
    health_check: /health
    avg_rps: 500
    auth: JWT
  - name: order-service
    protocol: REST
    port: 8002
    health_check: /health
    avg_rps: 2000
    auth: JWT
  - name: payment-service
    protocol: gRPC
    port: 50051
    health_check: grpc_health_v1
    avg_rps: 300
    auth: mTLS
Verify: Count total services and protocols — with 3+ services or more than one protocol, an API gateway is justified.
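The count can be automated, sketched here with grep against an inline copy of the example topology (in practice, point the greps at your real service-topology.yaml; the /tmp path is just for illustration):

```shell
# Gateway-justification check from the topology file: count services and
# distinct protocols. Shown against an inline copy of the example above.
cat > /tmp/service-topology.yaml <<'EOF'
services:
  - name: user-service
    protocol: REST
  - name: order-service
    protocol: REST
  - name: payment-service
    protocol: gRPC
EOF
services=$(grep -c '^  - name:' /tmp/service-topology.yaml)
protocols=$(grep '^    protocol:' /tmp/service-topology.yaml | sort -u | wc -l | tr -d ' ')
echo "$services services, $protocols distinct protocols"
```

Here that prints `3 services, 2 distinct protocols`, clearing both thresholds.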
Use the decision tree above to select managed vs self-hosted, then pick a specific implementation. [src2]
# For Kong (Docker quickstart)
docker network create kong-net
docker run -d --name kong-database \
  --network=kong-net \
  -e "POSTGRES_USER=kong" \
  -e "POSTGRES_DB=kong" \
  -e "POSTGRES_PASSWORD=kongpass" \
  postgres:16-alpine
docker run --rm --network=kong-net \
  -e "KONG_DATABASE=postgres" \
  -e "KONG_PG_HOST=kong-database" \
  -e "KONG_PG_PASSWORD=kongpass" \
  kong/kong-gateway:3.9 kong migrations bootstrap
docker run -d --name kong-gateway \
  --network=kong-net \
  -e "KONG_DATABASE=postgres" \
  -e "KONG_PG_HOST=kong-database" \
  -e "KONG_PG_PASSWORD=kongpass" \
  -e "KONG_PROXY_ACCESS_LOG=/dev/stdout" \
  -e "KONG_ADMIN_ACCESS_LOG=/dev/stdout" \
  -e "KONG_PROXY_ERROR_LOG=/dev/stderr" \
  -e "KONG_ADMIN_ERROR_LOG=/dev/stderr" \
  -e "KONG_ADMIN_LISTEN=0.0.0.0:8001" \
  -p 8000:8000 \
  -p 8001:8001 \
  kong/kong-gateway:3.9
Verify: curl -s http://localhost:8001/ | jq '.version' → expected: "3.9.x"
Register backend services and define routes that map incoming request paths to upstream services. [src4]
# Kong Admin API — register services and routes
curl -i -X POST http://localhost:8001/services/ \
  --data name=user-service \
  --data url=http://user-service:8001
curl -i -X POST http://localhost:8001/services/user-service/routes \
  --data 'paths[]=/api/v1/users' \
  --data name=user-route
Verify: curl -s http://localhost:8001/routes | jq '.data[].name' → expected: "user-route"
Enable JWT or OAuth2 validation at the gateway so backend services receive pre-validated tokens. [src4]
# Enable JWT plugin globally on Kong
curl -i -X POST http://localhost:8001/plugins/ \
  --data name=jwt
# Create a consumer and JWT credential
curl -i -X POST http://localhost:8001/consumers/ \
  --data username=mobile-app
curl -i -X POST http://localhost:8001/consumers/mobile-app/jwt \
  --data algorithm=HS256 \
  --data secret=your-256-bit-secret
Verify: curl -s http://localhost:8000/api/v1/users → expected: 401 Unauthorized
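To exercise the protected route, a test token can be minted with nothing but openssl and base64. Two assumptions: the secret must match the credential created above, and the `iss` claim must equal the credential's key, which Kong auto-generates (look it up with `GET /consumers/mobile-app/jwt`); `mobile-app-key` below is a placeholder.

```shell
# Mint a test HS256 JWT by hand (secret and iss are placeholders; iss must
# equal the consumer credential's key reported by the Kong Admin API)
b64url() { base64 | tr -d '=\n' | tr '/+' '_-'; }
header=$(printf '{"alg":"HS256","typ":"JWT"}' | b64url)
payload=$(printf '{"iss":"mobile-app-key","exp":4102444800}' | b64url)
sig=$(printf '%s.%s' "$header" "$payload" \
  | openssl dgst -sha256 -hmac 'your-256-bit-secret' -binary | b64url)
token="$header.$payload.$sig"
echo "$token"
# Use it: curl -H "Authorization: Bearer $token" http://localhost:8000/api/v1/users
```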
Configure per-consumer and global rate limits to prevent abuse and ensure fair resource allocation. [src4]
# Kong rate-limiting plugin — per consumer, 100 req/min
curl -i -X POST http://localhost:8001/plugins/ \
  --data name=rate-limiting \
  --data config.minute=100 \
  --data config.policy=redis \
  --data config.redis.host=redis \
  --data config.redis.port=6379 \
  --data config.limit_by=consumer
Verify: Send 101 requests in 60 seconds → expected: 101st request returns 429 Too Many Requests
Configure logging, metrics, and distributed tracing to monitor gateway health and debug issues. [src3]
# Kong — enable Prometheus metrics plugin
curl -i -X POST http://localhost:8001/plugins/ \
  --data name=prometheus \
  --data config.per_consumer=true
# Kong — enable correlation ID for distributed tracing
curl -i -X POST http://localhost:8001/plugins/ \
  --data name=correlation-id \
  --data config.header_name=X-Request-ID \
  --data config.generator=uuid
Verify: curl -s http://localhost:8001/metrics → expected: Prometheus-format metrics including kong_http_requests_total
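Prometheus then needs a scrape job pointing at that endpoint. A minimal sketch, assuming the quickstart's Admin API listener (`kong-gateway:8001`) is reachable from Prometheus; production setups usually expose metrics on a separate status listener instead:

```yaml
# prometheus.yml (excerpt) — scrape Kong's metrics endpoint
scrape_configs:
  - job_name: "kong"
    metrics_path: /metrics
    static_configs:
      - targets: ["kong-gateway:8001"]
```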
Run multiple gateway instances behind a load balancer with health checks. [src7]
# docker-compose.yml — HA gateway setup (excerpt)
services:
  kong-1:
    image: kong/kong-gateway:3.9
    environment:
      KONG_DATABASE: "off"
      KONG_DECLARATIVE_CONFIG: /etc/kong/kong.yml
    healthcheck:
      test: ["CMD", "kong", "health"]
      interval: 10s
  kong-2:
    image: kong/kong-gateway:3.9
    environment:
      KONG_DATABASE: "off"
      KONG_DECLARATIVE_CONFIG: /etc/kong/kong.yml
    healthcheck:
      test: ["CMD", "kong", "health"]
      interval: 10s
  load-balancer:
    image: nginx:1.27-alpine
    ports:
      - "80:80"
    depends_on:
      kong-1: { condition: service_healthy }
      kong-2: { condition: service_healthy }
Verify: Stop one Kong instance → requests continue flowing through the other instance with no client-visible errors.
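The load-balancer service above needs an NGINX config of its own, which the excerpt omits. A minimal sketch, assuming the compose service names kong-1/kong-2 and Kong's default proxy port 8000:

```nginx
# nginx-lb.conf — spread traffic across both Kong instances
upstream kong_cluster {
    server kong-1:8000 max_fails=3 fail_timeout=10s;
    server kong-2:8000 max_fails=3 fail_timeout=10s;
}
server {
    listen 80;
    location / {
        proxy_pass http://kong_cluster;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```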
# kong.yml — declarative gateway config
_format_version: "3.0"
services:
  - name: user-service
    url: http://user-service:8001
    routes:
      - name: user-route
        paths: [/api/v1/users]
        strip_path: false
    plugins:
      - name: rate-limiting
        config: { minute: 100, policy: local }
plugins:
  - name: jwt
    config: { claims_to_verify: [exp] }
  - name: correlation-id
    config: { header_name: X-Request-ID, generator: uuid }
# nginx-gateway.conf
upstream user_service {
    server user-service-1:8001 weight=5;
    server user-service-2:8001 weight=5;
    keepalive 32;
}
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=100r/m;
server {
    listen 443 ssl;
    http2 on;  # "listen ... http2" is deprecated since nginx 1.25
    ssl_certificate /etc/ssl/certs/api.crt;
    ssl_certificate_key /etc/ssl/private/api.key;
    limit_req zone=api_limit burst=20 nodelay;
    location /api/v1/users {
        auth_request /auth;
        proxy_pass http://user_service;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
    # auth_request needs a matching internal location; "auth-service" is a
    # placeholder for your token-validation endpoint
    location = /auth {
        internal;
        proxy_pass http://auth-service/validate;
        proxy_pass_request_body off;
        proxy_set_header Content-Length "";
    }
}
Full script: gateway.js (65 lines)
// gateway.js — Custom API gateway with Express
const express = require("express"); // ^4.21.0
const { createProxyMiddleware } = require("http-proxy-middleware"); // ^3.0.0
const rateLimit = require("express-rate-limit"); // ^7.5.0
const jwt = require("jsonwebtoken"); // ^9.0.0

const app = express();

// Global rate limiter: 100 req/min per IP
app.use(rateLimit({ windowMs: 60000, max: 100, standardHeaders: true }));

// JWT auth middleware
function authenticate(req, res, next) {
  const token = req.headers.authorization?.split(" ")[1];
  if (!token) return res.status(401).json({ error: "Missing token" });
  try {
    req.user = jwt.verify(token, process.env.JWT_SECRET);
    next();
  } catch {
    return res.status(401).json({ error: "Invalid token" });
  }
}

// Route to backend services
app.use("/api/v1/users", authenticate, createProxyMiddleware({
  target: "http://user-service:8001", changeOrigin: true
}));

app.listen(3000);
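The rateLimit middleware above keeps per-IP counters in memory; the window-counting logic it applies (and which the Redis-backed Kong plugin distributes across instances) can be sketched as follows. This is an illustration only, not express-rate-limit's actual implementation, and the in-memory Map stands in for shared Redis state:

```javascript
// Sliding-window limiter sketch; LIMIT/WINDOW_MS mirror 100 req/min
const WINDOW_MS = 60000;
const LIMIT = 100;
const hits = new Map(); // key -> timestamps of requests still in window

function allow(key, now = Date.now()) {
  // Drop timestamps older than the window, then count what remains
  const recent = (hits.get(key) || []).filter(t => now - t < WINDOW_MS);
  const ok = recent.length < LIMIT;
  if (ok) recent.push(now);
  hits.set(key, recent);
  return ok;
}

// 100 requests succeed; the 101st inside the same window is rejected
let last;
for (let i = 0; i < 101; i++) last = allow("203.0.113.7", 1000 + i);
console.log(last); // false
```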
// BAD — gateway performs order validation and pricing
app.post("/api/v1/orders", authenticate, async (req, res) => {
  const total = req.body.items.reduce((s, i) => s + i.price * i.qty, 0);
  if (total > req.user.creditLimit) return res.status(400).json({ error: "Exceeds limit" });
  const order = await db.orders.create({ items: req.body.items, total });
  res.json(order);
});

// GOOD — gateway proxies to order-service which handles all business logic
app.use("/api/v1/orders", authenticate, createProxyMiddleware({
  target: "http://order-service:8002", changeOrigin: true,
}));
# BAD — single upstream, no failure handling
location /api/v1/payments {
    proxy_pass http://payment-service:8003;
    # No timeout, no retry — requests hang if service is down
}

# GOOD — upstream health checks, timeouts, failover
upstream payment_service {
    server payment-1:8003 max_fails=3 fail_timeout=30s;
    server payment-2:8003 max_fails=3 fail_timeout=30s;
    server payment-fallback:8003 backup;
}
location /api/v1/payments {
    proxy_pass http://payment_service;
    proxy_connect_timeout 5s;
    proxy_read_timeout 10s;
    proxy_next_upstream error timeout http_502 http_503;
}
// BAD — auth enforced per-route, easy to forget new routes
app.use("/api/v1/users", authenticate, userProxy);
app.use("/api/v1/orders", authenticate, orderProxy);
app.use("/api/v1/reports", reportProxy); // No auth!
app.use("/api/v1/admin", adminProxy); // No auth!

// GOOD — auth is the default; whitelist public routes
const PUBLIC = ["/health", "/api/v1/public", "/docs"];
app.use((req, res, next) => {
  if (PUBLIC.some(p => req.path.startsWith(p))) return next();
  authenticate(req, res, next);
});
app.use("/api/v1/users", userProxy); // Automatically protected
app.use("/api/v1/admin", adminProxy); // Automatically protected
# BAD — single SPOF for all API traffic
services:
  gateway:
    image: kong/kong-gateway:3.9
    ports: ["80:8000"]
    # No replicas, no health checks, no load balancer

# GOOD — replicated gateway with health checks
services:
  gateway:
    image: kong/kong-gateway:3.9
    deploy:
      replicas: 3
      restart_policy: { condition: on-failure }
    healthcheck:
      test: ["CMD", "kong", "health"]
      interval: 10s
Use Redis-backed rate limiting so counters are shared across gateway instances (config.policy=redis). [src4]
Generate X-Request-ID at the gateway edge and propagate it to all upstreams. [src7]
# Check Kong gateway health and version
curl -s http://localhost:8001/ | jq '{version, hostname, node_id}'
# List all registered services and routes
curl -s http://localhost:8001/services | jq '.data[] | {name, host, port}'
curl -s http://localhost:8001/routes | jq '.data[] | {name, paths, service}'
# Check active plugins
curl -s http://localhost:8001/plugins | jq '.data[] | {name, enabled, service}'
# Test NGINX config syntax
nginx -t
# Reload NGINX config without downtime
nginx -s reload
# Monitor gateway latency (Kong Prometheus)
curl -s http://localhost:8001/metrics | grep kong_latency
# Verify rate limiting headers
curl -v http://localhost:8000/api/v1/users 2>&1 | grep -i ratelimit
# Test circuit breaker behavior
for i in $(seq 1 10); do curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8000/api/v1/payments; done
| Gateway | Current Version | Status | Key Changes |
|---|---|---|---|
| Kong Gateway | 3.9.x (Nov 2025) | Current | Dynamic app route config, AI gateway plugins |
| Kong Gateway | 3.4.x LTS | LTS until Dec 2026 | Stable; security patches only |
| NGINX | 1.27.x (2025) | Current | HTTP/3 stable, QUIC support |
| Envoy | 1.32.x (2025) | Current | WASM plugin support, ext_authz v3 |
| AWS API Gateway | v2 (HTTP API) | Current | JWT authorizer built-in, 50% cheaper than v1 |
| AWS API Gateway | v1 (REST API) | Maintained | Full feature set, usage plans, API keys |
| Use When | Don't Use When | Use Instead |
|---|---|---|
| 3+ microservices need unified auth and rate limiting | Single monolith with one backend | Simple reverse proxy (NGINX/Caddy) |
| Multiple client types (web, mobile, IoT) need different APIs | All clients consume the same API shape | Direct service calls with shared auth library |
| Need protocol translation (REST to gRPC, GraphQL federation) | All services use the same protocol internally | Service mesh sidecar (Envoy/Linkerd) |
| Regulatory requirement to log all API access centrally | Internal service-to-service (east-west) traffic only | Service mesh (Istio/Linkerd) for internal traffic |
| Need canary deployments and traffic splitting | Simple blue-green deployments | Load balancer with health checks |
| API monetization — need usage metering per consumer | Open internal APIs with no metering needs | Direct exposure behind WAF |