docker-compose up with separate containers per service, or Kubernetes with Helm charts for production-grade orchestration.

| Component | Role | Technology Options | Scaling Strategy |
|---|---|---|---|
| API Gateway | Route requests, rate limit, auth, SSL termination | Kong, AWS API Gateway, NGINX, Envoy | Horizontal — stateless, add instances behind LB |
| Product Catalog Service | CRUD products, categories, attributes, pricing | Node.js/Python + PostgreSQL or MongoDB | Read replicas + CDN cache; write sharding by category |
| Search Service | Full-text search, faceted filtering, autocomplete | Elasticsearch, OpenSearch, Typesense | Horizontal sharding by index; read replicas |
| User/Auth Service | Registration, login, JWT/OAuth, profiles | Node.js/Go + PostgreSQL + Redis (sessions) | Horizontal — stateless with token-based auth |
| Cart Service | Add/remove items, persist cart state, price calc | Node.js/Python + Redis (primary) + PostgreSQL (backup) | Horizontal — partition by user ID in Redis Cluster |
| Order Service | Order creation, lifecycle management, history | Python/Java + PostgreSQL (ACID) | Shard by order ID; archive old orders to cold storage |
| Payment Service | Gateway integration, tokenization, refunds | Node.js/Go + PostgreSQL + external gateway (Stripe) | Horizontal — idempotency keys prevent duplicates |
| Inventory Service | Stock levels, reservations, warehouse sync | Go/Java + PostgreSQL + Redis (hot counts) | Optimistic locking; shard by SKU range |
| Notification Service | Email, SMS, push notifications, webhooks | Node.js/Python + queue consumer (SQS/RabbitMQ) | Scale consumers independently based on queue depth |
| Recommendation Service | Personalized suggestions, "also bought" | Python (ML) + Redis (feature store) + Spark | Precompute offline; serve from cache; scale reads |
| CDN / Edge | Static assets, image delivery, edge caching | CloudFront, Cloudflare, Fastly | Automatic — scales with traffic globally |
| Message Broker | Async inter-service events, order saga coordination | Kafka, RabbitMQ, AWS SQS/SNS | Kafka: add partitions; RabbitMQ: add consumers |
| Monitoring & Observability | Distributed tracing, metrics, alerting, logging | Datadog, Grafana+Prometheus, Jaeger, ELK Stack | Scale collectors; sample traces at high volume |
START
├── Expected daily orders < 100 and products < 1K?
│ ├── YES → Use managed platform (Shopify/WooCommerce)
│ └── NO ↓
├── Team size < 5 backend engineers?
│ ├── YES → Modular monolith (single deploy, domain modules, shared DB with schema separation)
│ └── NO ↓
├── < 1K concurrent users?
│ ├── YES → Modular monolith with clear domain boundaries, prepare for future extraction
│ └── NO ↓
├── 1K–50K concurrent users?
│ ├── YES → Extract high-load services first (search, catalog, cart) as microservices
│ └── NO ↓
├── 50K–500K concurrent users?
│ ├── YES → Full microservices with Kafka event bus, database-per-service, Kubernetes
│ └── NO ↓
├── > 500K concurrent users?
│ ├── YES → Microservices + CQRS/Event Sourcing, multi-region, database sharding
│ └── NO ↓
└── DEFAULT → Start with modular monolith, extract services as bottlenecks emerge
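The decision tree above can be read as a first-match rule chain. The following is a minimal sketch of that reading; the function name, parameter names, and the treatment of the exact boundary values (for example, whether 50K falls in the lower or upper band) are assumptions, since the tree does not specify them.

```python
def recommend_architecture(daily_orders: int, products: int,
                           backend_engineers: int, concurrent_users: int) -> str:
    """Walk the decision tree top to bottom and return the first matching branch."""
    if daily_orders < 100 and products < 1_000:
        return "managed platform"
    if backend_engineers < 5:
        return "modular monolith"
    if concurrent_users < 1_000:
        return "modular monolith"
    if concurrent_users <= 50_000:   # boundary handling is an assumption
        return "extract high-load services"
    if concurrent_users <= 500_000:  # boundary handling is an assumption
        return "full microservices"
    return "microservices + CQRS/event sourcing, multi-region"
```

Because the branches are evaluated in order, a small team is steered toward a modular monolith regardless of traffic, which mirrors the ordering of the tree.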
Map your e-commerce domain into distinct bounded contexts using Domain-Driven Design (DDD). Each context becomes a service boundary with its own database. The critical contexts are: Product Catalog, Shopping Cart, Order Management, Payment, Inventory, User/Auth, Search, and Notifications. [src3]
Bounded Contexts:
┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐
│     Catalog     │   │      Cart       │   │      Order      │
│    (MongoDB)    │   │     (Redis)     │   │  (PostgreSQL)   │
└────────┬────────┘   └────────┬────────┘   └────────┬────────┘
         │                     │                     │
         └─────────────── API Gateway ───────────────┘
                               │
┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐
│     Payment     │   │    Inventory    │   │     Search      │
│  (PostgreSQL)   │   │  (PostgreSQL)   │   │ (Elasticsearch) │
└─────────────────┘   └─────────────────┘   └─────────────────┘
Verify: Each service can be deployed and tested independently — no compile-time dependencies between services.
Place an API gateway in front of all services to handle authentication, rate limiting, request routing, and SSL termination. Use path-based routing. [src1]
# Kong or AWS API Gateway route config (conceptual)
routes:
  - path: /api/v1/products
    service: catalog-service
    methods: [GET]
    plugins: [rate-limit, jwt-auth, response-cache]
  - path: /api/v1/cart
    service: cart-service
    methods: [GET, POST, PUT, DELETE]
    plugins: [rate-limit, jwt-auth]
  - path: /api/v1/orders
    service: order-service
    methods: [GET, POST]
    plugins: [rate-limit, jwt-auth]
  - path: /api/v1/checkout
    service: payment-service
    methods: [POST]
    plugins: [rate-limit, jwt-auth, idempotency]
Verify: curl -H "Authorization: Bearer <token>" https://api.example.com/api/v1/products → returns product list with 200 OK.
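To make path-based routing concrete, here is a small sketch of how a gateway might map a request path to a backend by longest matching prefix. The `ROUTES` table copies the conceptual config above; `resolve_service` is a hypothetical helper, not part of Kong or AWS API Gateway, and it assumes the path has already been stripped of its query string.

```python
from typing import Optional

# Prefix-to-service table, taken from the conceptual route config.
ROUTES = {
    "/api/v1/products": "catalog-service",
    "/api/v1/cart": "cart-service",
    "/api/v1/orders": "order-service",
    "/api/v1/checkout": "payment-service",
}


def resolve_service(path: str) -> Optional[str]:
    """Return the service for the longest route prefix matching `path`."""
    best = None
    for prefix, service in ROUTES.items():
        # Match the prefix exactly or at a path-segment boundary only,
        # so "/api/v1/productsX" does not match "/api/v1/products".
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, service)
    return best[1] if best else None
```

Matching on segment boundaries rather than raw string prefixes avoids accidentally routing unrelated paths to a service.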
The catalog service stores products in a primary database and syncs changes to Elasticsearch for full-text search. Use Change Data Capture (CDC) or event publishing to keep the search index in sync. [src6]
# catalog_service/events.py — Publish product changes to message broker
import json

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=["kafka:9092"],
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)


def publish_product_event(event_type: str, product: dict):
    producer.send("product-events", value={"event": event_type, "product": product})
    producer.flush()
Verify: Create a product via API → curl localhost:9200/products/_search?q=<name> → product appears within 2 seconds.
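On the consuming side, something has to turn each product event into a search-index operation. The sketch below shows one way to do that as a pure function, using the event shape published above. The event type names (`created`, `updated`, `deleted`), the `id` field on the product, and the shape of the returned action are all assumptions for illustration; the actual Elasticsearch call is left out.

```python
from typing import Optional


def to_index_action(event: dict) -> Optional[dict]:
    """Translate a product event into a search-index action, or None to skip it."""
    product = event.get("product") or {}
    product_id = product.get("id")  # assumed field name
    if product_id is None:
        return None  # cannot index a document without a stable ID
    event_type = event.get("event")
    if event_type in ("created", "updated"):
        # Upsert by product ID so that replayed events are harmless.
        return {"op": "index", "id": str(product_id), "doc": product}
    if event_type == "deleted":
        return {"op": "delete", "id": str(product_id)}
    return None  # unknown event types are ignored
```

Keying the index document on the product ID means a redelivered event overwrites the same document instead of creating a duplicate.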
Use Redis as the primary store for shopping carts with sub-millisecond reads and built-in TTL for cart expiry. Back up cart data to PostgreSQL for carts older than 30 minutes. [src2]
# cart_service/cart.py — Redis-backed cart
import redis

r = redis.Redis(host="redis", port=6379, db=0, decode_responses=True)
CART_TTL = 86400 * 7  # 7 days


def add_to_cart(user_id: str, product_id: str, quantity: int):
    cart_key = f"cart:{user_id}"
    r.hset(cart_key, product_id, quantity)
    r.expire(cart_key, CART_TTL)


def get_cart(user_id: str) -> dict:
    return {pid: int(qty) for pid, qty in r.hgetall(f"cart:{user_id}").items()}
Verify: add_to_cart("user123", "SKU-001", 2) then get_cart("user123") → {"SKU-001": 2}.
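Note that `hset` overwrites the stored quantity rather than adding to it. If incremental updates are wanted (for example, a "+1" button), a variant based on `HINCRBY` may be a better fit. The following is a sketch under that assumption; `change_quantity` is a hypothetical helper, and the Redis client is passed in as a parameter so the logic can be exercised without a live server.

```python
CART_TTL = 86400 * 7  # 7 days, matching the cart module above


def change_quantity(client, user_id: str, product_id: str, delta: int) -> int:
    """Adjust an item's quantity by `delta`; drop the item when it reaches zero."""
    cart_key = f"cart:{user_id}"
    # HINCRBY is atomic on the server, so concurrent updates are not lost.
    new_qty = client.hincrby(cart_key, product_id, delta)
    if new_qty <= 0:
        client.hdel(cart_key, product_id)
        new_qty = 0
    client.expire(cart_key, CART_TTL)  # refresh expiry on every change
    return new_qty
```

With a real client this would be called as `change_quantity(r, "user123", "SKU-001", 1)`.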
The checkout flow spans multiple services and cannot use a single database transaction. Use the Saga pattern: reserve inventory → process payment → create order. If payment fails, release the inventory reservation. [src3]
Checkout Saga Flow:
1. Cart Service → Validate cart items and prices
2. Inventory Svc → Reserve stock (soft lock with TTL)
3. Payment Service → Charge customer via gateway
├── SUCCESS → 4. Order Service → Create order record
│ 5. Inventory Svc → Confirm reservation
│ 6. Cart Service → Clear cart
│ 7. Notification → Send confirmation email
└── FAILURE → Compensate:
- Inventory Svc → Release reservation
- Notification → Send failure notice
Verify: Place test order → inventory decremented, payment captured, order record exists, cart cleared. Simulate payment failure → inventory restored.
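The flow above generalizes to any ordered list of steps, each paired with a compensating action. As a rough sketch of that idea (the `Step` tuple layout and `run_saga` are illustrative names, not taken from a specific saga library), completed steps are undone in reverse order when a later step fails:

```python
from typing import Callable, List, Tuple

# (name, action, compensation); all three are supplied by the caller.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> dict:
    """Run steps in order; on failure, undo completed steps in reverse order."""
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception as exc:
            # Compensate only the steps that actually finished.
            for _, _, compensate in reversed(completed):
                compensate()
            return {"status": "failed", "failed_step": name, "reason": str(exc)}
        completed.append(step)
    return {"status": "confirmed"}
```

This sketch assumes compensations themselves do not fail; a production orchestrator would typically persist saga state and retry compensations.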
Use Apache Kafka as the central event bus. Services publish domain events (OrderCreated, PaymentProcessed, InventoryReserved) and other services subscribe to react asynchronously. [src2]
# order_service/events.py — Consume payment events
import json

from kafka import KafkaConsumer

# create_order and release_inventory are assumed to be defined elsewhere
# in the order service.
consumer = KafkaConsumer(
    "payment-events",
    bootstrap_servers=["kafka:9092"],
    group_id="order-service",
    enable_auto_commit=False,  # commit manually, only after processing succeeds
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
)

for message in consumer:
    event = message.value
    if event["type"] == "PaymentSucceeded":
        create_order(event["order_id"], event["items"], event["total"])
        consumer.commit()
    elif event["type"] == "PaymentFailed":
        release_inventory(event["order_id"], event["items"])
        consumer.commit()
Verify: Publish a PaymentSucceeded event → order record appears in database within 5 seconds.
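Committing offsets only after processing gives at-least-once delivery, so a handler can see the same event twice (for example, after a crash between processing and commit). One common mitigation is to make handlers idempotent. The sketch below deduplicates on `(type, order_id)`; `make_idempotent` is a hypothetical helper, and the in-memory set is for illustration only. A real service would more likely rely on a durable store, such as a unique constraint in the orders table.

```python
from typing import Callable, Set, Tuple


def make_idempotent(handler: Callable[[dict], None]) -> Callable[[dict], bool]:
    """Wrap `handler` so each (type, order_id) pair is processed at most once."""
    seen: Set[Tuple[str, str]] = set()

    def handle(event: dict) -> bool:
        key = (event["type"], str(event["order_id"]))
        if key in seen:
            return False  # duplicate delivery: skip
        handler(event)
        seen.add(key)  # record only after the handler succeeds
        return True

    return handle
```

Recording the key after the handler returns means a failed attempt can still be retried on redelivery.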
Package each service as a Docker container and orchestrate with Kubernetes. Use Horizontal Pod Autoscalers (HPA) to scale based on CPU/memory or custom metrics. [src4]
# k8s/catalog-service.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: catalog-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: catalog-service
  template:
    metadata:
      labels:
        app: catalog-service  # must match spec.selector.matchLabels
    spec:
      containers:
        - name: catalog
          image: ecommerce/catalog-service:1.0.0
          ports:
            - containerPort: 8080
          resources:
            requests: { memory: "256Mi", cpu: "250m" }
            limits: { memory: "512Mi", cpu: "500m" }
Verify: kubectl get hpa → shows catalog-hpa (the HorizontalPodAutoscaler is a separate manifest, not included in the Deployment above). Under load: kubectl get pods -l app=catalog-service → pod count increases.
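To build intuition for how the pod count changes under load, the core scaling rule documented for the Horizontal Pod Autoscaler is `desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)`. The sketch below applies that rule and clamps the result; the min/max bounds used in the example are hypothetical, and the real controller additionally applies a tolerance band and stabilization windows that are omitted here.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int,
                     max_replicas: int) -> int:
    """Apply the basic HPA ratio rule, clamped to the configured bounds."""
    raw = math.ceil(current_replicas * (current_metric / target_metric))
    return max(min_replicas, min(max_replicas, raw))


# 3 pods averaging 90% CPU against a 50% target, bounds 3..10 (hypothetical)
print(desired_replicas(3, 90, 50, 3, 10))  # 6
```

When the observed metric is below target, the same rule scales down, but never below `min_replicas`.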
# order_service/saga.py — Checkout saga orchestrator
# Input: Cart contents (user_id, items), payment method token
# Output: Order confirmation or compensated failure
import uuid

import httpx

INVENTORY_URL = "http://inventory-service:8080"
PAYMENT_URL = "http://payment-service:8080"


async def checkout_saga(user_id: str, items: list, payment_token: str):
    saga_id = str(uuid.uuid4())
    # Reuse one client so connections are pooled and closed on exit
    async with httpx.AsyncClient() as client:
        # Step 1: Reserve inventory
        res = await client.post(
            f"{INVENTORY_URL}/reserve", json={"saga_id": saga_id, "items": items})
        if res.status_code != 200:
            return {"status": "failed", "reason": "inventory_unavailable"}

        # Step 2: Process payment
        res = await client.post(f"{PAYMENT_URL}/charge", json={
            "saga_id": saga_id, "token": payment_token,
            "amount": sum(i["price"] * i["qty"] for i in items)})
        if res.status_code != 200:
            # Compensate: release the reservation made in step 1
            await client.post(
                f"{INVENTORY_URL}/release", json={"saga_id": saga_id})
            return {"status": "failed", "reason": "payment_declined"}

        # Assumes the payment response carries the resulting order_id
        return {"status": "confirmed", "order_id": res.json()["order_id"]}
// inventory_service/reserve.js — Atomic inventory reservation
// Input: saga_id, items [{sku, qty}]
// Output: reservation confirmation or rejection
const { Pool } = require("pg"); // node-postgres
const pool = new Pool({ connectionString: process.env.DATABASE_URL });

async function reserveInventory(sagaId, items) {
  const client = await pool.connect();
  try {
    await client.query("BEGIN");
    for (const item of items) {
      const result = await client.query(
        `UPDATE inventory SET reserved = reserved + $1, updated_at = NOW()
         WHERE sku = $2 AND (stock - reserved) >= $1 RETURNING sku`,
        [item.qty, item.sku]);
      if (result.rowCount === 0) {
        await client.query("ROLLBACK");
        return { success: false, reason: `Insufficient stock: ${item.sku}` };
      }
    }
    await client.query(
      `INSERT INTO reservations (saga_id, items, status) VALUES ($1, $2, 'reserved')`,
      [sagaId, JSON.stringify(items)]);
    await client.query("COMMIT");
    return { success: true, saga_id: sagaId };
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  } finally {
    client.release();
  }
}
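The key idea in the reservation code is the conditional `UPDATE`: the stock check and the increment happen in one statement, so an affected-row count of zero means the stock was insufficient. The sketch below reproduces that pattern with Python's built-in `sqlite3` so it can be run locally. The table layout mirrors the columns used above, but SQLite stands in for PostgreSQL here and `reserve` is a hypothetical helper.

```python
import sqlite3


def reserve(conn: sqlite3.Connection, items: list) -> bool:
    """Reserve every item or none; returns False if any SKU lacks available stock."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for item in items:
                cur = conn.execute(
                    "UPDATE inventory SET reserved = reserved + ? "
                    "WHERE sku = ? AND (stock - reserved) >= ?",
                    (item["qty"], item["sku"], item["qty"]),
                )
                if cur.rowcount == 0:
                    # No row matched: unknown SKU or not enough free stock.
                    raise LookupError(item["sku"])
        return True
    except LookupError:
        return False
```

Raising inside the `with conn:` block triggers a rollback, so a partially applied multi-item reservation never becomes visible.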
// BAD — All services read/write the same database
// Creates coupling: schema changes break all services,
// impossible to scale services independently,
// single point of failure
// GOOD — Each service owns its data, syncs via events
// Catalog (MongoDB), Order (PostgreSQL), Inventory (PostgreSQL)
// Connected via Kafka Event Bus
// Independent scaling, deployment, and schema evolution
# BAD — Blocking HTTP calls chain during checkout
def checkout(cart):
    inventory = requests.post("/inventory/reserve", json=cart)  # blocks
    payment = requests.post("/payment/charge", json=cart)  # blocks
    order = requests.post("/orders/create", json=cart)  # blocks
    # If payment service is slow, entire checkout hangs.
    # If order service fails after payment, no compensation.
    return order
# GOOD — Saga orchestrator with compensation on failure
async def checkout_saga(cart, payment_token):
    saga_id = uuid.uuid4()
    try:
        await reserve_inventory(saga_id, cart.items)
        payment = await process_payment(saga_id, payment_token, cart.total)
        order = await create_order(saga_id, cart, payment.id)
        return {"status": "confirmed", "order_id": order.id}
    except PaymentFailedError:
        await release_inventory(saga_id)  # compensating transaction
        return {"status": "failed", "reason": "payment_declined"}
// BAD — Cart only in browser localStorage
localStorage.setItem("cart", JSON.stringify(cartItems));
// Lost on device switch, browser clear, or incognito.
// No server validation of prices. No abandoned cart analytics.
// GOOD — Server-side cart (Redis) with client-side sync
async function addToCart(productId, qty) {
  const res = await fetch("/api/cart", {
    method: "POST",
    body: JSON.stringify({ product_id: productId, qty }),
    headers: {
      "Authorization": `Bearer ${token}`,
      "Content-Type": "application/json", // body is JSON, not plain text
    },
  });
  const cart = await res.json();
  sessionStorage.setItem("cart_cache", JSON.stringify(cart));
  return cart;
}
UPDATE ... WHERE stock >= qty with row-level locking in PostgreSQL. [src2]

# Check service health across all microservices
for svc in catalog cart order payment inventory search; do
  curl -s "http://${svc}-service:8080/health" | jq '.status'
done
# Monitor Kafka consumer lag (detect processing bottlenecks)
kafka-consumer-groups.sh --bootstrap-server kafka:9092 \
  --describe --group order-service
# Check PostgreSQL active connections and locks
psql -c "SELECT pid, state, query FROM pg_stat_activity WHERE state != 'idle';"
# Redis memory and total key count (equals the cart count only if this DB holds nothing but carts)
redis-cli INFO memory | grep used_memory_human
redis-cli DBSIZE
# Elasticsearch cluster health
curl -s localhost:9200/_cluster/health | jq '.status,.active_shards'
# Kubernetes pod status
kubectl get pods -n ecommerce -o wide
kubectl get events -n ecommerce --sort-by='.lastTimestamp' | tail -20
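For the consumer-lag check above, lag per partition is the log-end offset minus the group's committed offset. A minimal sketch of that arithmetic follows; the function name and the dict-based inputs are assumptions for illustration, and in practice the offsets would come from the `kafka-consumer-groups.sh` output or a Kafka admin client.

```python
def partition_lag(end_offsets: dict, committed: dict) -> dict:
    """Lag per partition: messages written but not yet committed by the group."""
    return {
        partition: max(end - committed.get(partition, 0), 0)
        for partition, end in end_offsets.items()
    }


lag = partition_lag({0: 120, 1: 80}, {0: 100, 1: 80})
print(lag)                # {0: 20, 1: 0}
print(sum(lag.values()))  # 20
```

A lag that keeps growing over time suggests the consumer group is processing more slowly than producers are writing.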
| Use When | Don't Use When | Use Instead |
|---|---|---|
| Building a custom e-commerce platform with >1K daily orders | Selling <100 products with simple needs | Shopify, WooCommerce, or BigCommerce |
| Team has 5+ backend engineers and DevOps capability | Solo developer or small team without Kubernetes experience | Modular monolith or managed platform |
| Need independent scaling of catalog, search, and checkout | All components have similar load patterns | Modular monolith with domain modules |
| Regulatory requirements demand service isolation (PCI scope reduction) | No compliance requirements and simple payment flow | Monolith with Stripe Checkout |
| Multi-region deployment required for <100ms latency globally | Single-region audience with acceptable latency | Single-region deployment with CDN |
| Flash sales or highly variable traffic patterns | Steady, predictable traffic with no spikes | Fixed-size deployment with load balancer |