Serverless Application Architecture
How do I design a serverless application architecture?
TL;DR
- Bottom line: Design serverless apps as small, stateless, event-driven functions behind an API gateway, with managed services for state (databases, queues, object storage) and orchestration (Step Functions, Durable Functions) for complex workflows.
- Key tool/command: serverless deploy (Serverless Framework) or provider-native CLIs (aws lambda, wrangler deploy, gcloud functions deploy)
- Watch out for: Monolithic Lambda functions that bundle all routes into one handler -- split by domain boundary instead.
- Works with: AWS Lambda (Node.js, Python, Java, Go, .NET, Ruby), Cloudflare Workers (JS/TS/Wasm), Google Cloud Functions (Node.js, Python, Go, Java, .NET, PHP, Ruby), Azure Functions (C#, JS, Python, Java, PowerShell, TypeScript).
Constraints
- Cold starts add 100ms-10s of latency to the first invocation, depending on runtime and memory; use provisioned concurrency, SnapStart, or edge runtimes for latency-sensitive paths
- Execution time limits vary by provider: Lambda 15min, Cloud Functions 60min (2nd gen), Azure Functions 5min default and 10min max on the Consumption plan (30min default, unbounded on Premium), Workers 30s (standard) or 15min (Cron Triggers)
- Payload size limits: Lambda 6MB sync / 256KB async, Workers 100MB, Cloud Functions 10MB HTTP
- Stateless by design: never store session state in function memory between invocations; use external stores (DynamoDB, KV, Redis)
- VPC-attached functions may incur additional cold start overhead; use VPC only when private resource access is required
- As of August 2025, AWS bills for the Lambda INIT phase -- heavy initialization code now directly increases cost
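The cold-start and INIT-billing constraints above both reward deferring expensive setup until a code path actually needs it. The following is a minimal sketch of lazy, memoized initialization in plain Node.js. The `lazy` helper and the "heavy client" are hypothetical stand-ins, not part of any provider SDK.

```javascript
// Lazy, memoized initialization: the factory runs once, on first use,
// and its result is reused by later warm invocations of the same instance.
function lazy(factory) {
  let value;
  let initialized = false;
  return () => {
    if (!initialized) {
      value = factory();
      initialized = true;
    }
    return value;
  };
}

// Hypothetical stand-in for an expensive dependency (for example a large SDK client).
let initCount = 0;
const getHeavyClient = lazy(() => {
  initCount += 1;
  return { render: (text) => text.toUpperCase() };
});

// Only the code path that needs the dependency pays for loading it.
const handler = async (event) => {
  if (event.action !== "render") {
    return { statusCode: 204 };
  }
  const client = getHeavyClient();
  return { statusCode: 200, body: client.render(event.text) };
};
```

This holds a reusable client, not request state, so it stays compatible with the stateless constraint. Whether deferring a given dependency pays off depends on how often the code path runs, so it is worth measuring before and after.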
Quick Reference
| Component | Role | Technology Options | Scaling Strategy |
|---|---|---|---|
| API Gateway | Route HTTP requests to functions | AWS API Gateway, Cloudflare Workers routing, Azure API Management, Google Cloud Endpoints | Auto-scales per request; throttle via rate limits |
| Compute (FaaS) | Execute business logic | AWS Lambda, Cloudflare Workers, Google Cloud Functions, Azure Functions, Vercel Functions | Auto-scales to concurrency limit (Lambda: 1000 default, requestable to 10K+) |
| Event Bus | Decouple producers from consumers | Amazon EventBridge, Google Pub/Sub, Azure Event Grid, Cloudflare Queues | Partition-based; scales with event throughput |
| Message Queue | Buffer async workloads | Amazon SQS, Google Cloud Tasks, Azure Service Bus, Cloudflare Queues | Scales with queue depth; configure batch size per consumer |
| Orchestration | Coordinate multi-step workflows | AWS Step Functions, Azure Durable Functions, Google Workflows, Temporal (self-hosted) | Per-execution pricing; use for saga patterns and retries |
| Object Storage | Store files, static assets | Amazon S3, Google Cloud Storage, Azure Blob, Cloudflare R2 | Unlimited; event triggers on upload (S3 notifications, GCS Pub/Sub) |
| Database | Persistent structured data | DynamoDB, Firestore, Azure Cosmos DB, PlanetScale, Supabase, Cloudflare D1 | Auto-scales on-demand (DynamoDB), connection pooling critical for SQL |
| Cache / KV | Low-latency key-value lookups | ElastiCache, Cloudflare KV, Upstash Redis, Momento, Azure Cache for Redis | Edge-distributed (KV, Momento) or regional (ElastiCache) |
| CDN / Edge | Serve static content, edge compute | CloudFront, Cloudflare CDN, Azure CDN, Cloud CDN | Global PoPs; cache invalidation via TTL or purge API |
| Auth | Identity and access control | Amazon Cognito, Auth0, Firebase Auth, Azure AD B2C, Clerk | Token-based (JWT); validate at gateway or function level |
| Observability | Logs, metrics, traces | CloudWatch, Datadog, Cloudflare Logpush, Google Cloud Logging, OpenTelemetry | Structured logging; distributed tracing with X-Ray or Jaeger |
| CI/CD | Deployment pipeline | GitHub Actions, AWS SAM, Serverless Framework, SST, Pulumi, Terraform | Infrastructure-as-code; blue/green or canary deployments |
Decision Tree
START
├── Need <50ms global latency (edge compute)?
│ ├── YES → Cloudflare Workers or Lambda@Edge / CloudFront Functions
│ └── NO ↓
├── Workload is event-driven (S3 upload, DB change, queue message)?
│ ├── YES → AWS Lambda + EventBridge, or GCP Cloud Functions + Pub/Sub
│ └── NO ↓
├── Need HTTP API with variable traffic (0 to 10K+ RPS)?
│ ├── YES → API Gateway + Lambda, or Cloudflare Workers (simpler routing)
│ └── NO ↓
├── Multi-step workflow with retries and compensation?
│ ├── YES → AWS Step Functions or Azure Durable Functions
│ └── NO ↓
├── Long-running job (>15 minutes)?
│ ├── YES → Use containers (Cloud Run, ECS Fargate, AKS) instead
│ └── NO ↓
├── Execution requires GPU?
│ ├── YES → Use dedicated instances or managed ML services
│ └── NO ↓
├── Need full OS/runtime control?
│ ├── YES → Use containers
│ └── NO ↓
└── DEFAULT → Standard serverless (Lambda, Cloud Functions, or Azure Functions)
    ├── <1K concurrent → Single-region, default concurrency
    ├── 1K-100K concurrent → Multi-region, provisioned concurrency
    └── >100K concurrent → Edge compute + regional fallback
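The tree above can be written as an ordered list of checks where the first match wins. This sketch is one possible encoding. The field names on `workload` and the returned labels are hypothetical, chosen only for illustration.

```javascript
// Ordered checks mirroring the decision tree; the first matching branch wins.
// All property names on `workload` are hypothetical.
function recommendPlatform(workload) {
  if (workload.needsEdgeLatency) return "edge";               // <50ms global latency
  if (workload.eventDriven) return "event-driven-faas";       // Lambda + EventBridge, Cloud Functions + Pub/Sub
  if (workload.httpApiVariableTraffic) return "http-faas";    // API Gateway + Lambda, or Workers
  if (workload.multiStepWorkflow) return "orchestration";     // Step Functions, Durable Functions
  if (workload.maxRuntimeMinutes > 15) return "containers";   // long-running jobs
  if (workload.needsGpu) return "dedicated-or-managed-ml";
  if (workload.needsOsControl) return "containers";
  return "standard-faas";
}

// Scaling guidance from the DEFAULT branch, keyed on peak concurrency.
function scalingTier(peakConcurrency) {
  if (peakConcurrency < 1000) return "single-region";
  if (peakConcurrency <= 100000) return "multi-region-provisioned";
  return "edge-with-regional-fallback";
}
```

Because the checks are ordered, an event-driven job that also runs longer than 15 minutes resolves to the event-driven branch, exactly as in the tree. Reorder the checks if your priorities differ.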
Step-by-Step Guide
1. Define function boundaries by domain
Split functions along business domain boundaries, not by HTTP method. Each function should own one bounded context (e.g., "orders", "payments", "notifications"). Avoid the monolithic Lambda anti-pattern where one function handles all routes. [src1]
project/
├── functions/
│ ├── orders/
│ │ ├── create.js
│ │ ├── get.js
│ │ └── list.js
│ ├── payments/
│ │ ├── process.js
│ │ └── webhook.js
│ └── notifications/
│ ├── send-email.js
│ └── send-push.js
├── shared/
│ ├── db.js
│ └── auth.js
└── serverless.yml
Verify: Each function file imports only the dependencies it needs -> deployment package size < 5MB per function.
2. Configure the API gateway and routing
Map HTTP routes to individual functions. Use path-based routing with the API gateway or edge router. [src2]
# serverless.yml (Serverless Framework)
service: my-serverless-app
provider:
  name: aws
  runtime: nodejs20.x
  memorySize: 512
  timeout: 29
functions:
  createOrder:
    handler: functions/orders/create.handler
    events:
      - httpApi:
          path: /orders
          method: POST
  getOrder:
    handler: functions/orders/get.handler
    events:
      - httpApi:
          path: /orders/{id}
          method: GET
Verify: curl -X POST https://your-api.execute-api.region.amazonaws.com/orders -> returns 201 Created
3. Implement stateless function handlers
Write each function as a pure request-response handler. Move initialization code (DB connections, SDK clients) outside the handler for reuse across warm invocations. [src3]
// functions/orders/create.js
const { randomUUID } = require("node:crypto");
const { DynamoDBClient } = require("@aws-sdk/client-dynamodb");
const { PutCommand, DynamoDBDocumentClient } = require("@aws-sdk/lib-dynamodb");

// Created once per execution environment and reused across warm invocations
const client = new DynamoDBClient({});
const docClient = DynamoDBDocumentClient.from(client);

exports.handler = async (event) => {
  const body = JSON.parse(event.body);
  const orderId = randomUUID();
  await docClient.send(new PutCommand({
    TableName: process.env.ORDERS_TABLE,
    // Spread body first so a client-supplied "id" cannot override the generated one
    Item: { ...body, id: orderId, createdAt: new Date().toISOString() }
  }));
  return { statusCode: 201, body: JSON.stringify({ id: orderId }) };
};
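Verify: aws lambda invoke --function-name createOrder --cli-binary-format raw-in-base64-out --payload '{"body":"{\"item\":\"test\"}"}' out.json -> out.json contains {"statusCode": 201, ...} (AWS CLI v2 needs --cli-binary-format raw-in-base64-out to accept a raw JSON payload)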
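A stateless handler is easier to verify when its storage dependency is passed in rather than imported. This sketch shows that idea with an in-memory stand-in for the table. `makeCreateOrderHandler`, `makeInMemoryStore`, and the `store.put`/`store.get` interface are hypothetical names, and the 400 responses are an assumption about how you might want malformed input handled.

```javascript
const { randomUUID } = require("node:crypto");

// Builds a create-order handler around an injected store, so the same logic
// can run against a real table in production and a fake one in tests.
function makeCreateOrderHandler(store) {
  return async (event) => {
    let body;
    try {
      body = JSON.parse(event.body ?? "");
    } catch {
      return { statusCode: 400, body: JSON.stringify({ error: "Invalid JSON body" }) };
    }
    if (body === null || typeof body !== "object" || Array.isArray(body)) {
      return { statusCode: 400, body: JSON.stringify({ error: "Body must be a JSON object" }) };
    }
    // Generated fields come last so the request cannot override them.
    const order = { ...body, id: randomUUID(), createdAt: new Date().toISOString() };
    await store.put(order);
    return { statusCode: 201, body: JSON.stringify({ id: order.id }) };
  };
}

// In-memory stand-in for local tests only; it is not shared across instances.
function makeInMemoryStore() {
  const items = new Map();
  return {
    put: async (item) => { items.set(item.id, item); },
    get: async (id) => items.get(id),
  };
}
```

In a deployed function the injected store would wrap the DynamoDB document client created outside the handler. The in-memory version exists only to exercise the handler logic without AWS credentials.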
4. Add async event processing
Decouple synchronous request paths from heavy processing. Use queues or event buses to trigger background functions. [src4]
// Publish event after DynamoDB write
// The require and client creation belong at module scope (outside the handler);
// the send call runs inside the async handler, where orderId and body are in scope.
const { EventBridgeClient, PutEventsCommand } = require("@aws-sdk/client-eventbridge");
const eb = new EventBridgeClient({});

await eb.send(new PutEventsCommand({
  Entries: [{
    Source: "orders.service",
    DetailType: "OrderCreated",
    Detail: JSON.stringify({ orderId, ...body }),
    EventBusName: process.env.EVENT_BUS
  }]
}));
Verify: aws events describe-rule --name OrderCreatedRule -> rule exists and its State is ENABLED (add --event-bus-name when the rule lives on a custom event bus)
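On the consuming side, a background function receives the EventBridge envelope, with the published detail available as `event.detail` and the type as `event["detail-type"]`. Because delivery is at-least-once, a consumer should tolerate seeing the same event twice. This sketch shows one way to do that. The `dedupe` interface and `sendConfirmation` callback are hypothetical, and the in-memory dedupe store is a local stand-in for an external store such as a table with conditional writes.

```javascript
// Consumer for "OrderCreated" events with a simple idempotency check.
function makeOrderCreatedConsumer({ dedupe, sendConfirmation }) {
  return async (event) => {
    if (event["detail-type"] !== "OrderCreated") {
      return { handled: false, reason: "ignored-type" };
    }
    // The envelope id identifies the delivery; skip ids already processed.
    if (await dedupe.seen(event.id)) {
      return { handled: false, reason: "duplicate" };
    }
    await sendConfirmation(event.detail);
    await dedupe.markSeen(event.id);
    return { handled: true };
  };
}

// Local stand-in only: real deployments need a store shared by all instances.
function makeInMemoryDedupe() {
  const ids = new Set();
  return {
    seen: async (id) => ids.has(id),
    markSeen: async (id) => { ids.add(id); },
  };
}
```

Marking the event as seen only after the side effect succeeds means a crash in between can still cause a repeat, so the side effect itself should also be safe to run twice where possible.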
5. Set up orchestration for multi-step workflows
Use Step Functions or Durable Functions for workflows that span multiple services and need retry logic, compensation, or parallel branches. [src3]
{
"Comment": "Order processing workflow",
"StartAt": "ValidateOrder",
"States": {
"ValidateOrder": {
"Type": "Task",
"Resource": "arn:aws:lambda:REGION:ACCOUNT:function:validateOrder",
"Next": "ProcessPayment",
"Retry": [{ "ErrorEquals": ["States.TaskFailed"], "MaxAttempts": 2 }]
},
"ProcessPayment": {
"Type": "Task",
"Resource": "arn:aws:lambda:REGION:ACCOUNT:function:processPayment",
"Next": "SendConfirmation",
"Catch": [{ "ErrorEquals": ["PaymentFailed"], "Next": "CancelOrder" }]
},
"SendConfirmation": { "Type": "Task", "Resource": "...", "End": true },
"CancelOrder": { "Type": "Task", "Resource": "...", "End": true }
}
}
Verify: aws stepfunctions start-execution --state-machine-arn arn:aws:states:REGION:ACCOUNT:stateMachine:OrderProcessing -> execution succeeds
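To reason about the control flow of the state machine above before deploying it, the same retry and catch behavior can be simulated locally. This is a simplified sketch, not the Step Functions runtime. The `tasks` object is hypothetical, retries fire on any error and without backoff delays, and the cancel step receives the order rather than the error output that Step Functions passes by default.

```javascript
// Runs a task once, then retries up to `maxRetries` more times on failure,
// mirroring "MaxAttempts": 2 (one initial attempt plus two retries).
async function withRetry(task, input, maxRetries) {
  let lastError;
  for (let attempt = 0; attempt <= maxRetries; attempt += 1) {
    try {
      return await task(input);
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

// ValidateOrder -> ProcessPayment -> SendConfirmation, with CancelOrder as
// the compensation path when payment fails with the "PaymentFailed" error name.
async function runOrderWorkflow(order, tasks) {
  const validated = await withRetry(tasks.validateOrder, order, 2);
  let paid;
  try {
    paid = await tasks.processPayment(validated);
  } catch (err) {
    if (err.name !== "PaymentFailed") throw err; // only this error is caught
    return tasks.cancelOrder(validated);
  }
  return tasks.sendConfirmation(paid);
}
```

The value of the managed orchestrator over a function like this is durability: state survives crashes and timeouts between steps, which an in-process loop cannot offer.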
6. Configure observability and structured logging
Add structured JSON logging, distributed tracing, and custom metrics. Use environment variables for log level control. [src1]
// shared/logger.js
const log = (level, message, data = {}) => {
console.log(JSON.stringify({
level, message,
timestamp: new Date().toISOString(),
requestId: data.requestId || "unknown",
...data
}));
};
module.exports = {
info: (msg, data) => log("INFO", msg, data),
error: (msg, data) => log("ERROR", msg, data),
warn: (msg, data) => log("WARN", msg, data),
};
Verify: Check CloudWatch Logs -> log entries appear as structured JSON with requestId field
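The step mentions log level control through environment variables and a requestId field. A sketch of both follows: a level threshold read from a `LOG_LEVEL` variable (a hypothetical name) and a wrapper that tags each log line with `context.awsRequestId`, the request ID that Lambda exposes on the handler's context object. The `withRequestLogging` helper is illustrative, not a library API.

```javascript
const LEVELS = { DEBUG: 10, INFO: 20, WARN: 30, ERROR: 40 };

// Structured logger that drops entries below the configured threshold.
function log(level, message, data = {}) {
  const threshold = LEVELS[process.env.LOG_LEVEL] ?? LEVELS.INFO;
  if (LEVELS[level] < threshold) return;
  console.log(JSON.stringify({
    level, message,
    timestamp: new Date().toISOString(),
    ...data
  }));
}

// Wraps a handler so every invocation logs its outcome with the request ID.
function withRequestLogging(handler) {
  return async (event, context) => {
    const requestId = context?.awsRequestId ?? "unknown";
    const startedAt = Date.now();
    try {
      const result = await handler(event, context);
      log("INFO", "request completed", {
        requestId, statusCode: result.statusCode, durationMs: Date.now() - startedAt
      });
      return result;
    } catch (err) {
      log("ERROR", "request failed", {
        requestId, error: err.message, durationMs: Date.now() - startedAt
      });
      throw err; // let the platform apply its own retry and error handling
    }
  };
}
```

Usage would look like `exports.handler = withRequestLogging(async (event) => { ... })`, which keeps logging concerns out of the business logic.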
Code Examples
JavaScript: AWS Lambda HTTP Handler
// Input: API Gateway HTTP API event (payload format 2.0)
// Output: JSON response with status code
// processRequest is a placeholder for your own business logic.
exports.handler = async (event) => {
  try {
    // Payload format 2.0 exposes the method under requestContext.http, not event.httpMethod
    const method = event.requestContext?.http?.method;
    const id = event.pathParameters?.id;
    const result = await processRequest(method, id, event.body);
    return { statusCode: 200, body: JSON.stringify(result) };
  } catch (err) {
    console.error(JSON.stringify({ error: err.message, stack: err.stack }));
    return { statusCode: err.statusCode || 500, body: JSON.stringify({ error: err.message }) };
  }
};
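When testing a handler like this locally, it helps to know which fields an HTTP API (payload format 2.0) event carries. The sketch below normalizes the handful of fields a typical handler needs. `parseHttpApiEvent` is a hypothetical helper, and `sampleEvent` is a hand-written minimal event, not a captured one.

```javascript
// Extracts method, path, id, and a parsed JSON body from a payload format 2.0 event.
function parseHttpApiEvent(event) {
  const method = event.requestContext?.http?.method ?? "UNKNOWN";
  const rawBody = event.body ?? "";
  // API Gateway sets isBase64Encoded when it had to base64-encode the body.
  const text = event.isBase64Encoded
    ? Buffer.from(rawBody, "base64").toString("utf8")
    : rawBody;
  return {
    method,
    path: event.rawPath ?? "/",
    id: event.pathParameters?.id ?? null,
    body: text === "" ? null : JSON.parse(text),
  };
}

// Minimal hand-written sample; real events carry many more fields.
const sampleEvent = {
  version: "2.0",
  rawPath: "/orders/42",
  pathParameters: { id: "42" },
  requestContext: { http: { method: "GET", path: "/orders/42" } },
  isBase64Encoded: false,
};
```

Calling `parseHttpApiEvent(sampleEvent)` yields the method, path, and id without touching AWS. Note that `JSON.parse` still throws on a malformed body, so the caller's try/catch remains necessary.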
JavaScript: Cloudflare Worker with Router
// Input: HTTP Request at edge (330+ global locations)
// Output: JSON response with near-zero cold start
export default {
async fetch(request, env) {
const url = new URL(request.url);
const path = url.pathname;
if (path === "/api/orders" && request.method === "POST") {
const body = await request.json();
const id = crypto.randomUUID();
await env.ORDERS_KV.put(id, JSON.stringify({ id, ...body }));
return Response.json({ id }, { status: 201 });
}
if (path.startsWith("/api/orders/") && request.method === "GET") {
const id = path.split("/").pop();
const data = await env.ORDERS_KV.get(id, "json");
if (!data) return Response.json({ error: "Not found" }, { status: 404 });
return Response.json(data);
}
return Response.json({ error: "Not found" }, { status: 404 });
}
};
Python: Google Cloud Function
# Input: HTTP request via Cloud Functions 2nd gen
# Output: JSON response
import functions_framework
import json
from google.cloud import firestore

db = firestore.Client()  # Initialized outside handler for reuse

@functions_framework.http
def handle_order(request):
    if request.method == "POST":
        data = request.get_json(silent=True) or {}
        doc_ref = db.collection("orders").document()
        doc_ref.set({**data, "created_at": firestore.SERVER_TIMESTAMP})
        return json.dumps({"id": doc_ref.id}), 201
    return json.dumps({"error": "Method not allowed"}), 405
Anti-Patterns
Wrong: Monolithic Lambda with all routes in one function
// BAD -- single function handles everything, bloated package, broad IAM permissions
exports.handler = async (event) => {
  if (event.path === "/orders" && event.httpMethod === "POST") { /* ... */ }
  if (event.path === "/orders" && event.httpMethod === "GET") { /* ... */ }
  if (event.path === "/payments" && event.httpMethod === "POST") { /* ... */ }
  if (event.path === "/users" && event.httpMethod === "GET") { /* ... */ }
  // 50 more routes...
};
Correct: One function per domain action
// GOOD -- each function has minimal dependencies and least-privilege IAM
// functions/orders/create.js -- only needs dynamodb:PutItem on orders table
exports.handler = async (event) => {
const body = JSON.parse(event.body);
await docClient.send(new PutCommand({ TableName: "orders", Item: body }));
return { statusCode: 201, body: JSON.stringify({ id: body.id }) };
};
Wrong: Storing state in function memory
// BAD -- state lost on cold start or new instance
let requestCount = 0;
let cachedUser = null;
exports.handler = async (event) => {
requestCount++; // Unreliable -- resets on cold start
if (!cachedUser) cachedUser = await fetchUser(event.userId);
return { statusCode: 200, body: JSON.stringify({ count: requestCount }) };
};
Correct: External state management
// GOOD -- state in DynamoDB, cache in ElastiCache/KV
exports.handler = async (event) => {
const result = await docClient.send(new UpdateCommand({
TableName: "counters", Key: { id: "requests" },
UpdateExpression: "ADD #c :inc",
ExpressionAttributeNames: { "#c": "count" },
ExpressionAttributeValues: { ":inc": 1 },
ReturnValues: "UPDATED_NEW"
}));
return { statusCode: 200, body: JSON.stringify({ count: result.Attributes.count }) };
};
Wrong: Synchronous chain of function calls
// BAD -- function A directly invokes function B, which invokes C
// Creates tight coupling, cascading failures, and double billing
exports.handler = async (event) => {
const orderResult = await lambda.invoke({ FunctionName: "processOrder", Payload: event.body }).promise();
const paymentResult = await lambda.invoke({ FunctionName: "processPayment", Payload: orderResult.Payload }).promise();
const emailResult = await lambda.invoke({ FunctionName: "sendEmail", Payload: paymentResult.Payload }).promise();
return emailResult;
};
Correct: Event-driven decoupling or orchestration
// GOOD -- publish event, let downstream consumers react independently
exports.handler = async (event) => {
const order = await createOrder(event.body);
await eventBridge.send(new PutEventsCommand({
Entries: [{ Source: "orders", DetailType: "OrderCreated", Detail: JSON.stringify(order) }]
}));
return { statusCode: 202, body: JSON.stringify({ id: order.id, status: "processing" }) };
};
// Payment and email functions subscribe to "OrderCreated" events independently
Common Pitfalls
- Cold start latency in Java/C# runtimes: JVM and .NET CLR initialization adds 1-10s on cold start. Fix: Use Lambda SnapStart for Java, Native AOT for .NET, or switch to Node.js/Python for latency-sensitive paths. [src1]
- Uncontrolled concurrency causing downstream overload: Lambda can scale to thousands of concurrent instances, overwhelming databases with connection limits. Fix: Use ReservedConcurrency or connection pooling (RDS Proxy, PgBouncer). [src3]
- Over-provisioning memory: Lambda allocates CPU proportionally to memory. 10GB functions cost 20x more than 512MB per ms. Fix: Use AWS Lambda Power Tuning to find the cost-optimal memory setting. [src1]
- Vendor lock-in through deep SDK coupling: Building directly against provider-specific APIs makes migration expensive. Fix: Use a repository pattern or ports-and-adapters architecture to abstract storage. [src7]
- Missing timeout and retry configuration: Default Lambda timeout is 3 seconds; default retry is 2 for async invocations. Fix: Set explicit timeout (slightly less than API Gateway's 29s max) and configure DLQ for failed async events. [src4]
- Ignoring INIT phase costs (post-August 2025): AWS now bills for Lambda initialization time. Heavy imports directly increase cost. Fix: Lazy-load heavy dependencies, use Lambda Layers, minimize top-level imports. [src1]
- Not structuring logs for observability: console.log("error happened") is useless at scale. Fix: Use structured JSON logging with request ID, function name, and correlation IDs for distributed tracing. [src3]
- Treating serverless as cheaper VMs: Serverless requires fundamentally different design -- event-driven, stateless, fine-grained. Lifting and shifting a monolith creates worse outcomes. Fix: Redesign around events and bounded contexts before migrating. [src4]
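The memory pitfall above comes down to arithmetic: the duration charge scales with memory times time, so a larger setting is only cheaper when it shortens the run enough. The sketch below works through that calculation. The price is passed in as a parameter because it varies by region and architecture, the measurements in the usage note are made-up numbers, and the per-request charge is deliberately ignored.

```javascript
// Duration charge for one invocation, in whatever currency the price uses.
// pricePerGbSecond must be looked up for your region and architecture.
function invocationCost(memoryMb, durationMs, pricePerGbSecond) {
  return (memoryMb / 1024) * (durationMs / 1000) * pricePerGbSecond;
}

// Picks the cheapest of several measured settings.
// Each measurement is { memoryMb, durationMs } from your own benchmark runs.
function cheapestSetting(measurements, pricePerGbSecond) {
  return measurements.reduce((best, m) =>
    invocationCost(m.memoryMb, m.durationMs, pricePerGbSecond) <
    invocationCost(best.memoryMb, best.durationMs, pricePerGbSecond) ? m : best
  );
}
```

For example, with hypothetical measurements of 800 ms at 512 MB, 350 ms at 1024 MB, and 300 ms at 2048 MB, `cheapestSetting` picks 1024 MB: doubling memory more than halved the duration, while doubling again barely helped. Tools such as AWS Lambda Power Tuning automate collecting these measurements.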
Diagnostic Commands
# Check Lambda function configuration
aws lambda get-function-configuration --function-name myFunction
# View recent invocation logs (last 5 minutes; GNU date syntax, on macOS use: date -v-5M +%s000)
aws logs filter-log-events --log-group-name /aws/lambda/myFunction --start-time $(date -d '5 minutes ago' +%s000)
# Check the account-level concurrency limit (the quota, not current usage)
aws lambda get-account-settings | jq '.AccountLimit.ConcurrentExecutions'
# Test Cloudflare Worker locally
npx wrangler dev src/index.js
# Deploy Cloudflare Worker
npx wrangler deploy
# Check Google Cloud Function logs
gcloud functions logs read myFunction --limit 50
# Validate Serverless Framework configuration
npx serverless print
Version History & Compatibility
| Platform | Current Generation | Previous | Key Changes |
|---|---|---|---|
| AWS Lambda | Runtime API v2 (2024+) | v1 | SnapStart GA, INIT billing (Aug 2025), 10GB memory, 6 vCPUs |
| Cloudflare Workers | V8 Isolates (2024+) | Service Workers API | Durable Objects, Queues, D1, Python support (beta) |
| Google Cloud Functions | 2nd gen (CloudRun-based, 2023+) | 1st gen | 60min timeout, Eventarc triggers, concurrency per instance |
| Azure Functions | v4 (2023+) | v3 (extended support ended 2022-12-13) | Flex Consumption plan, .NET 8 isolated model |
| Vercel Functions | Edge Runtime (2023+) | Serverless Functions | Edge middleware, streaming responses, ISR |
When to Use / When Not to Use
| Use When | Don't Use When | Use Instead |
|---|---|---|
| Traffic is spiky or unpredictable (0 to 10K+ RPS) | Traffic is steady and predictable (always-on) | Containers on reserved instances (cheaper) |
| Individual requests complete in <15 minutes | Jobs run for hours (ML training, video transcoding) | ECS Fargate, Cloud Run jobs, or dedicated compute |
| You want zero infrastructure management | You need fine-grained OS/runtime control | Containers (Docker on ECS/GKE/AKS) |
| Event-driven processing (S3 uploads, DB changes, webhooks) | Requires persistent connections (WebSockets, gRPC streaming) | Containers or Cloudflare Durable Objects |
| Startup/small team with limited DevOps capacity | Strict latency SLA (<10ms p99) with cold start intolerance | Pre-warmed containers or dedicated instances |
| Cost optimization for low-traffic APIs (pay per invocation) | High-throughput, CPU-bound processing at scale | Reserved EC2/GCE instances (cost-effective at scale) |
Important Caveats
- AWS Lambda INIT phase billing (August 2025) changed the cost model significantly for functions with heavy initialization -- benchmark before and after
- Cloudflare Workers run on V8 isolates (not containers), which means no native binary execution, no file system access, and a different security model than Lambda
- Google Cloud Functions 2nd gen runs on Cloud Run under the hood, inheriting Cloud Run's concurrency model (multiple requests per instance) unlike Lambda's 1:1 model
- Azure Functions v3 reached the end of extended support on December 13, 2022 -- migrate any remaining v3 apps to v4
- Serverless Framework, SST, and Pulumi abstract provider differences but add their own complexity and potential lock-in
- Cold start benchmarks vary significantly by region, runtime, memory allocation, and VPC configuration -- always benchmark in your specific deployment environment
- Multi-cloud serverless is possible in theory but costly in practice; pick one primary provider and use abstraction layers only where migration risk is real
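Where migration risk does justify an abstraction layer, a small repository interface is usually enough. The sketch below shows the ports-and-adapters idea with two interchangeable adapters. All class and function names are hypothetical, and `makeFakeKv` imitates only the async string `get`/`put` shape used by `ORDERS_KV` in the Worker example, not the full KV API.

```javascript
// Adapter 1: in-memory, suitable for unit tests.
class InMemoryOrderRepository {
  constructor() { this.items = new Map(); }
  async save(order) { this.items.set(order.id, { ...order }); }
  async findById(id) { return this.items.get(id) ?? null; }
}

// Adapter 2: wraps a string-valued key-value namespace with async get/put.
class KvOrderRepository {
  constructor(kv) { this.kv = kv; }
  async save(order) { await this.kv.put(`order:${order.id}`, JSON.stringify(order)); }
  async findById(id) {
    const raw = await this.kv.get(`order:${id}`);
    return raw === null ? null : JSON.parse(raw);
  }
}

// Domain logic depends only on save/findById, never on a provider SDK.
async function placeOrder(repository, id, item) {
  if (!item) throw new Error("item is required");
  const order = { id, item, status: "placed" };
  await repository.save(order);
  return order;
}

// Hypothetical fake namespace used to exercise KvOrderRepository locally.
function makeFakeKv() {
  const data = new Map();
  return {
    put: async (key, value) => { data.set(key, value); },
    get: async (key) => (data.has(key) ? data.get(key) : null),
  };
}
```

Swapping providers then means writing one new adapter rather than touching every handler. The trade-off is that provider-specific features, such as conditional writes or TTLs, have to be surfaced through the interface deliberately or given up.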