Microservices Communication Patterns

Type: Software Reference | Confidence: 0.93 | Sources: 7 | Verified: 2026-02-23 | Freshness: 2026-02-23

TL;DR

Default to gRPC for internal synchronous calls and REST for public APIs. Push anything that does not need an immediate answer onto an event bus (Kafka, NATS). Keep synchronous call depth to one or two hops, wrap every sync call in a circuit breaker, make every async consumer idempotent, and propagate trace context across both sync and async boundaries.

Constraints

Quick Reference

| Pattern | Type | Protocol | Latency | Throughput | Coupling | Best For | Technology Options |
|---|---|---|---|---|---|---|---|
| REST (JSON/HTTP) | Sync request-response | HTTP/1.1+ | 10-100ms | Moderate (~2-4K rps/instance) | Temporal + spatial | Public APIs, CRUD operations | Express, FastAPI, Spring Boot, Go net/http |
| gRPC (Protobuf) | Sync request-response | HTTP/2 | 1-20ms | High (~7-9K rps/instance) | Temporal + spatial | Internal service calls, polyglot | grpc-go, grpc-java, grpc-node, grpc-python |
| gRPC Streaming | Sync bidirectional | HTTP/2 | Sub-ms per msg | Very high | Temporal + spatial | Real-time data feeds, chat | Same gRPC libs + streaming APIs |
| GraphQL | Sync request-response | HTTP/1.1+ | 10-200ms | Moderate | Temporal + spatial | API aggregation, BFF pattern | Apollo Server, Hasura, graphql-go |
| Message Queue (P2P) | Async point-to-point | AMQP/STOMP | 5-50ms | High | Loose | Task distribution, work queues | RabbitMQ, Amazon SQS, Azure Service Bus |
| Pub/Sub Event Bus | Async publish-subscribe | Kafka/NATS | 2-20ms | Very high (1M+ msgs/s) | Very loose | Event-driven, domain events | Apache Kafka, NATS, Google Pub/Sub |
| Event Sourcing | Async event log | Kafka/EventStore | 10-100ms | High | Very loose | Audit trails, CQRS | EventStoreDB, Kafka + custom projections |
| Saga (Choreography) | Async distributed tx | Events via broker | 100ms-10s | Moderate | Loose | Multi-service transactions | Kafka, RabbitMQ + saga state tracking |
| Saga (Orchestration) | Mixed sync+async | HTTP/gRPC + broker | 50ms-5s | Moderate | Moderate | Complex workflows, compensations | Temporal, Camunda, Step Functions |
| Service Mesh Sidecar | Sync (transparent) | HTTP/2, mTLS | +1-3ms overhead | Proxied | Loose (infra) | mTLS, retries, observability | Istio, Linkerd, Consul Connect |
| Webhooks | Async callback | HTTP/1.1+ | 100ms-30s | Low-moderate | Loose | External integrations, notifications | Custom HTTP endpoints |
| Shared Database | Sync shared state | SQL/NoSQL | 1-10ms | High | Very tight | Legacy migration only (anti-pattern) | PostgreSQL, MongoDB (avoid in greenfield) |

Decision Tree

START
├── Need immediate response (request-response)?
│   ├── YES → Is this a public/external API?
│   │   ├── YES → REST (JSON over HTTP) -- universal client support
│   │   └── NO → Internal service-to-service?
│   │       ├── YES → Need streaming or bidirectional?
│   │       │   ├── YES → gRPC Streaming
│   │       │   └── NO → Is latency critical (<10ms)?
│   │       │       ├── YES → gRPC (Protobuf) -- 2-7x faster than REST
│   │       │       └── NO → gRPC preferred, REST acceptable
│   │       └── NO → API aggregation (BFF)?
│   │           ├── YES → GraphQL or API Gateway composition
│   │           └── NO → REST
│   └── NO → Fire-and-forget or event notification?
│       ├── YES → Single consumer (task queue)?
│       │   ├── YES → Message Queue (RabbitMQ, SQS)
│       │   └── NO → Multiple consumers need same event?
│       │       ├── YES → Pub/Sub Event Bus (Kafka, NATS)
│       │       └── NO → Point-to-point queue
│       └── NO → Multi-service transaction (saga)?
│           ├── YES → Simple flow (3-4 steps)?
│           │   ├── YES → Choreography-based Saga (events)
│           │   └── NO → Orchestration-based Saga (Temporal, Step Functions)
│           └── NO → Need audit trail / replay?
│               ├── YES → Event Sourcing (EventStoreDB, Kafka)
│               └── NO → Standard Pub/Sub
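For design reviews, the tree above can be encoded as a plain function; a minimal sketch (the function and flag names are ours, not from any library):

```python
def choose_pattern(needs_response: bool, public_api: bool = False,
                   internal: bool = False, streaming: bool = False,
                   bff: bool = False, fire_and_forget: bool = False,
                   single_consumer: bool = False, saga: bool = False,
                   simple_flow: bool = False, audit_trail: bool = False) -> str:
    """Walks the decision tree above; returns the recommended pattern."""
    if needs_response:
        if public_api:
            return "REST"
        if internal:
            if streaming:
                return "gRPC Streaming"
            return "gRPC"  # preferred whether or not latency-critical
        if bff:
            return "GraphQL / API Gateway composition"
        return "REST"
    if fire_and_forget:
        return "Message Queue" if single_consumer else "Pub/Sub Event Bus"
    if saga:
        return "Choreography Saga" if simple_flow else "Orchestration Saga"
    return "Event Sourcing" if audit_trail else "Pub/Sub Event Bus"
```

Usage: `choose_pattern(needs_response=True, internal=True, streaming=True)` returns `"gRPC Streaming"`.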

Step-by-Step Guide

1. Define service boundaries and communication needs

Classify each service interaction as synchronous (the caller needs a response to proceed) or asynchronous (fire-and-forget / eventually consistent). Draw a service dependency graph; any cycle of synchronous calls indicates incorrectly drawn boundaries. [src3]

Service A --[sync query]--> Service B
Service A --[async event]--> Event Bus --[subscribe]--> Service C
Service A --[async event]--> Event Bus --[subscribe]--> Service D

Verify: No service should have more than 2 synchronous downstream dependencies. Count sync edges per node.
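Both checks can be automated on a toy graph (service names are illustrative); cycle detection is a standard depth-first search:

```python
# Sync dependency graph: service -> list of synchronous downstream deps.
SYNC_DEPS = {
    "order":     ["inventory"],   # 1 sync downstream dep -- within budget
    "inventory": [],
    "payment":   [],
    "shipping":  [],
}

def has_cycle(graph: dict) -> bool:
    """DFS with three colors: returns True if any sync call cycle exists."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in graph}

    def visit(n):
        color[n] = GRAY                       # on the current DFS path
        for m in graph.get(n, []):
            if color.get(m, WHITE) == GRAY:   # back edge -> cycle
                return True
            if color.get(m, WHITE) == WHITE and visit(m):
                return True
        color[n] = BLACK                      # fully explored
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)

def overloaded(graph: dict, limit: int = 2) -> list:
    """Services with more than `limit` synchronous downstream deps."""
    return [n for n, deps in graph.items() if len(deps) > limit]
```

Run `has_cycle` and `overloaded` against your real graph; both should come back falsy/empty before you proceed.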

2. Set up synchronous communication (gRPC)

Define service contracts using Protocol Buffers. gRPC generates client/server stubs in all major languages from a single .proto file. [src4]

// order_service.proto
syntax = "proto3";
package order;

service OrderService {
  rpc GetOrder (GetOrderRequest) returns (OrderResponse);
  rpc CreateOrder (CreateOrderRequest) returns (OrderResponse);
  rpc StreamOrderUpdates (GetOrderRequest) returns (stream OrderEvent);
}

message GetOrderRequest {
  string order_id = 1;
}

message CreateOrderRequest {
  string customer_id = 1;
  repeated OrderItem items = 2;
}

message OrderItem {
  string product_id = 1;
  int32 quantity = 2;
  int64 price_cents = 3;
}

message OrderResponse {
  string order_id = 1;
  string status = 2;
  int64 total_cents = 3;
  string created_at = 4;
}

message OrderEvent {
  string order_id = 1;
  string event_type = 2;
  string timestamp = 3;
}

Verify: stubs generate cleanly: protoc --go_out=. --go-grpc_out=. order_service.proto exits 0 with no warnings. Optionally lint the contract with buf lint.

3. Set up asynchronous communication (Event Bus)

Choose a message broker based on your throughput and ordering needs. Kafka for high-throughput ordered event streams; RabbitMQ for flexible routing and work queues. [src5]

# docker-compose.yml -- local development Kafka setup (KRaft mode, no ZooKeeper)
services:
  kafka:
    image: apache/kafka:3.7.0
    ports:
      - "9092:9092"
    environment:
      KAFKA_NODE_ID: 1
      KAFKA_PROCESS_ROLES: broker,controller
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092,CONTROLLER://0.0.0.0:9093
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_CONTROLLER_QUORUM_VOTERS: 1@localhost:9093
      KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_LOG_DIRS: /tmp/kraft-logs
      CLUSTER_ID: MkU3OEVBNTcwNTJENDM2Qk

Verify: docker exec kafka /opt/kafka/bin/kafka-topics.sh --bootstrap-server localhost:9092 --list exits 0 (broker is healthy; the list stays empty until you create topics).

4. Implement circuit breakers on synchronous paths

Wrap every synchronous outbound call with a circuit breaker. This prevents cascading failures when a downstream service is slow or unavailable. [src1]

# Python with tenacity (retries) + pybreaker (circuit breaker)
# Note: tenacity has no built-in circuit breaker; pybreaker provides one
# (recent pybreaker versions also support decorating async callables).
import httpx
import pybreaker
from tenacity import retry, stop_after_attempt, wait_exponential

breaker = pybreaker.CircuitBreaker(fail_max=5, reset_timeout=30)

@breaker
@retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=0.5, max=5))
async def call_inventory_service(product_id: str) -> dict:
    async with httpx.AsyncClient(timeout=5.0) as client:
        resp = await client.get(f"http://inventory-service/api/v1/stock/{product_id}")
        resp.raise_for_status()
        return resp.json()

Verify: Kill inventory service, call 6 times -> circuit opens. Wait 30s -> circuit half-opens.
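If pulling in a library is not an option, the breaker's state machine is small enough to hand-roll; a minimal sketch (class name and thresholds are illustrative, not production-hardened -- no thread safety, no per-endpoint state):

```python
import time

class SimpleBreaker:
    """Minimal closed -> open -> half-open circuit breaker."""

    def __init__(self, fail_max=5, reset_timeout=30.0):
        self.fail_max = fail_max
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None          # monotonic timestamp when tripped

    @property
    def state(self):
        if self.opened_at is None:
            return "closed"
        if time.monotonic() - self.opened_at >= self.reset_timeout:
            return "half-open"         # allow one trial call through
        return "open"

    def call(self, fn, *args, **kwargs):
        if self.state == "open":
            raise RuntimeError("circuit open -- failing fast")
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.fail_max:
                self.opened_at = time.monotonic()   # trip the breaker
            raise
        self.failures = 0              # success (closed or half-open) resets
        self.opened_at = None
        return result
```

A success while half-open closes the circuit; a failure re-opens it for another reset_timeout window.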

5. Implement idempotent consumers

Every async message consumer must handle duplicate deliveries gracefully. Use an idempotency key (event ID) stored in a deduplication table. [src6]

# Idempotent Kafka consumer with deduplication
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    'order-events',
    bootstrap_servers='localhost:9092',
    group_id='payment-service',
    enable_auto_commit=False,
    value_deserializer=lambda m: json.loads(m.decode('utf-8'))
)

processed_ids = set()  # In production: use Redis or DB table

for message in consumer:
    event = message.value
    event_id = event['event_id']

    if event_id in processed_ids:
        consumer.commit()  # Skip duplicate, commit offset
        continue

    process_payment(event)
    processed_ids.add(event_id)
    consumer.commit()

Verify: Publish same event twice with identical event_id -> consumer processes it only once.
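The exactly-once effect can be exercised without a broker; a minimal simulation of at-least-once redelivery (all names are ours):

```python
def deliver_at_least_once(events, handler, redeliveries=1):
    """Simulates a broker delivering each event 1 + `redeliveries` times."""
    for event in events:
        for _ in range(1 + redeliveries):
            handler(event)

class PaymentHandler:
    """Idempotent consumer: dedup map stands in for a Redis/DB table."""

    def __init__(self):
        self.processed = {}                    # event_id -> payment record

    def __call__(self, event):
        if event["event_id"] in self.processed:
            return                             # duplicate delivery -- skip
        self.processed[event["event_id"]] = {"charged": event["amount"]}

handler = PaymentHandler()
deliver_at_least_once(
    [{"event_id": "e1", "amount": 500}, {"event_id": "e2", "amount": 900}],
    handler, redeliveries=2)
# Each event arrived 3 times; each customer was charged exactly once.
```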

6. Add observability (distributed tracing)

Propagate trace context (W3C Trace Context or B3) across all communication boundaries -- sync and async. Without this, debugging cross-service issues is nearly impossible. [src2]

# OpenTelemetry instrumentation for gRPC + Kafka
from opentelemetry import trace
from opentelemetry.instrumentation.grpc import GrpcInstrumentorClient
from opentelemetry.instrumentation.kafka import KafkaInstrumentor

# Auto-instrument gRPC client calls
GrpcInstrumentorClient().instrument()

# Auto-instrument Kafka producer/consumer
KafkaInstrumentor().instrument()

# Manual span for business logic
tracer = trace.get_tracer("order-service")
with tracer.start_as_current_span("process_order") as span:
    span.set_attribute("order.id", order_id)
    # gRPC call -- trace context propagated automatically
    inventory = inventory_stub.CheckStock(request)
    # Kafka publish -- trace context injected into headers
    producer.send('order-events', value=event)

Verify: curl http://jaeger:16686/api/traces?service=order-service -> traces span across services.
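Under the hood, the instrumentation injects and extracts a W3C traceparent header; a dependency-free sketch of that propagation (helper names are ours, header format per the Trace Context spec):

```python
import secrets

def new_traceparent() -> str:
    """version-traceid-spanid-flags, per W3C Trace Context."""
    trace_id = secrets.token_hex(16)   # 32 hex chars
    span_id = secrets.token_hex(8)     # 16 hex chars
    return f"00-{trace_id}-{span_id}-01"

def child_traceparent(parent: str) -> str:
    """Same trace ID, fresh span ID -- what each hop does on outbound calls."""
    version, trace_id, _parent_span, flags = parent.split("-")
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"

# Sync hop: carry it as an HTTP header / gRPC metadata entry.
headers = {"traceparent": new_traceparent()}
# Async hop: carry it in Kafka record headers so consumers join the trace.
kafka_headers = [("traceparent", child_traceparent(headers["traceparent"]).encode())]
```

OpenTelemetry does exactly this for you; the sketch only shows why traces survive both sync and async boundaries.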

Code Examples

Python: Async Event Handler with Kafka

# Input:  Kafka messages on 'order-created' topic
# Output: Payment processing + 'payment-completed' event

import asyncio
import json
from aiokafka import AIOKafkaConsumer, AIOKafkaProducer

async def payment_event_handler():
    consumer = AIOKafkaConsumer(
        'order-created',
        bootstrap_servers='kafka:9092',
        group_id='payment-service',
        value_deserializer=lambda m: json.loads(m.decode())
    )
    producer = AIOKafkaProducer(
        bootstrap_servers='kafka:9092',
        value_serializer=lambda v: json.dumps(v).encode()
    )
    await consumer.start()
    await producer.start()
    try:
        async for msg in consumer:
            order = msg.value
            result = await process_payment(order['order_id'], order['total_cents'])
            await producer.send('payment-completed', value={
                'order_id': order['order_id'],
                'payment_id': result['payment_id'],
                'status': 'completed'
            })
    finally:
        await consumer.stop()
        await producer.stop()

Go: gRPC Server with Interceptors

// Input:  gRPC requests to OrderService
// Output: Order responses with circuit breaker + logging

package main

import (
    "context"
    "log"
    "net"
    "time"

    "google.golang.org/grpc"
    pb "myapp/proto/order"
)

func unaryInterceptor(ctx context.Context, req interface{},
    info *grpc.UnaryServerInfo, handler grpc.UnaryHandler,
) (interface{}, error) {
    start := time.Now()
    resp, err := handler(ctx, req)
    log.Printf("method=%s duration=%s error=%v",
        info.FullMethod, time.Since(start), err)
    return resp, err
}

func main() {
    lis, err := net.Listen("tcp", ":50051")
    if err != nil {
        log.Fatalf("listen: %v", err)
    }
    srv := grpc.NewServer(grpc.UnaryInterceptor(unaryInterceptor))
    pb.RegisterOrderServiceServer(srv, &orderServer{}) // orderServer implements the generated interface
    log.Fatal(srv.Serve(lis))
}

TypeScript: REST with Circuit Breaker (Node.js)

// Input:  HTTP requests to downstream services
// Output: Resilient responses with fallback on failure

import CircuitBreaker from 'opossum';

interface InventoryResponse {
  productId: string;
  inStock: boolean | null;
  cached?: boolean;
}

const breakerOptions = {
  timeout: 3000,       // 3s timeout per request
  errorThresholdPercentage: 50,
  resetTimeout: 30000  // 30s before half-open
};

async function fetchInventory(productId: string): Promise<InventoryResponse> {
  const res = await fetch(`http://inventory-svc/api/v1/stock/${productId}`);
  if (!res.ok) throw new Error(`Inventory service error: ${res.status}`);
  return res.json();
}

const breaker = new CircuitBreaker(fetchInventory, breakerOptions);
breaker.fallback((productId: string) => ({ productId, inStock: null, cached: true }));
breaker.on('open', () => console.warn('Circuit OPEN: inventory-svc'));

// Usage: const stock = await breaker.fire('product-123');

Anti-Patterns

Wrong: Distributed Monolith (synchronous call chains)

// BAD -- 5-hop synchronous chain: one slow service kills everything
// Order -> Inventory -> Pricing -> Tax -> Shipping -> Notification
// Total latency = sum of all latencies; one failure = total failure

POST /orders
  -> GET inventory-svc/stock/{id}          // 50ms
    -> GET pricing-svc/price/{id}          // 30ms
      -> GET tax-svc/calculate             // 40ms
        -> POST shipping-svc/estimate      // 100ms
          -> POST notification-svc/send    // 200ms
// Total: 420ms best case. Any timeout cascades upward.

Correct: Hybrid sync + async with bounded sync depth

// GOOD -- Max 1 sync hop; rest is async via events
POST /orders
  -> GET inventory-svc/stock/{id}    // 1 sync hop (needs real-time answer)
  <- 201 Created (return to client)

  -> publish 'order-created' event   // Async from here
     -> payment-svc consumes         // Independent
     -> shipping-svc consumes        // Independent
     -> notification-svc consumes    // Independent
// Total sync latency: ~50ms. Async services process in parallel.
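The latency claim is plain arithmetic: a synchronous chain sums every hop, while the hybrid keeps one hop on the critical path (numbers taken from the example above):

```python
# Per-hop latencies from the anti-pattern example, in milliseconds.
hops_ms = {"inventory": 50, "pricing": 30, "tax": 40,
           "shipping": 100, "notification": 200}

# Sync chain: every hop is on the critical path.
chain_latency = sum(hops_ms.values())            # 420 ms best case

# Hybrid: one sync hop; the rest runs in parallel off the critical path.
hybrid_sync = hops_ms["inventory"]               # 50 ms seen by the client
async_tail = max(v for k, v in hops_ms.items() if k != "inventory")  # 200 ms
```

The client-visible latency drops from 420 ms to 50 ms; the slowest async consumer (200 ms) no longer blocks the response.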

Wrong: No idempotency on consumers

# BAD -- processing duplicate messages charges customer twice
def handle_payment(event):
    charge_customer(event['customer_id'], event['amount'])  # No dedup!
    db.insert('payments', event)  # Duplicate row on retry

Correct: Idempotent consumer with deduplication

# GOOD -- idempotency key prevents double-processing
def handle_payment(event):
    if db.exists('processed_events', event['event_id']):
        return  # Already processed, skip
    charge_customer(event['customer_id'], event['amount'])
    db.insert('payments', {**event, 'processed_at': now()})
    db.insert('processed_events', {'event_id': event['event_id']})
    # Use a DB transaction to make both inserts atomic
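One way to make the two inserts atomic is a single database transaction plus a uniqueness constraint on the event ID; a sketch with sqlite3 (schema and names are illustrative):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE payments (event_id TEXT, amount INTEGER)")
db.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")

def handle_payment(event: dict) -> bool:
    """Returns True if processed, False if recognized as a duplicate."""
    try:
        with db:  # one transaction: both inserts commit or neither does
            db.execute("INSERT INTO processed_events VALUES (?)",
                       (event["event_id"],))
            db.execute("INSERT INTO payments VALUES (?, ?)",
                       (event["event_id"], event["amount"]))
            # charge_customer(...) would go here, guarded by the same dedup
        return True
    except sqlite3.IntegrityError:
        return False  # PRIMARY KEY violation -> duplicate delivery, skip
```

Calling it twice with the same event_id writes one payment row: the second call hits the PRIMARY KEY and the transaction rolls back.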

Wrong: Exposing gRPC directly to browsers

// BAD -- browsers don't support HTTP/2 trailers (gRPC requirement)
// Frontend JS cannot call gRPC endpoints directly
const client = new OrderServiceClient('https://api.example.com:50051');
// This fails: browsers use HTTP/1.1 or HTTP/2 without trailer support

Correct: Use gRPC-Web or REST gateway for browser clients

# GOOD -- Envoy proxy transcodes gRPC to gRPC-Web for browsers
# envoy.yaml (fragment)
listeners:
  - filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          http_filters:
            - name: envoy.filters.http.grpc_web    # Transcodes for browsers
            - name: envoy.filters.http.router

# OR: Use an API gateway that exposes REST -> gRPC internally
# Browser -> REST (API Gateway) -> gRPC (internal services)

Wrong: Shared database between microservices

// BAD -- two services reading/writing the same 'orders' table
// Order Service and Shipping Service both do:
SELECT * FROM orders WHERE status = 'pending';
UPDATE orders SET status = 'shipped' WHERE id = ?;
// Schema changes in one service break the other. Tight coupling.

Correct: Each service owns its data; communicate via events

// GOOD -- Order Service owns 'orders' table
// Shipping Service owns 'shipments' table
// Communication via events:

// Order Service publishes:
{ "event": "order_placed", "order_id": "123", "items": [...] }

// Shipping Service consumes event, writes to its own table:
INSERT INTO shipments (order_id, status) VALUES ('123', 'pending');
// No shared database. Schema changes are independent.
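The ownership split above can be sketched end to end, with in-memory dicts standing in for each service's private database (all class and field names are ours):

```python
class Bus:
    """Toy in-process event bus standing in for Kafka/RabbitMQ."""

    def __init__(self):
        self.subscribers = []

    def publish(self, event):
        for handler in self.subscribers:
            handler(event)

class OrderService:
    def __init__(self, bus):
        self.orders = {}              # owned exclusively by Order Service
        self.bus = bus

    def place_order(self, order_id, items):
        self.orders[order_id] = {"items": items, "status": "placed"}
        self.bus.publish({"event": "order_placed",
                          "order_id": order_id, "items": items})

class ShippingService:
    def __init__(self):
        self.shipments = {}           # owned exclusively by Shipping Service

    def on_event(self, event):
        if event["event"] == "order_placed":
            self.shipments[event["order_id"]] = {"status": "pending"}

bus = Bus()
shipping = ShippingService()
bus.subscribers.append(shipping.on_event)
orders = OrderService(bus)
orders.place_order("123", ["widget"])
```

Neither service reads the other's store; schema changes stay local, and the event payload is the only contract between them.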

Common Pitfalls

Diagnostic Commands

# Check if gRPC service is healthy
grpcurl -plaintext localhost:50051 grpc.health.v1.Health/Check

# List available gRPC services
grpcurl -plaintext localhost:50051 list

# Check Kafka topic lag (consumer behind producer)
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --describe --group payment-service

# Check RabbitMQ queue depth
rabbitmqctl list_queues name messages_ready messages_unacknowledged

# Test REST endpoint with timing
curl -w "\nDNS: %{time_namelookup}s\nConnect: %{time_connect}s\nTTFB: %{time_starttransfer}s\nTotal: %{time_total}s\n" \
  http://order-service:8080/api/v1/orders/123

# Check service mesh proxy status (Istio)
istioctl proxy-status

# Verify mTLS between services (Istio; 'authn tls-check' was removed in 1.5)
istioctl experimental describe pod <order-service-pod>   # reports mTLS status

Version History & Compatibility

| Technology | Current Version | Key Change | Notes |
|---|---|---|---|
| gRPC | v1.62+ (2024) | xDS load balancing by default | Enable via GRPC_XDS_BOOTSTRAP env var |
| Apache Kafka | 3.7+ (2024) | KRaft mode GA (no ZooKeeper) | Migrate from ZooKeeper before Kafka 4.0 removes support |
| RabbitMQ | 3.13+ (2024) | Khepri metadata store (replaces Mnesia) | Optional; improves cluster stability |
| Istio | 1.21+ (2024) | Ambient mesh (sidecar-less option) | Reduces per-pod overhead by ~50% |
| gRPC-Web | 1.5+ (2023) | Stable for production | Use with Envoy or grpc-web npm package |
| NATS | 2.10+ (2024) | JetStream improvements | Competing alternative to Kafka for lighter workloads |

When to Use / When Not to Use

| Use When | Don't Use When | Use Instead |
|---|---|---|
| Services need independent deployment and scaling | Team is <5 engineers or domain is not well understood | Modular monolith with clear module boundaries |
| Different services need different languages/frameworks | All services share the same database anyway | Monolith or modular monolith |
| You need fault isolation (one service failure != total failure) | Latency budget is <5ms for the full request path | In-process function calls (monolith) |
| Event-driven workflows with multiple independent consumers | You need strict ACID transactions across services | Shared database or distributed transaction coordinator |
| High-throughput async processing (>10K events/s) | Simple CRUD app with <1K users | REST monolith or serverless functions |

Important Caveats

Related Units