Outbox Pattern for Reliable Event Publishing from ERP Transactions

Type: ERP Integration | System: Cross-ERP (Pattern-level) | Confidence: 0.87 | Sources: 7 | Verified: 2026-03-07 | Freshness: 2026-03-07

TL;DR

System Profile

The transactional outbox pattern applies to any system that stores business data in a relational database and must reliably publish events to external consumers. It is the canonical solution to the dual-write problem in distributed systems. This card covers the general pattern, ERP-specific adaptations for Salesforce/SAP/NetSuite/D365, relay mechanisms (polling and CDC), and a decision framework comparing outbox vs CDC vs dual-write vs event sourcing.

| System | Role | API Surface | Direction |
| --- | --- | --- | --- |
| ERP Database (source) | Business data persistence + outbox table | SQL (same transaction) | Source |
| Relay Process | Reads outbox, publishes to broker | Polling SQL or CDC log tailing | Bridge |
| Message Broker (Kafka, RabbitMQ, SQS, Service Bus) | Event distribution to consumers | Broker-native protocol | Target |
| Debezium (optional) | CDC-based relay -- reads DB transaction log | Kafka Connect | Bridge |

API Surfaces & Capabilities

The outbox pattern does not define a specific API surface -- it defines how a database transaction atomically produces both business data and an event record. The relay mechanism bridges the outbox table to the message broker. [src1, src3]

| Relay Mechanism | Latency | Infrastructure | DB Load | Ordering | Complexity |
| --- | --- | --- | --- | --- | --- |
| Polling Publisher | 1-5s (poll interval) | Minimal -- background worker | Moderate | Per-aggregate (sequence_id) | Low |
| Debezium CDC | Sub-second | Kafka + Kafka Connect | Minimal (reads WAL) | Per-partition (aggregate_id) | Medium |
| DynamoDB Streams | Sub-second | AWS-native | None -- built-in | Per-partition-key | Low (AWS-only) |
| Salesforce Platform Events | Near real-time | SF-native | None -- built-in | Per-replay-ID | Low (SF-only) |
| SAP Event Mesh | Near real-time | BTP | None -- built-in | Per-topic | Medium (SAP-only) |

Rate Limits & Quotas

Per-System Event Limits

| System | Limit Type | Value | Notes |
| --- | --- | --- | --- |
| Salesforce Platform Events | Standard-volume events | 100K/day (Enterprise), 500K/day (Unlimited) | High-volume requires add-on license |
| Salesforce CDC | Event delivery allocation | Shared with Platform Events daily limit | 1 event per field change, not per record |
| SAP Event Mesh | Messages/month | 1M (standard), custom enterprise tiers | Queue depth: 1,000 messages default |
| NetSuite SuiteScript | Governance units per script | 1,000 (client), 10,000 (server) | Each https.request() costs 10 units |
| D365 Business Events | Endpoint throughput | Azure Service Bus limits apply | No D365-specific rate limit |
| Debezium (Kafka) | Throughput | 100K+ events/sec | Bottleneck is source DB WAL throughput |

Outbox Table Performance Limits

| Metric | Recommended Threshold | Impact of Exceeding |
| --- | --- | --- |
| Outbox table size | < 100,000 unpublished rows | Query performance degrades; index bloat |
| Poll interval | 1-5 seconds | Too low: DB load. Too high: delivery latency |
| Batch size per poll | 100-500 rows | Too large: long transactions. Too small: relay overhead |
| Row retention | Delete after publish + 24h buffer | Unbounded growth causes vacuum/analyze issues |
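These thresholds can be enforced with a periodic health check. A minimal sketch, assuming a psycopg2-style connection; the `alert` callback, the 300-second age limit, and the helper names are illustrative (the 100,000-row depth limit comes from the table above):

```python
# Outbox health check -- compares current depth/age against the thresholds above.
# `alert` is a hypothetical callback (Slack, PagerDuty, etc.); inject your own.

PENDING_SQL = """
    SELECT COUNT(*), EXTRACT(EPOCH FROM NOW() - MIN(created_at))
    FROM outbox_events WHERE published_at IS NULL
"""

def check_outbox_health(pending_count, oldest_age_s,
                        max_depth=100_000, max_age_s=300):
    """Return a list of alert messages; an empty list means healthy."""
    alerts = []
    if pending_count > max_depth:
        alerts.append(f"outbox depth {pending_count} exceeds {max_depth}")
    if oldest_age_s is not None and oldest_age_s > max_age_s:
        alerts.append(f"oldest unpublished event is {oldest_age_s:.0f}s old")
    return alerts

def run_check(conn, alert):
    with conn.cursor() as cur:
        cur.execute(PENDING_SQL)
        count, age = cur.fetchone()  # age is NULL (None) when outbox is empty
    for msg in check_outbox_health(count, age):
        alert(msg)
```

Run it from the same scheduler that drives the cleanup job; a rising depth is usually the first visible symptom of a stalled relay.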

Authentication

| Component | Auth Method | Notes |
| --- | --- | --- |
| Outbox table (same DB) | DB credentials (connection pool) | Same transaction context -- no separate auth |
| Debezium connector | DB replication credentials | Requires REPLICATION role in PostgreSQL |
| Kafka broker | SASL/SCRAM or mTLS | Connector config includes broker credentials |
| Salesforce Pub/Sub API | OAuth 2.0 JWT bearer | For external subscribers consuming Platform Events |
| SAP Event Mesh | OAuth 2.0 client credentials | BTP service binding provides credentials |
| Azure Service Bus | Managed Identity or SAS token | D365 Business Events connect via SAS policy |

Authentication Gotchas

Constraints

Integration Pattern Decision Tree

START -- Need reliable event publishing from ERP transactions?
|
+-- Do you control the database schema?
|   |
|   +-- YES (custom DB, self-managed ERP)
|   |   +-- Need sub-second latency?
|   |   |   +-- YES --> Debezium CDC relay with outbox table
|   |   |   +-- NO  --> Polling publisher (simpler, fewer moving parts)
|   |   +-- Already running Kafka?
|   |       +-- YES --> Debezium outbox event router (purpose-built)
|   |       +-- NO  --> Polling publisher to SQS/RabbitMQ/Service Bus
|   |
|   +-- NO (SaaS ERP: Salesforce, NetSuite, D365)
|       +-- Salesforce --> Platform Events + CDC (native outbox equivalent)
|       +-- SAP S/4HANA Cloud --> Event Mesh + Business Event Handling
|       +-- NetSuite --> SuiteScript afterSubmit + custom outbox record
|       +-- D365 F&O --> Business Events + Azure Service Bus
|       +-- D365 CE/Dataverse --> Business Events + Power Automate/Webhooks
|
+-- Pattern comparison:
    +-- Need domain-specific events? --> Outbox pattern
    +-- Need all row changes captured? --> CDC (without outbox)
    +-- Cannot modify application code? --> CDC on existing tables
    +-- Need audit trail of all state? --> Event sourcing

Quick Reference

Outbox Table Schema (PostgreSQL)

| Column | Type | Purpose | Notes |
| --- | --- | --- | --- |
| id | UUID | Unique event identifier | PK, used for consumer deduplication |
| sequence_id | BIGSERIAL | Relay ordering | Monotonically increasing; DO NOT use created_at |
| aggregate_type | VARCHAR(255) | Entity type (SalesOrder, Invoice) | Used for topic routing |
| aggregate_id | UUID | Business entity ID | Kafka partition key; ensures per-entity ordering |
| event_type | VARCHAR(255) | Domain event name (OrderCreated) | Consumer uses this for deserialization |
| payload | JSONB | Serialized event data | Keep under 1MB for Kafka compatibility |
| created_at | TIMESTAMPTZ | Event timestamp | NOT used for ordering |
| published_at | TIMESTAMPTZ | When relay published event | NULL = unpublished |

Debezium Outbox Event Router Configuration

| Property | Default | Purpose |
| --- | --- | --- |
| transforms.outbox.type | io.debezium.transforms.outbox.EventRouter | Enable outbox SMT |
| route.by.field | aggregatetype | Column that determines Kafka topic |
| route.topic.replacement | outbox.event.${routedByValue} | Topic naming pattern |
| table.field.event.id | id | Unique event ID column |
| table.field.event.key | aggregateid | Kafka message key column |
| table.field.event.payload | payload | Event data column |
| table.expand.json.payload | false | Expand JSON strings in payload |

Step-by-Step Integration Guide

1. Create the outbox table

Create the outbox table in the same database as your business data. Use BIGSERIAL for reliable ordering and a partial index on unpublished rows for efficient polling. [src4]

CREATE TABLE outbox_events (
    id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    sequence_id     BIGSERIAL NOT NULL,
    aggregate_type  VARCHAR(255) NOT NULL,
    aggregate_id    UUID NOT NULL,
    event_type      VARCHAR(255) NOT NULL,
    payload         JSONB NOT NULL,
    created_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    published_at    TIMESTAMPTZ
);

-- Partial index: relay queries ONLY unpublished rows
CREATE INDEX idx_outbox_unpublished
    ON outbox_events (sequence_id)
    WHERE published_at IS NULL;

-- Cleanup job index
CREATE INDEX idx_outbox_published_at
    ON outbox_events (published_at)
    WHERE published_at IS NOT NULL;

Verify: SELECT COUNT(*) FROM outbox_events WHERE published_at IS NULL; --> expected: 0

2. Write business data and outbox event in same transaction

The critical guarantee: business data and outbox event are committed atomically. If either fails, both roll back. [src1]

import uuid, json
from datetime import datetime, timezone

def create_order_with_event(conn, order_data):
    """Write order + outbox event in a single transaction."""
    order_id = uuid.uuid4()
    event_id = uuid.uuid4()

    try:
        with conn.cursor() as cur:
            cur.execute("""
                INSERT INTO sales_orders (id, customer_id, total, status)
                VALUES (%s, %s, %s, 'created')
            """, (str(order_id), order_data['customer_id'], order_data['total']))

            cur.execute("""
                INSERT INTO outbox_events (id, aggregate_type, aggregate_id,
                                           event_type, payload)
                VALUES (%s, %s, %s, %s, %s)
            """, (str(event_id), 'SalesOrder', str(order_id), 'OrderCreated',
                  json.dumps({
                      'order_id': str(order_id),
                      'customer_id': order_data['customer_id'],
                      'total': str(order_data['total']),
                      'timestamp': datetime.now(timezone.utc).isoformat()
                  })))
        conn.commit()
    except Exception:
        conn.rollback()  # roll back BOTH writes -- this is the atomicity guarantee
        raise
    return order_id

Verify: SELECT COUNT(*) FROM outbox_events WHERE published_at IS NULL; --> expected: 1

3. Implement the polling publisher relay

Background process polls outbox, publishes to broker, marks as published. Use FOR UPDATE SKIP LOCKED for safe concurrent relay instances. [src4]

from confluent_kafka import Producer
import json, time

def poll_and_publish(conn, producer, batch_size=100, poll_interval=2.0):
    while True:
        with conn.cursor() as cur:
            cur.execute("""
                SELECT id, aggregate_type, aggregate_id, event_type, payload
                FROM outbox_events WHERE published_at IS NULL
                ORDER BY sequence_id ASC LIMIT %s
                FOR UPDATE SKIP LOCKED
            """, (batch_size,))
            rows = cur.fetchall()
            if not rows:
                conn.rollback()  # end the empty SELECT's transaction before sleeping
                time.sleep(poll_interval)
                continue

            for event_id, agg_type, agg_id, event_type, payload in rows:
                producer.produce(
                    topic=f"outbox.event.{agg_type}",
                    key=str(agg_id),
                    value=json.dumps(payload),
                    headers={'event_id': str(event_id), 'event_type': event_type}
                )
            producer.flush()  # block until the broker acks every message
            cur.execute("""
                UPDATE outbox_events SET published_at = NOW()
                WHERE id = ANY(%s)
            """, ([str(row[0]) for row in rows],))
        conn.commit()  # releases the row locks taken by FOR UPDATE SKIP LOCKED
        time.sleep(poll_interval)

Verify: SELECT COUNT(*) FROM outbox_events WHERE published_at IS NULL; --> expected: 0

4. Configure Debezium CDC relay (alternative to polling)

Use Debezium's outbox event router for sub-second relay without polling overhead. [src2]

{
  "name": "outbox-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "erp-db.example.com",
    "database.port": "5432",
    "database.user": "debezium_replication",
    "database.dbname": "erp_production",
    "topic.prefix": "erp",
    "table.include.list": "public.outbox_events",
    "transforms": "outbox",
    "transforms.outbox.type": "io.debezium.transforms.outbox.EventRouter",
    "transforms.outbox.route.by.field": "aggregate_type",
    "transforms.outbox.route.topic.replacement": "outbox.event.${routedByValue}",
    "transforms.outbox.table.expand.json.payload": "true",
    "plugin.name": "pgoutput",
    "slot.name": "outbox_slot"
  }
}

Verify: curl http://kafka-connect:8083/connectors/outbox-connector/status --> expected: {"connector":{"state":"RUNNING"}}

5. Implement outbox table cleanup

Published events must be cleaned up to prevent unbounded table growth. [src4]

-- Delete published events older than 7 days
DELETE FROM outbox_events
WHERE published_at IS NOT NULL
  AND published_at < NOW() - INTERVAL '7 days';

Verify: SELECT pg_size_pretty(pg_total_relation_size('outbox_events')); --> size should remain stable
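A single DELETE over millions of published rows can hold locks for a long time and generate a large WAL burst. Deleting in small batches keeps each transaction short. A sketch, assuming a psycopg2-style connection; the `delete_batch` indirection and the pause/batch values are illustrative, not part of the pattern:

```python
import time

# One bounded DELETE per transaction; PostgreSQL's DELETE has no LIMIT,
# so the batch is bounded via a subquery.
CLEANUP_SQL = """
    DELETE FROM outbox_events
    WHERE id IN (
        SELECT id FROM outbox_events
        WHERE published_at IS NOT NULL
          AND published_at < NOW() - INTERVAL '7 days'
        LIMIT %s
    )
"""

def cleanup_published(delete_batch, batch_size=5000, pause_s=0.1):
    """Delete published rows in short transactions; returns total deleted.

    `delete_batch(batch_size)` runs one bounded DELETE and returns the
    number of rows removed (cursor.rowcount)."""
    total = 0
    while True:
        deleted = delete_batch(batch_size)
        total += deleted
        if deleted < batch_size:
            return total  # last (partial) batch -- nothing left to delete
        time.sleep(pause_s)  # yield between batches to limit lock/WAL pressure

def pg_delete_batch(conn):
    """Adapter binding cleanup_published to a psycopg2 connection."""
    def _delete(batch_size):
        with conn.cursor() as cur:
            cur.execute(CLEANUP_SQL, (batch_size,))
            deleted = cur.rowcount
        conn.commit()
        return deleted
    return _delete
```

Schedule it (cron, pg_cron, or a worker loop) at the retention window from the performance table above.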

Code Examples

Python: Idempotent consumer with deduplication

# Input:  Kafka message from outbox relay
# Output: Processed event with duplicate detection

import json

def process_event(conn, message):
    """Idempotent event consumer -- deduplicates by event_id."""
    event_id = dict(message.headers()).get('event_id', b'').decode()
    payload = json.loads(message.value())

    with conn.cursor() as cur:
        # INSERT-first with ON CONFLICT closes the check-then-act race
        # between concurrent consumers handling the same event
        cur.execute("""
            INSERT INTO processed_events (event_id, processed_at)
            VALUES (%s, NOW()) ON CONFLICT (event_id) DO NOTHING
        """, (event_id,))
        if cur.rowcount == 0:
            conn.rollback()
            return  # Duplicate -- already processed
        handle_order_created(payload)
    conn.commit()

JavaScript/Node.js: Outbox write with Knex.js

// Input:  Order data + Knex transaction
// Output: Order + outbox event committed atomically

const { v4: uuidv4 } = require('uuid');

async function createOrderWithOutbox(knex, orderData) {
  return knex.transaction(async (trx) => {
    const orderId = uuidv4();
    await trx('sales_orders').insert({
      id: orderId, customer_id: orderData.customerId,
      total: orderData.total, status: 'created'
    });
    await trx('outbox_events').insert({
      id: uuidv4(), aggregate_type: 'SalesOrder',
      aggregate_id: orderId, event_type: 'OrderCreated',
      payload: JSON.stringify({
        order_id: orderId, customer_id: orderData.customerId,
        total: orderData.total
      })
    });
    return orderId;
  });
}

Salesforce Apex: Platform Events as outbox equivalent

// Platform Events participate in trigger transaction context
// If the transaction rolls back, events are NOT published

trigger OpportunityClosedWon on Opportunity (after update) {
    List<Order_Event__e> events = new List<Order_Event__e>();
    for (Opportunity opp : Trigger.new) {
        Opportunity oldOpp = Trigger.oldMap.get(opp.Id);
        if (opp.StageName == 'Closed Won' && oldOpp.StageName != 'Closed Won') {
            events.add(new Order_Event__e(
                Order_Id__c = opp.Id, Account_Id__c = opp.AccountId,
                Amount__c = opp.Amount, Event_Type__c = 'OpportunityClosedWon'
            ));
        }
    }
    if (!events.isEmpty()) {
        // EventBus.publish does not throw on failure -- check each SaveResult
        for (Database.SaveResult sr : EventBus.publish(events)) {
            if (!sr.isSuccess()) {
                System.debug(LoggingLevel.ERROR,
                    'Platform Event publish failed: ' + sr.getErrors());
            }
        }
    }
}

Data Mapping

Outbox Event Schema Mapping Across ERPs

| ERP Source Event | aggregate_type | event_type | Key Payload Fields | Gotcha |
| --- | --- | --- | --- | --- |
| Salesforce Opportunity closed-won | SalesOrder | OrderCreated | opportunity_id, account_id, amount, currency | Amount in org's default currency unless multi-currency |
| SAP Sales Order (VA01) | SalesOrder | SAPOrderCreated | VBELN, KUNNR, NETWR, WAERK | Amounts in smallest currency unit (cents) |
| NetSuite Sales Order created | SalesOrder | NSOrderCreated | internalid, entity, total, subsidiary | Entity ID is subsidiary-scoped in multi-sub orgs |
| D365 Sales Order confirmed | SalesOrder | D365OrderConfirmed | SalesOrderNumber, OrderAccount, InvoiceAccount | OrderAccount vs InvoiceAccount can differ |
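The mapping table implies a normalization layer on the consumer side. A sketch of one canonicalizer for the SAP row, with the smallest-currency-unit gotcha handled explicitly. The canonical output shape and the function name are illustrative; the field names follow the table above, and the divide-by-100 assumes a 2-minor-unit currency (JPY and others differ):

```python
from decimal import Decimal

def canonicalize_sap_order(payload):
    """Map a SAPOrderCreated payload to a canonical SalesOrder shape.

    Per the table above, SAP amounts arrive in the smallest currency
    unit, so NETWR is divided by 100 (assumes 2 minor units)."""
    return {
        "order_id": payload["VBELN"],        # SAP sales document number
        "customer_id": payload["KUNNR"],     # SAP customer number
        "total": Decimal(payload["NETWR"]) / 100,  # cents -> major units
        "currency": payload["WAERK"],
    }
```

Write one such function per source row and route on event_type, so consumers only ever see the canonical shape.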

Data Type Gotchas

Error Handling & Failure Points

Common Error Scenarios

| Scenario | Impact | Resolution |
| --- | --- | --- |
| Relay crashes after Kafka publish, before marking row published | Duplicate event on next relay cycle | Consumer idempotency (deduplication by event_id) |
| Database transaction timeout during outbox write | Both business data and event rolled back (safe) | Retry the business operation |
| Kafka broker unavailable during relay | Events accumulate in outbox table | Relay retries with exponential backoff; alert if outbox depth > threshold |
| Debezium replication slot dropped | Events between slot drop and recreation lost permanently | Monitor WAL lag; alert on slot state changes |
| Consumer processing fails | Event stays in consumer's dead letter queue | DLQ with manual review + retry mechanism |
| Outbox table bloat (no cleanup) | Increasing query latency, disk usage | Scheduled cleanup job (delete published events > retention window) |
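The broker-unavailable scenario above calls for exponential backoff in the relay. A minimal sketch of the retry loop; the `publish` callable, attempt counts, and delay values are illustrative:

```python
import random
import time

def publish_with_backoff(publish, max_attempts=5, base_delay=0.5,
                         max_delay=30.0, sleep=time.sleep):
    """Call `publish()` until it succeeds, doubling the delay per failure.

    Re-raises the last exception after max_attempts. Jitter avoids a
    thundering herd when many relay instances retry in lockstep."""
    for attempt in range(max_attempts):
        try:
            return publish()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # let the caller alert / leave rows unpublished
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay * (0.5 + random.random() / 2))  # 50-100% jitter
```

Because unpublished rows simply stay in the outbox, exhausting the retries loses nothing; the next poll cycle picks the same rows up again.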

Failure Points in Production

Anti-Patterns

Wrong: Dual-write -- publish event directly after DB commit

# BAD -- event can be lost if publish fails after commit
def create_order_bad(conn, producer, order_data):
    order_id = save_order_to_db(conn, order_data)  # DB commit
    producer.produce('orders', json.dumps({'order_id': order_id}))
    # If this fails (network error, broker down), event is LOST

Correct: Outbox -- event written in same transaction

# GOOD -- event is guaranteed to be published eventually
def create_order_good(conn, order_data):
    with conn.cursor() as cur:
        order_id = uuid.uuid4()
        cur.execute("INSERT INTO sales_orders ...", (order_id, ...))
        cur.execute("INSERT INTO outbox_events ...", (order_id, 'OrderCreated', ...))
    conn.commit()  # Atomic -- both or neither

Wrong: Publishing inside the transaction (phantom events)

# BAD -- if publish succeeds but transaction rolls back,
# consumers act on phantom events
def create_order_phantom(conn, producer, order_data):
    conn.begin()
    order_id = insert_order(conn, order_data)
    producer.produce('orders', json.dumps({'order_id': order_id}))
    conn.commit()  # If this fails, event was already published!

Correct: Publish only from the relay, never from the application

# GOOD -- application writes to outbox only
def create_order_correct(conn, order_data):
    conn.begin()
    order_id = insert_order(conn, order_data)
    insert_outbox_event(conn, 'OrderCreated', order_id, order_data)
    conn.commit()
    # Relay process handles broker publishing separately

Wrong: Using timestamps for relay ordering

-- BAD -- concurrent transactions produce out-of-order timestamps
SELECT * FROM outbox_events WHERE published_at IS NULL
ORDER BY created_at ASC;
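To see the failure concretely: created_at is assigned at INSERT time, but a row only becomes visible at COMMIT time, so a relay that tracks a created_at high-water mark can permanently skip a slow transaction's event. A self-contained simulation (no database; the event tuples and function name are hypothetical):

```python
def relay_by_watermark(events_in_commit_order):
    """Simulate a relay polling by created_at high-water mark.

    Each event is (created_at, event_id); events become visible in
    commit order, which need not match created_at order."""
    watermark = float("-inf")
    delivered = []
    for created_at, event_id in events_in_commit_order:
        if created_at > watermark:   # 'WHERE created_at > :watermark'
            delivered.append(event_id)
            watermark = created_at
        # else: row sits below the watermark -- silently skipped forever
    return delivered

# Txn A inserts at t=1 but commits AFTER txn B, which inserted at t=2:
# the relay sees B first, advances the watermark to 2, and never sees A.
commit_order = [(2, "B"), (1, "A")]
```

The published_at-flag query shown in this card avoids the problem entirely because it never relies on a watermark.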

Correct: Using monotonic sequence for ordering

-- GOOD -- BIGSERIAL guarantees monotonic ordering
SELECT * FROM outbox_events WHERE published_at IS NULL
ORDER BY sequence_id ASC
FOR UPDATE SKIP LOCKED;

Common Pitfalls

Diagnostic Commands

# Check outbox depth (unpublished events) -- should be near 0
psql -c "SELECT COUNT(*) as pending, MIN(created_at) as oldest_pending FROM outbox_events WHERE published_at IS NULL;"

# Check outbox table size
psql -c "SELECT pg_size_pretty(pg_total_relation_size('outbox_events'));"

# Check Debezium connector status
curl -s http://kafka-connect:8083/connectors/outbox-connector/status | jq '.connector.state'

# Check PostgreSQL replication slot lag
psql -c "SELECT slot_name, active, pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) as lag_bytes FROM pg_replication_slots;"

# Check Kafka consumer group lag
kafka-consumer-groups.sh --bootstrap-server kafka:9092 --group outbox-consumer --describe

# Check Salesforce Platform Event usage
curl -H "Authorization: Bearer $SF_TOKEN" \
  "https://yourorg.my.salesforce.com/services/data/v62.0/limits" | jq '.DailyStandardVolumePlatformEvents'

# Monitor relay throughput (events/minute)
psql -c "SELECT date_trunc('minute', published_at) as minute, COUNT(*) FROM outbox_events WHERE published_at > NOW() - INTERVAL '1 hour' GROUP BY 1 ORDER BY 1;"

Version History & Compatibility

| Component | Version | Status | Notes |
| --- | --- | --- | --- |
| Debezium Outbox Event Router | 2.x | Current (GA) | Stable since 1.9; MongoDB requires MongoEventRouter |
| Debezium Outbox Event Router | 1.x | Deprecated | Upgrade to 2.x for improved JSON expansion |
| Salesforce Platform Events | API v62.0 (Spring '26) | Current | High-volume events GA since Winter '22 |
| Salesforce CDC | API v62.0 (Spring '26) | Current | Enriched change events since Summer '23 |
| SAP Event Mesh | BTP 2024 | Current | Advanced Event Mesh is successor (Solace-based) |
| D365 Business Events | 2024 Release Wave 2 | Current | Available in F&O and CE/Dataverse |

When to Use / When Not to Use

| Use When | Don't Use When | Use Instead |
| --- | --- | --- |
| Business transaction must guarantee downstream event delivery | Simple request-reply with no delivery guarantee needed | Direct REST API call with retry |
| Source system uses a relational database | Source is a SaaS ERP with no DB access | ERP-native events (Platform Events, SuiteScript) |
| Need domain events (OrderCreated, InvoicePosted) | Need ALL row-level changes regardless of business meaning | CDC without outbox (Debezium on business tables) |
| Multiple consumers need the same event (fan-out) | Single consumer, point-to-point integration | Direct API call or simple queue |
| Already running Kafka and want sub-second relay | No Kafka and no plans to adopt it | Polling relay to SQS or RabbitMQ |
| Need event replay for debugging or rebuilding state | Events are ephemeral notifications with no replay value | Fire-and-forget webhooks |

Cross-System Comparison

| Capability | Outbox + Polling | Outbox + Debezium CDC | Dual-Write (no outbox) | CDC on Business Tables | Event Sourcing |
| --- | --- | --- | --- | --- | --- |
| Atomicity | Guaranteed (same DB txn) | Guaranteed (same DB txn) | NOT guaranteed | N/A (reads existing txn log) | Guaranteed (event IS the write) |
| Delivery guarantee | At-least-once | At-least-once | At-most-once | At-least-once | At-least-once |
| Latency | 1-5s (poll interval) | Sub-second | Near real-time | Sub-second | Sub-second |
| DB load | Moderate (polling) | Minimal (reads WAL) | None | Minimal (reads WAL) | None |
| Infrastructure | Background worker | Kafka + Kafka Connect | None additional | Kafka + Kafka Connect | Event store |
| Event semantics | Domain events (explicit) | Domain events (explicit) | App-level (fragile) | Row-level (implicit) | Domain events (inherent) |
| Code changes | Write to outbox | Write to outbox | Publish to broker | NONE (non-intrusive) | Complete rewrite |
| Ordering | Per-aggregate | Per-partition | No guarantee | Per-table | Per-aggregate |
| Replay | Yes (retain rows) | Yes (Kafka retention) | No | Yes (Kafka retention) | Yes (event store) |
| Best for | Simple setups, no Kafka | Production-grade, Kafka | Prototype only | Legacy, no code changes | Greenfield, complex domain |

Important Caveats

Related Units