Batch vs Real-Time vs Event-Driven Integration Patterns for ERPs
When should you use batch vs real-time vs event-driven integration patterns for ERPs?
TL;DR
Bottom line: Use real-time only when a user is actively waiting or a financial/compliance decision depends on live data; use event-driven for near-real-time change propagation with loose coupling; use batch for everything else — it is cheaper, more reliable, and simpler to operate.
Key limit: Real-time (synchronous) integration has hard latency ceilings: Salesforce Apex callouts time out at 120s, SAP RFC calls tie up dialog work processes, and most platforms throttle API calls, typically per 24h window.
Watch out for: Defaulting to real-time integration for everything — tight coupling means an ERP outage cascades to every connected system.
Best for: Architecture decisions at the start of an ERP integration project, middleware selection, and pattern assignment per data flow.
Authentication: Pattern-independent — all three patterns use the same auth flows (OAuth 2.0, JWT, TBA, certificates) as the underlying API surface.
System Profile
This card is a cross-platform architecture pattern guide covering the three fundamental integration timing patterns — batch (scheduled), real-time (synchronous), and event-driven (asynchronous near-real-time) — mapped to concrete capabilities across Salesforce, SAP S/4HANA, Oracle ERP Cloud, Microsoft Dynamics 365, NetSuite, and Workday.
Property | Value
Scope | Cross-platform architecture pattern
Systems Covered | Salesforce, SAP S/4HANA, Oracle ERP Cloud, Dynamics 365, NetSuite, Workday
Order status propagation, record change sync, notifications
Near-real-time batch | 5-15 min | 1 min (micro-batch) | 1 hour | Dashboard updates, lead scoring, non-critical status sync
Scheduled batch | 1-24 hours | 15 min | 24+ hours | Financial reporting, data warehouse loads, reference data
File-based import | 30 min-24 hours | 15 min | Days (manual review) | Data migration, initial loads, regulatory reporting
Authentication
Authentication is pattern-independent — the same auth flows apply regardless of integration timing pattern.
Pattern | Auth Consideration | Gotcha
Real-time | Token must be cached and refreshed proactively | OAuth token expiry mid-transaction can cause silent failures
Batch/Bulk | Service account with elevated permissions | Session timeout during long-running batch jobs
Event-driven | Subscription credentials must be long-lived and auto-renewing | Salesforce Pub/Sub API requires gRPC with OAuth
Authentication Gotchas
Real-time integrations that refresh OAuth tokens inline add 200-500ms per call — cache tokens and refresh asynchronously before expiry. [src1]
Batch service accounts in Salesforce run under a dedicated Integration User — governor limits are per-transaction, not per-user. [src1]
Event subscriptions using short-lived tokens will silently disconnect when the token expires — implement automatic reconnection with replay ID. [src1]
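The first gotcha above (cache tokens and refresh before expiry) can be sketched as a small cache. This is a minimal illustration, not a specific vendor SDK: `TokenCache`, `fetch_token`, and the 60-second `refresh_margin_s` are hypothetical names and values. The sketch refreshes early on the calling thread; a fully asynchronous variant would run the same check from a background timer.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Caches an access token and renews it shortly before it expires."""

    def __init__(self,
                 fetch_token: Callable[[], Tuple[str, float]],
                 refresh_margin_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        # fetch_token is an assumed callback returning (token, lifetime_seconds).
        self._fetch_token = fetch_token
        self._refresh_margin_s = refresh_margin_s
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        # Renew when nothing is cached or expiry falls inside the safety margin.
        due = self._expires_at - self._refresh_margin_s
        if self._token is None or self._clock() >= due:
            token, lifetime_s = self._fetch_token()
            self._token = token
            self._expires_at = self._clock() + lifetime_s
        return self._token
```

Calls made well before expiry reuse the cached token, so the 200-500ms token round trip is paid once per token lifetime rather than once per API call.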
Constraints
Real-time creates tight coupling: If the target ERP is slow or unavailable, the calling system blocks or fails. Design circuit breakers and fallback paths for every synchronous integration. [src3, src4]
Batch data is always stale: Batch windows mean consuming systems operate on data that is minutes to hours old. Never use batch for inventory availability, credit decisions, or compliance checks. [src3]
Event-driven requires infrastructure: You need a message broker (Kafka, RabbitMQ), event mesh (SAP Event Mesh), or platform-native events (Salesforce Platform Events). [src2]
Shared event allocations: Salesforce CDC and Platform Events share a daily delivery allocation (50K base). A large batch data load can exhaust event capacity. [src1]
Event replay has a finite window: Salesforce retains both Platform Events and CDC events for 72 hours (3 days). Events outside the window are permanently lost. [src1]
Cloud ERPs restrict legacy patterns: SAP S/4HANA Cloud does not support RFC/BAPI; Oracle ERP Cloud limits SOAP API availability. [src6]
Bulk API limits cap throughput: Salesforce Bulk API 2.0 is capped at 15,000 batches/24h. Plan multi-day runs for very large migrations. [src1]
Integration Pattern Decision Tree
Use this decision tree to assign the correct integration pattern for each data flow. [src1, src4]
START: Assign integration pattern for a specific data flow
├── Is someone actively waiting for the result?
│   ├── YES: Is it a financial-impact or compliance decision?
│   │   ├── YES → REAL-TIME (synchronous API call)
│   │   │   Examples: credit check, payment authorization, inventory reservation
│   │   └── NO: Would a 5-minute delay cause genuine business impact?
│   │       ├── YES → REAL-TIME (synchronous API call)
│   │       └── NO → EVENT-DRIVEN (near-real-time async)
│   └── NO: Is the data flow triggered by a record change?
│       ├── YES: Do multiple downstream systems need this change?
│       │   ├── YES → EVENT-DRIVEN (pub/sub pattern)
│       │   └── NO → EVENT-DRIVEN or BATCH (based on volume)
│       │       ├── < 1,000 records/day → EVENT-DRIVEN
│       │       └── > 1,000 records/day → BATCH
│       └── NO: Is it a scheduled data synchronization?
│           ├── YES: Volume per run?
│           │   ├── < 2,000 records → REST API (batch-style)
│           │   ├── 2,000-150,000 records → BULK API (single job)
│           │   ├── > 150,000 records → BULK API with job chunking or FILE-BASED
│           │   └── Full data migration → FILE-BASED IMPORT
│           └── NO → Analyze case-by-case
├── Error tolerance?
│   ├── Zero-loss required → EVENT-DRIVEN with replay + dead letter queue
│   ├── Retry acceptable → EVENT-DRIVEN with at-least-once delivery
│   └── Best-effort → BATCH with reconciliation report
└── Bidirectional sync needed?
    ├── YES → Design conflict resolution strategy FIRST
    └── NO → Proceed with chosen pattern
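The first branch of the tree (who is waiting, what triggers the flow, how much volume) can be encoded as a function so that every flow is classified the same way. This is a sketch under assumptions: the `DataFlow` fields and `assign_pattern` are hypothetical names, and a flow of exactly 1,000 records/day is treated as batch because the tree leaves that boundary open.

```python
from dataclasses import dataclass


@dataclass
class DataFlow:
    user_waiting: bool
    financial_or_compliance: bool = False
    five_min_delay_hurts: bool = False
    change_triggered: bool = False
    multiple_subscribers: bool = False
    records_per_day: int = 0
    scheduled: bool = False
    records_per_run: int = 0
    full_migration: bool = False


def assign_pattern(flow: DataFlow) -> str:
    """Walks the timing branch of the decision tree for one data flow."""
    if flow.user_waiting:
        if flow.financial_or_compliance or flow.five_min_delay_hurts:
            return "REAL-TIME"
        return "EVENT-DRIVEN"
    if flow.change_triggered:
        if flow.multiple_subscribers:
            return "EVENT-DRIVEN"
        # Single subscriber: choose by daily volume (1,000 itself goes to batch).
        return "EVENT-DRIVEN" if flow.records_per_day < 1000 else "BATCH"
    if flow.scheduled:
        if flow.full_migration:
            return "FILE-BASED IMPORT"
        if flow.records_per_run < 2000:
            return "REST API (batch-style)"
        if flow.records_per_run <= 150_000:
            return "BULK API (single job)"
        return "BULK API with job chunking or FILE-BASED"
    return "ANALYZE CASE-BY-CASE"
```

The error-tolerance and bidirectional-sync branches are deliberately left out; they refine the chosen pattern rather than select it.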
Quick Reference
Pattern Comparison Summary
Criterion | Real-Time (Synchronous) | Event-Driven (Async) | Batch (Scheduled)
Latency | 50ms-2s | 1s-60s | Minutes to hours
Coupling | Tight: both systems must be available | Loose: broker decouples systems | Loose: no runtime dependency
Throughput | Low (API call per record) | Medium (event stream) | High (bulk processing)
Complexity | Low (simple API call) | High (broker, replay, ordering) | Medium (scheduling, monitoring)
Error handling | Immediate: caller gets error response | Deferred: dead letter queue, replay | Deferred: reconciliation reports
Data freshness | Real-time (current) | Near-real-time (seconds behind) | Stale (minutes to hours behind)
Infrastructure | API endpoint only | Message broker / event mesh | Job scheduler, bulk API access
Cost (relative) | High (per-call API charges, always-on) | Medium (broker licensing, compute) | Low (off-peak compute, batched calls)
Reliability | Low (cascading failures) | High (broker persistence, replay) | High (retry entire batch, reconcile)
Best for | User-facing, low-volume, critical decisions | Multi-system sync, change propagation | High-volume loads, reporting, migrations
Worst for | High-volume data loads, background jobs | Simple point-to-point, low frequency | Time-sensitive operations, user-facing
Pattern-to-ERP Surface Mapping
Pattern | Salesforce | SAP S/4HANA | Oracle ERP Cloud | Dynamics 365 | NetSuite
Real-time | REST/Composite API | OData v4, BAPI (on-prem) | REST API | Dataverse Web API | SuiteTalk REST, RESTlets
Event-driven | CDC, Platform Events, Pub/Sub API | Event Mesh, Business Events | Business Events, OIC | Business Events, Webhooks | User Event Scripts
Batch/Bulk | Bulk API 2.0 | IDoc, BTP Integration | FBDI, ESS jobs | DMF, Dual Write | CSV Import, SuiteQL
File-based | Data Loader CSV | BTP file upload | FBDI (CSV/XML) | DMF packages | CSV Import
Step-by-Step Integration Guide
1. Audit and categorize all data flows
Enumerate every integration point. For each, capture: direction, volume, latency requirement, and business criticality. [src4]
# Integration audit spreadsheet structure
Flow ID | Source System | Target System | Direction | Records/Day | Latency Need | Business Impact | Pattern
F-001 | Salesforce | NetSuite | Outbound | 500 | < 5 min | High (revenue) | Event-Driven
F-002 | Warehouse | Salesforce | Inbound | 50,000 | Nightly OK | Medium | Batch
F-003 | Website | ERP | Inbound | 200 | < 1s | Critical (order)| Real-Time
Verify: Every integration point has a pattern assignment. No flow should be left as "TBD."
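The verification step above can be automated once the audit lives in a structured form. A minimal sketch, assuming the spreadsheet rows are loaded into a hypothetical `Flow` record and that the three pattern names are spelled as in the example table:

```python
from dataclasses import dataclass
from typing import List

# Assumed spellings, taken from the Pattern column of the audit example.
VALID_PATTERNS = {"Real-Time", "Event-Driven", "Batch"}


@dataclass
class Flow:
    flow_id: str
    source: str
    target: str
    records_per_day: int
    latency_need: str
    pattern: str


def unassigned_flows(flows: List[Flow]) -> List[str]:
    """Returns the IDs of flows whose pattern is missing, 'TBD', or misspelled."""
    return [f.flow_id for f in flows if f.pattern not in VALID_PATTERNS]
```

An empty result means every flow has a pattern assignment; anything else is the list of flows to revisit before moving to step 2.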
2. Apply the decision tree to each flow
Walk each flow through the decision tree. The four questions: (1) Is someone waiting? (2) Is it change-triggered? (3) What is the volume? (4) What is the error tolerance? [src4]
3. Design error handling per pattern
Each pattern requires a different error handling strategy: circuit breakers for real-time, dead letter queues for event-driven, reconciliation reports for batch. [src2, src3]
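For the real-time case, a circuit breaker stops calling a failing ERP and lets the caller switch to a fallback path. The sketch below is illustrative only: `CircuitBreaker`, `CircuitOpenError`, and the defaults (5 consecutive failures, 60-second recovery window) are assumptions, not a particular library's API.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is skipped."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 5, recovery_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._failure_threshold = failure_threshold
        self._recovery_s = recovery_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._recovery_s:
                raise CircuitOpenError("circuit open; use fallback path")
            # Recovery window elapsed: let one trial call through (half-open).
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._failures >= self._failure_threshold:
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller would wrap each synchronous ERP request in `breaker.call(...)` and treat `CircuitOpenError` as the signal to queue the request or return a degraded response.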
4. Validate rate limit headroom
Calculate daily API consumption per pattern and compare against ERP limits. Leave at least 30% headroom for spikes. [src1]
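The headroom check is simple arithmetic and can be scripted. A minimal sketch, assuming you already know the daily API limit for your edition and an estimated call count per pattern; `api_headroom` and its parameters are hypothetical names:

```python
from typing import Dict


def api_headroom(daily_limit: int, calls_by_pattern: Dict[str, int],
                 min_headroom: float = 0.30) -> dict:
    """Compares estimated daily API calls against the limit."""
    if daily_limit <= 0:
        raise ValueError("daily_limit must be positive")
    used = sum(calls_by_pattern.values())
    headroom = (daily_limit - used) / daily_limit  # fraction of quota left unused
    return {"used": used, "headroom": headroom, "ok": headroom >= min_headroom}
```

For example, a 100,000-call daily limit with 60,000 estimated calls leaves 40% headroom and passes, while 80,000 estimated calls leaves 20% and fails the 30% rule.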
Code Examples
Python: Event-driven integration with retry and dead letter queue
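# Input: Event payload from message broker (Kafka, RabbitMQ, or platform events)
# Output: Processed event or dead-letter routing
import logging
import time

MAX_RETRIES = 5
BACKOFF_BASE = 2  # seconds

logger = logging.getLogger(__name__)

# Assumed exception types; map them to your API client's real errors.
class RateLimitError(Exception):
    """Transient failure (for example HTTP 429); safe to retry."""

class PermanentError(Exception):
    """Non-retryable failure (for example a validation error)."""

def route_to_dead_letter(event, reason):
    # Placeholder: publish the event to your broker's dead letter queue here.
    logger.error("Dead-lettering event %s: %s", event.get("event_id", "unknown"), reason)
    return {"status": "dead_letter", "reason": reason}

def process_event_with_retry(event, target_api_client):
    event_id = event.get("event_id", "unknown")
    for attempt in range(MAX_RETRIES):
        try:
            result = target_api_client.upsert(
                object_type=event["object_type"],
                external_id_field=event["external_id_field"],
                external_id=event["external_id"],
                payload=event["data"],
            )
            return {"status": "success", "result": result}
        except RateLimitError:
            # Exponential backoff: 1s, 2s, 4s, 8s, 16s
            logger.warning("Rate limited on event %s (attempt %d)", event_id, attempt + 1)
            time.sleep(BACKOFF_BASE ** attempt)
        except PermanentError as e:
            return route_to_dead_letter(event, str(e))
    return route_to_dead_letter(event, f"Exhausted {MAX_RETRIES} retries")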
JavaScript/Node.js: Batch integration with chunking
// Input: Array of records to upsert via Bulk API
// Output: Job status with success/failure counts
// Assumes bulkClient.createJob() and a pollJobCompletion(bulkClient, jobId) helper
// that resolves with the final job info; both stand in for your Bulk API wrapper.
const CHUNK_SIZE = 10000;
const MAX_CONCURRENT_JOBS = 3;

async function batchUpsert(records, bulkClient, objectType) {
  const chunks = [];
  for (let i = 0; i < records.length; i += CHUNK_SIZE) {
    chunks.push(records.slice(i, i + CHUNK_SIZE));
  }
  const results = { success: 0, failed: 0, errors: [] };
  // Run at most MAX_CONCURRENT_JOBS bulk jobs at a time
  for (let i = 0; i < chunks.length; i += MAX_CONCURRENT_JOBS) {
    const batch = chunks.slice(i, i + MAX_CONCURRENT_JOBS);
    const promises = batch.map(chunk =>
      bulkClient.createJob({
        object: objectType, operation: "upsert",
        externalIdFieldName: "External_ID__c", data: chunk
      }).then(job => pollJobCompletion(bulkClient, job.id))
    );
    const batchResults = await Promise.allSettled(promises);
    for (const r of batchResults) {
      if (r.status === "fulfilled") {
        // In Salesforce Bulk API 2.0 job info, numberRecordsProcessed
        // includes failed records, so subtract them to count successes
        results.success += r.value.numberRecordsProcessed - r.value.numberRecordsFailed;
        results.failed += r.value.numberRecordsFailed;
      } else {
        results.errors.push(r.reason.message);
      }
    }
  }
  return results;
}
cURL: Test event subscription (Salesforce CDC via CometD)
Batch integrations must handle full-vs-delta loads — always prefer delta (changed records only). [src1]
Event-driven integrations receive changes but not the full record — subscribers may need to call back for related data. [src1]
Real-time integrations across time zones must normalize timestamps — Salesforce stores UTC, SAP stores in user timezone. [src3]
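The delta-load advice in the first point above is usually implemented with a watermark: each run extracts only records modified since the previous run, then advances the watermark. A minimal sketch, assuming each record carries a timezone-aware UTC `last_modified` value; `select_delta` and the field name are hypothetical:

```python
from datetime import datetime
from typing import Iterable, List, Tuple


def select_delta(records: Iterable[dict],
                 watermark: datetime) -> Tuple[List[dict], datetime]:
    """Returns records changed after the watermark, plus the new watermark."""
    changed = [r for r in records if r["last_modified"] > watermark]
    # Advance to the newest change seen; keep the old value if nothing changed.
    new_watermark = max((r["last_modified"] for r in changed), default=watermark)
    return changed, new_watermark
```

Persist the returned watermark only after the load succeeds, so a failed run is retried over the same window. Normalizing every timestamp to UTC before comparison also avoids the time-zone mismatch described in the last point.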
Error Handling & Failure Points
Common Error Patterns by Integration Type
Pattern | Error Type | Frequency | Impact | Resolution
Real-time | 429 Rate Limit | Common at scale | Blocked transactions | Exponential backoff; move to batch if persistent
Real-time | Timeout (504/408) | Occasional | Hung transactions | Circuit breaker; async fallback
Real-time | ERP down (503) | Rare but critical | Cascading failure | Circuit breaker; queue and retry
Event-driven | Missed events | Rare | Data inconsistency | Replay from last known ID; periodic reconciliation
Event-driven | Duplicate events | Common | Duplicate records | Idempotent receivers (external ID-based upsert)
Event-driven | Out-of-order events | Common | Data corruption | Sequence numbers; last-modified-wins logic
Batch | Partial failure | Common | Incomplete sync | Per-record error logging; retry failed subset
Batch | Job timeout | Occasional | No data synced | Chunk into smaller jobs; extend batch window
Failure Points in Production
Real-time cascading failure: ERP response times degrade from 200ms to 30s under load. Fix: Circuit breaker with 5-failure threshold and 60s recovery window. [src3]
Event allocation exhaustion: Batch data load triggers CDC events that consume entire 50K/day allocation. Fix: Disable CDC on objects during bulk loads, or purchase high-volume add-on. [src1]
Replay window expiry: Subscriber offline for 4 days; Platform Events only retain for 72 hours. Fix: Periodic full reconciliation independent of event stream. [src1]
Batch window overrun: Nightly job grows from 2h to 8h as data volume increases. Fix: Monitor duration trends; adaptive chunking; hard cutoff with checkpoint resume. [src5]
Idempotency failure in event processing: Event delivered twice creates duplicate record. Fix: Always use upsert with external ID; never use insert for event-driven. [src2]
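The fix for batch window overrun (hard cutoff with checkpoint resume) can be sketched as a loop that stops starting new chunks once the time budget is spent and reports where to resume. The names `run_batch_with_cutoff` and `process_chunk`, and the default chunk size and cutoff, are illustrative assumptions.

```python
import time
from typing import Callable, Sequence


def run_batch_with_cutoff(records: Sequence,
                          process_chunk: Callable[[Sequence], None],
                          start_index: int = 0,
                          chunk_size: int = 1000,
                          cutoff_s: float = 3600.0,
                          clock: Callable[[], float] = time.monotonic) -> int:
    """Processes chunks until done or the cutoff passes; returns the checkpoint.

    The checkpoint is the index of the first unprocessed record. Pass it back
    as start_index on the next run to resume where this run stopped.
    """
    started = clock()
    index = start_index
    while index < len(records):
        if clock() - started >= cutoff_s:
            break  # hard cutoff: do not start another chunk
        chunk = records[index:index + chunk_size]
        process_chunk(chunk)
        index += len(chunk)
    return index
```

The cutoff is checked between chunks, so a chunk that is already running will finish; size chunks small enough that one chunk cannot overrun the window by itself.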
Anti-Patterns
Wrong: Defaulting to real-time for all integrations
# BAD: synchronous API call for 50K-record nightly product catalog sync
for product in all_products:  # 50,000 products
    response = erp_api.update_product(product)  # 1 API call per product
    if response.status_code == 429:
        time.sleep(60)  # Takes days to complete
Correct: Use batch/bulk for high-volume scheduled operations
# GOOD: Bulk API for the same 50K-record sync
for chunk in chunks(all_products, 10000):
    csv_data = build_csv(chunk)
    job = bulk_client.create_job("Product2", "upsert", "External_ID__c")
    bulk_client.upload_data(job.id, csv_data)
    bulk_client.close_job(job.id)
# 5 bulk jobs, completes in minutes
Wrong: Using batch for time-sensitive financial decisions
# BAD: hourly batch check for credit limits
# Between batches, exceeded-credit customers can still place orders
def check_credit(customer_id, order_amount):
    cached_limit = cache.get(f"credit:{customer_id}")  # Up to 1 hour stale
    return order_amount <= cached_limit
Correct: Use real-time for financial-impact decisions
# GOOD: real-time credit check at point of order
def check_credit(customer_id, order_amount):
    response = erp_api.get(f"/accounts/{customer_id}/credit_status")
    if response.status_code != 200:
        return False  # Fail-closed: deny if ERP unreachable
    credit = response.json()
    return order_amount <= (credit["limit"] - credit["used"])
Wrong: Event-driven without idempotency
# BAD: INSERT on every event (duplicates on redelivery)
def handle_order_event(event):
    db.execute("INSERT INTO orders (erp_id, amount) VALUES (?, ?)",
               event["order_id"], event["amount"])
Correct: Event-driven with idempotent upsert
# GOOD: UPSERT on external ID (safe for redelivery)
def handle_order_event(event):
    db.execute("""INSERT INTO orders (erp_id, amount, last_event_ts)
                  VALUES (?, ?, ?)
                  ON CONFLICT (erp_id) DO UPDATE SET amount = EXCLUDED.amount,
                      last_event_ts = EXCLUDED.last_event_ts
                  WHERE orders.last_event_ts < EXCLUDED.last_event_ts""",
               event["order_id"], event["amount"], event["timestamp"])
Common Pitfalls
Treating "real-time" as a default: 80% of integration flows work fine with batch or event-driven, at lower cost and higher reliability. Fix: Require a justification for every real-time flow: "Who is waiting, and what happens if they wait 5 minutes?" [src4]
Ignoring batch window growth: Jobs that take 30 minutes today will take 3 hours in a year as data grows. Fix: Monitor batch duration weekly; implement adaptive chunking with hard cutoff times. [src5]
No reconciliation layer: Event-driven treated as sole source of truth; missed events cause silent data divergence. Fix: Daily full reconciliation independent of event stream. [src2]
Mixing patterns on the same data object: Different patterns have different ordering guarantees. Fix: One primary pattern per data object per direction; implement sequence numbers if mixing. [src1]
Undersizing event infrastructure: Default allocations exceeded at peak load. Fix: Load-test at 3x expected peak; provision based on peak, not average. [src2]
Not accounting for API limit sharing: Real-time and batch API calls share the same daily quota. Fix: Reserve 30% for real-time; use Bulk API (separate limit pool) for batch. [src1]
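The reconciliation layer recommended above can start as a plain comparison of two snapshots keyed by external ID. A minimal sketch, assuming each side is exported as a dictionary of external ID to a comparable value (an amount, a row hash, or a last-modified stamp); `reconcile` is a hypothetical helper:

```python
from typing import Dict, Hashable, List


def reconcile(source: Dict[str, Hashable],
              target: Dict[str, Hashable]) -> Dict[str, List[str]]:
    """Lists IDs that are missing, orphaned, or different between two systems."""
    return {
        # In the source but never arrived at the target (for example a missed event).
        "missing_in_target": sorted(k for k in source if k not in target),
        # In the target with no source record (for example a missed delete).
        "orphaned_in_target": sorted(k for k in target if k not in source),
        # Present on both sides with different values.
        "mismatched": sorted(k for k in source
                             if k in target and source[k] != target[k]),
    }
```

Running this daily, independent of the event stream, turns silent divergence into a report that can drive targeted re-syncs.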
CDC for record changes, Platform Events for custom | Event Mesh (cloud-first) | Business Events + OIC subscriptions | Dataverse webhooks | User Event Scripts
Hybrid maturity | High | High (Cloud), Medium (ECC) | Medium-High | Medium-High | Medium
Important Caveats
Rate limits and event allocations vary dramatically by ERP edition and licensing tier — always verify against your specific contract and current release notes.
"Real-time" in ERP integration typically means 200ms-2s latency, not sub-millisecond — if you need <50ms, consider caching or data virtualization.
Event-driven patterns provide eventual consistency, not strong consistency — if your business process requires ACID guarantees across systems, use the saga pattern.
Cloud ERP editions increasingly restrict legacy integration surfaces (RFC, BAPI, SOAP) — verify that your chosen pattern is supported in your target deployment model.
The hybrid approach (real-time + event-driven + batch) is the industry consensus for 2025-2026, but introduces three different error handling strategies to manage. Budget accordingly.
The data integration market is projected to grow from $15.18B in 2024 to $30.27B by 2030, so pattern selection is becoming a strategic capability, not just a technical decision.