Change Data Capture (CDC) for ERP Integration — Debezium, GoldenGate, and Cloud ERP Patterns
How does Change Data Capture work for ERP integration - Debezium, GoldenGate, cloud ERP limitations?
TL;DR
Bottom line: Log-based CDC (Debezium, GoldenGate) is the gold standard for on-premise ERP databases because it reads transaction logs with near-zero impact on the source. Cloud SaaS ERPs (Salesforce, Workday, NetSuite) block log access entirely; you must use vendor-native event streams or API-based CDC instead.
Key limit: Cloud ERPs cannot do log-based CDC. Salesforce retains CDC events for only 3 days, SAP requires an SLT proxy for table-level CDC, and NetSuite and Workday offer no native CDC at all.
Watch out for: Attempting log-based CDC on a SaaS ERP you do not control; it is architecturally impossible. Also, GoldenGate licensing ($17,500/processor on both source AND target) catches teams off guard.
Best for: Real-time, event-driven replication from ERP databases where sub-second latency and zero-loss delivery matter more than implementation simplicity.
Authentication: N/A at CDC level — Debezium uses DB credentials; GoldenGate uses OS-level DB access; Salesforce CDC uses OAuth 2.0 via Pub/Sub API.
System Profile
This card covers Change Data Capture as an integration architecture pattern across multiple ERP systems. It compares log-based CDC tools (Debezium, Oracle GoldenGate), vendor-native CDC (Salesforce CDC, SAP ODP/SLT), and explains why cloud SaaS ERPs fundamentally limit what CDC methods are available. CDC method availability depends entirely on whether you have database-level access.
| System | CDC Method Available | Tool | DB Access Required? | Latency |
|---|---|---|---|---|
| Oracle EBS (on-prem) | Log-based (redo logs) | GoldenGate, Debezium | Yes (supplemental logging) | Sub-second |
| SAP S/4HANA (on-prem) | Log-based + ODP | SLT, Debezium, ADF SAP CDC | Yes (DB) or No (ODP via RFC) | Seconds |
| SAP S/4HANA Cloud | ODP only | Azure Data Factory SAP CDC | No (ODP/RFC only) | Seconds to minutes |
| Salesforce | Vendor-native event stream | Salesforce CDC (Pub/Sub API) | No (no DB access) | Sub-second |
| NetSuite | None native; API polling only | Custom (SuiteTalk/REST) | No (no DB access) | Minutes |
| Workday | None native; API-based only | Workday RaaS | No (no DB access) | Minutes to hours |
| Dynamics 365 | Dataverse Change Tracking | Dataverse API / Azure Synapse Link | No (API-level only) | Minutes |
| Custom ERP (PostgreSQL) | Log-based (WAL) | Debezium | Yes (logical replication) | Sub-second |
| Custom ERP (MySQL) | Log-based (binlog) | Debezium | Yes (binlog access) | Sub-second |
API Surfaces & Capabilities
| CDC Tool/Method | Protocol | Best For | Max Throughput | Latency | Open Source? | Cost |
|---|---|---|---|---|---|---|
| Debezium (Kafka Connect) | Kafka / HTTP (Server) | On-prem DB CDC at scale | 100K+ events/sec | Sub-second | Yes (Apache 2.0) | Infrastructure only |
| Oracle GoldenGate | Trail files / REST API | Oracle-to-Oracle, high-volume | 100K+ events/sec | Sub-second | No | $17,500/processor |
| Salesforce CDC | Pub/Sub gRPC / CometD | SF record change tracking | Edition-dependent | Sub-second | N/A (platform feature) | Included in Enterprise+ |
| SAP ODP/SLT | RFC / OData | SAP table and CDS extraction | SLT sizing dependent | Seconds | No | SAP licensing |
| ADF SAP CDC | ODP via RFC | SAP-to-Azure delta extraction | IR sizing dependent | Minutes | No | ADF pricing |
| Query-based polling | REST/SOAP API | Simple, low-volume, any ERP | API rate limited | Minutes to hours | N/A | API call costs |
| Trigger-based CDC | DB triggers | Legacy, no log access | Low (trigger overhead) | Seconds | N/A | Dev cost |
Rate Limits & Quotas
Debezium Limits
| Limit Type | Value | Applies To | Notes |
|---|---|---|---|
| Max connectors per cluster | No hard limit | Kafka Connect | Bounded by cluster resources |
| Kafka message max size | 1 MB default | All connectors | Configure max.message.bytes for large rows |
| Heartbeat interval | 0 (disabled) by default | All connectors | Set heartbeat.interval.ms to a positive value for low-traffic tables |
| Oracle LogMiner batch | 10,000 rows default | Oracle connector | log.mining.batch.size.default |
| Slot replication lag | Monitoring required | PostgreSQL | Unbounded WAL growth if the connector falls behind |
Oracle GoldenGate Limits
| Limit Type | Value | Applies To | Notes |
|---|---|---|---|
| Trail file size | 2 GB default | Extract/Replicat | Configurable; split large transactions |
| Supplemental logging overhead | 5-15% write increase | Source Oracle DB | Required; cannot be avoided |
| Max transaction size | Limited by trail disk | Large batch ops | >10M row transactions need tuning |
Salesforce CDC Limits
| Limit Type | Value | Window | Edition Differences |
|---|---|---|---|
| Event retention | 3 days | Rolling | Same across all editions |
| Max entities per channel | 5 (custom channel) | Per channel | Standard channel covers all |
| CometD connections | Edition-based | Concurrent | Enterprise: 2,000 clients |
SAP ODP/SLT Limits
| Limit Type | Value | Applies To | Notes |
|---|---|---|---|
| ODP delta queue retention | Configurable (default 24h) | All subscribers | Old deltas purged after consumption |
| SLT replication tables | Resource bound | SLT server | Each table requires a logging table |
Authentication
| Tool | Auth Method | Credentials | Refresh? | Notes |
|---|---|---|---|---|
| Debezium (PostgreSQL) | DB user + REPLICATION role | Username/password or SSL | N/A | Dedicated replication user |
| Debezium (MySQL) | REPLICATION SLAVE + CLIENT | Username/password | N/A | Also needs SELECT |
| Debezium (Oracle) | LogMiner privileges | Username/password | N/A | V$LOG, V$LOGFILE access |
| Oracle GoldenGate | OS + GG credential store | DB + GG admin creds | N/A | Extract runs as OS user |
| Salesforce CDC | OAuth 2.0 (JWT/Web Server) | Connected App + cert | Yes (2h) | Pub/Sub API or CometD |
| SAP ODP (via ADF) | SAP RFC user | SAP username/password | N/A | Self-hosted IR required |
Authentication Gotchas
Debezium PostgreSQL: replication user needs REPLICATION role AND LOGIN privilege. Abandoned slots cause WAL disk exhaustion. [src1]
GoldenGate: forgetting table-level supplemental logging causes silent data loss — updates captured without full column context. [src3]
Salesforce: OAuth tokens expire; Pub/Sub subscriber disconnection risks data loss beyond 3-day retention window. [src5]
Constraints
Cloud SaaS ERPs block log-based CDC entirely: Salesforce, Workday, and NetSuite do not expose transaction logs, so running Debezium or GoldenGate against them is architecturally impossible.
GoldenGate dual-processor licensing: License required on BOTH source and target processors. 8 processors = $140,000 list price before support.
Salesforce CDC 3-day retention: Events not consumed within 3 days are permanently deleted. Downtime > 3 days requires full re-sync.
Debezium requires Kafka or Debezium Server: Minimum viable Kafka cluster: 3 brokers + KRaft. Debezium Server needs a sink target.
Oracle supplemental logging overhead: 5-15% redo log volume increase. Impacts I/O and archive storage on high-write databases.
SAP SLT licensing: SLT is a separate product — not included in base S/4HANA license. Required for application table CDC.
PostgreSQL WAL retention risk: Debezium replication slot prevents WAL cleanup. Connector downtime can fill disk and crash the database.
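The dual-processor licensing arithmetic in the constraints above can be sketched as a short calculation. The list price and the 22% support rate are the figures quoted in this card; the function name and the source/target split are illustrative, and the sketch ignores discounts and any core-factor adjustments.

```python
LIST_PRICE_PER_PROCESSOR = 17_500  # USD list price quoted in this card
SUPPORT_RATE = 0.22                # annual support rate quoted in this card


def goldengate_list_cost(source_processors: int, target_processors: int) -> dict:
    """Estimate list-price licensing when source AND target are both licensed."""
    processors = source_processors + target_processors
    license_cost = processors * LIST_PRICE_PER_PROCESSOR
    annual_support = round(license_cost * SUPPORT_RATE)
    return {
        "processors": processors,
        "license": license_cost,
        "annual_support": annual_support,
    }


# Hypothetical deployment: 4 processors on the source, 4 on the target.
estimate = goldengate_list_cost(source_processors=4, target_processors=4)
print(estimate)
```

Counting non-production environments the same way is what pushes real estimates well past the first figure.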
Integration Pattern Decision Tree
START: Need CDC from an ERP system
├── Do you have direct database access?
│   ├── YES (on-premise / IaaS with DB admin)
│   │   ├── Oracle → Debezium (LogMiner) or GoldenGate
│   │   │   ├── Budget allows $17,500+/processor? → GoldenGate (best Oracle native)
│   │   │   └── No / heterogeneous targets → Debezium (free, Kafka ecosystem)
│   │   ├── PostgreSQL → Debezium (logical replication, WAL)
│   │   ├── MySQL/MariaDB → Debezium (binlog)
│   │   ├── SQL Server → Debezium (SQL Server CDC feature)
│   │   └── Db2 → Debezium (Db2 connector)
│   └── NO (SaaS / cloud ERP, no DB access)
│       ├── Salesforce → Salesforce CDC (Pub/Sub API)
│       ├── SAP S/4HANA Cloud → ODP via ADF SAP CDC connector
│       ├── SAP ECC → ODP via SLT (requires SLT license)
│       ├── NetSuite → API polling (no native CDC)
│       ├── Workday → RaaS polling or Integration Cloud
│       └── Dynamics 365 → Dataverse Change Tracking API
└── Error tolerance?
    ├── Zero-loss → Exactly-once (Debezium 3.3+) + dead letter queue
    └── At-least-once → Default Debezium / GoldenGate behavior
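The first branch of the tree can be encoded as a small lookup, which may help when the choice has to be made programmatically (for example in an integration inventory script). This is only a sketch: the recommendations are copied from the tree, while the function name, the lowercase keys, and the fallback message are assumptions.

```python
# Recommendations copied from the decision tree; keys are illustrative.
LOG_BASED = {
    "oracle": "Debezium (LogMiner) or GoldenGate",
    "postgresql": "Debezium (logical replication, WAL)",
    "mysql": "Debezium (binlog)",
    "mariadb": "Debezium (binlog)",
    "sql server": "Debezium (SQL Server CDC feature)",
    "db2": "Debezium (Db2 connector)",
}
VENDOR_NATIVE = {
    "salesforce": "Salesforce CDC (Pub/Sub API)",
    "sap s/4hana cloud": "ODP via ADF SAP CDC connector",
    "sap ecc": "ODP via SLT (requires SLT license)",
    "netsuite": "API polling (no native CDC)",
    "workday": "RaaS polling or Integration Cloud",
    "dynamics 365": "Dataverse Change Tracking API",
}


def recommend_cdc(system: str, has_db_access: bool) -> str:
    """Pick a CDC approach: log-based with DB access, vendor-native without."""
    table = LOG_BASED if has_db_access else VENDOR_NATIVE
    return table.get(system.strip().lower(), "Not covered by this card")


print(recommend_cdc("PostgreSQL", has_db_access=True))
print(recommend_cdc("NetSuite", has_db_access=False))
```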
Quick Reference
CDC Method Comparison
| Method | Mechanism | Latency | DB Impact | Deletes? | Complexity | Cost |
|---|---|---|---|---|---|---|
| Log-based (Debezium) | Transaction log tailing | Sub-second | Minimal | Yes | Medium-High | Free + Kafka infra |
| Log-based (GoldenGate) | Oracle redo log mining | Sub-second | Minimal | Yes | High | $17,500/processor |
| Vendor-native (SF CDC) | Platform event bus | Sub-second | None | Yes | Low | Included |
| Vendor-native (SAP ODP) | ODP delta queue | Seconds | Low | Yes | Medium | SAP + SLT license |
| Query-based polling | Timestamp/ID filter | Minutes | High | No | Low | API call costs |
| Trigger-based | DB triggers + shadow table | Seconds | High | Yes | Medium | Dev cost |
Tool Selection Matrix
| Factor | Debezium | GoldenGate | Salesforce CDC | SAP ODP/SLT | Polling |
|---|---|---|---|---|---|
| License cost | Free (Apache 2.0) | $17,500/processor | Included | SAP license | Free |
| Infrastructure | Kafka cluster | GG hub | None (SaaS) | SLT server | None |
| Supported sources | 11 databases | Oracle primary | Salesforce only | SAP only | Any with API |
| Exactly-once | Yes (v3.3+) | Bounded recovery | At-most-once | At-least-once | N/A |
| Schema evolution | Auto-detect | DDL replication | Automatic | CDS-dependent | N/A |
| Operational complexity | Medium | High | Low | Medium | Low |
Step-by-Step Integration Guide
1. Set up Debezium for PostgreSQL-backed ERP
Deploy Debezium via Kafka Connect to capture changes from a PostgreSQL-backed ERP database. [src1]
# Enable logical replication: wal_level=logical, max_replication_slots=4
psql -c "CREATE ROLE debezium_user WITH REPLICATION LOGIN PASSWORD 'secure_password';"
psql -c "GRANT SELECT ON ALL TABLES IN SCHEMA public TO debezium_user;"
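With the replication role in place, the connector itself is typically registered through the Kafka Connect REST API. The following is a minimal sketch, assuming Debezium 2.x or later property names (`topic.prefix`), the `pgoutput` plugin, and a Connect worker on `localhost:8083`; the connector name, hostname, database name, and table list are hypothetical placeholders.

```python
import json
import urllib.request


def build_postgres_connector(name: str, host: str, dbname: str, password: str) -> dict:
    """Build a Kafka Connect registration payload for the Debezium PostgreSQL connector."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
            "plugin.name": "pgoutput",             # built-in logical decoding plugin
            "database.hostname": host,
            "database.port": "5432",
            "database.user": "debezium_user",      # role created above
            "database.password": password,
            "database.dbname": dbname,
            "topic.prefix": "erp",                 # assumption: prefix for topic names
            "slot.name": "debezium_erp",           # slot monitored later in this guide
            "table.include.list": "public.orders", # hypothetical table
        },
    }


def register_connector(payload: dict, connect_url: str = "http://localhost:8083") -> int:
    """POST the payload to Kafka Connect and return the HTTP status code."""
    request = urllib.request.Request(
        f"{connect_url}/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


payload = build_postgres_connector("erp-postgres", "erp-db.internal", "erp", "secure_password")
print(json.dumps(payload, indent=2))
# register_connector(payload)  # run only when a Kafka Connect worker is reachable
```

In practice the password would come from a secret store or a Connect config provider rather than being passed inline.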
Verify: Output contains JSON with "op": "c" (create), "op": "u" (update), or "op": "d" (delete).
4. Monitor replication slot health
Prevent WAL disk exhaustion by monitoring replication slot lag. [src1]
SELECT slot_name, active,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS lag_size
FROM pg_replication_slots WHERE slot_name = 'debezium_erp';
Verify: lag_size should be < 1 GB under normal operation.
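The 1 GB guideline can be turned into an automated check. This is a sketch rather than a complete monitor: it assumes the rows come from any PostgreSQL driver running the query below (which returns raw bytes instead of a pretty-printed size), and the status labels are illustrative.

```python
# Same query as above, but returning bytes so the value can be compared numerically.
SLOT_LAG_SQL = """
SELECT slot_name, active,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS lag_bytes
FROM pg_replication_slots
"""

ONE_GIB = 1024 ** 3


def classify_slot(active: bool, lag_bytes, limit_bytes: int = ONE_GIB) -> str:
    """Map one replication slot row to a status label."""
    if lag_bytes is None:
        return "UNKNOWN"  # slot has no restart_lsn yet
    if lag_bytes >= limit_bytes:
        return "ALERT"    # WAL is piling up behind this slot
    if not active:
        return "WARN"     # no consumer attached; lag will grow
    return "OK"


# Hypothetical rows as a driver would return them: (slot_name, active, lag_bytes).
rows = [
    ("debezium_erp", True, 48 * 1024 ** 2),
    ("old_connector", False, 3 * ONE_GIB),
]
for slot_name, active, lag_bytes in rows:
    print(slot_name, classify_slot(active, lag_bytes))
```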
Code Examples
Python: Subscribe to Salesforce CDC events via Pub/Sub API
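A minimal sketch of the handler side of such a subscriber. It assumes the gRPC subscription and Avro decoding are handled elsewhere and that each event arrives as a plain dict carrying the standard `ChangeEventHeader` fields (`entityName`, `changeType`, `recordIds`); the function name and the output shape are illustrative.

```python
def summarize_change_event(event: dict) -> dict:
    """Reduce a decoded Salesforce change event to what a sync job needs."""
    header = event["ChangeEventHeader"]
    change_type = header["changeType"]
    return {
        "entity": header["entityName"],
        "record_ids": list(header["recordIds"]),
        "change_type": change_type,
        # Gap events (GAP_CREATE, GAP_UPDATE, ...) carry no field data,
        # so the affected records must be re-read through the API.
        "needs_refetch": change_type.startswith("GAP_"),
        "fields": {
            name: value
            for name, value in event.items()
            if name != "ChangeEventHeader" and value is not None
        },
    }


# Hypothetical decoded event for an Account update.
sample_event = {
    "ChangeEventHeader": {
        "entityName": "Account",
        "changeType": "UPDATE",
        "recordIds": ["001xx0000000001AAA"],
    },
    "Name": "Acme",
    "Phone": None,
}
summary = summarize_change_event(sample_event)
print(summary)
```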
PostgreSQL WAL disk exhaustion: Debezium replication slot prevents WAL cleanup. Fix: Monitor pg_replication_slots lag; set max_slot_wal_keep_size (PG 13+); alert at 50% disk. [src1]
Oracle supplemental logging disabled on new table: Silent data corruption. Fix: Run ALTER TABLE <table> ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS alongside the DDL that creates the table. [src3]
Salesforce subscriber falls behind 3-day window: Lost events, no replay. Fix: Persist CDC events to Kafka immediately; implement gap-triggered full re-sync. [src5]
Schema change breaks Avro serialization: Non-nullable column without default. Fix: Use BACKWARD compatible Schema Registry; add columns as nullable. [src1]
GoldenGate trail file disk full: Replicat falls behind Extract. Fix: Monitor trail lag; set PURGEOLDEXTRACTS; alert when lag > 1 hour. [src3]
SAP ODP delta queue overflow: Subscriber too slow. Fix: Configure queue retention in ODQMON; increase pipeline frequency. [src4]
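The gap-triggered full re-sync mentioned for Salesforce can be sketched as a simple decision on how long the subscriber has been silent. The 3-day window is the retention stated in this card; the function name, the action labels, and the 6-hour safety margin are assumptions.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=3)  # Salesforce CDC event retention per this card


def recovery_action(last_event_at: datetime, now: datetime,
                    safety_margin: timedelta = timedelta(hours=6)) -> str:
    """Decide how a subscriber should recover after downtime."""
    gap = now - last_event_at
    if gap >= RETENTION:
        return "full_resync"    # events in the gap are already deleted
    if gap >= RETENTION - safety_margin:
        return "replay_urgent"  # replay now, before events age out
    return "replay"             # resume from the stored replay ID


now = datetime(2026, 1, 10, tzinfo=timezone.utc)
print(recovery_action(now - timedelta(hours=1), now))
print(recovery_action(now - timedelta(days=4), now))
```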
Anti-Patterns
Wrong: Polling an ERP database on a timer
# BAD: constant query load, misses deletes, timestamp gaps lose data
def poll_for_changes(cur):
    cur.execute("SELECT * FROM orders WHERE updated_at > NOW() - INTERVAL '60 seconds'")
    return cur.fetchall()  # hard deletes never appear; latency is at least the poll interval
Correct: Log-based CDC with Debezium
# GOOD: zero query load, captures deletes, sub-second latency
for message in kafka_consumer:
    event = message.value
    if event is None:  # tombstone that follows a delete; nothing to apply
        continue
    op = event['op']  # c=create, u=update, d=delete, r=snapshot read
    if op in ('c', 'u', 'r'):
        sync_to_target(event['after'])
    elif op == 'd':
        delete_from_target(event['before'])
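To make the apply logic concrete, here is a self-contained sketch that replays Debezium-style envelopes into an in-memory dict standing in for the target system. It assumes JSON-decoded values, tolerates the `payload` wrapper that appears when converter schemas are enabled, and uses a hypothetical `id` primary-key column.

```python
def unwrap(value):
    """Return the change envelope, or None for a tombstone record."""
    if value is None:
        return None
    # With converter schemas enabled, the envelope sits under "payload".
    return value["payload"] if "payload" in value else value


def apply_change(target: dict, value, key_field: str = "id") -> dict:
    """Apply one change event to the target keyed by primary key."""
    envelope = unwrap(value)
    if envelope is None:
        return target
    op = envelope["op"]
    if op in ("c", "u", "r"):  # create, update, snapshot read
        row = envelope["after"]
        target[row[key_field]] = row
    elif op == "d":            # delete: only "before" is populated
        target.pop(envelope["before"][key_field], None)
    return target


events = [
    {"op": "r", "before": None, "after": {"id": 1, "status": "open"}},
    {"payload": {"op": "u", "before": {"id": 1, "status": "open"},
                 "after": {"id": 1, "status": "paid"}}},
    {"op": "c", "before": None, "after": {"id": 2, "status": "open"}},
    {"op": "d", "before": {"id": 2, "status": "open"}, "after": None},
    None,  # tombstone emitted after the delete
]
orders = {}
for value in events:
    apply_change(orders, value)
print(orders)
```

Because each write is keyed by primary key, replaying the same events after a restart leaves the target in the same state, which is what makes at-least-once delivery tolerable.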
Wrong: Using GoldenGate for non-Oracle targets
# BAD — 8 processors * $17,500 = $140,000 + 22% annual support
# Paying Oracle license for non-Oracle target database
Wrong: Treating Salesforce CDC as durable event stream
# BAD — Events retained 3 days only; downtime > 3 days = data loss
subscribe_to("/data/AccountChangeEvent") # No durability guarantee
Correct: Buffer Salesforce CDC into Kafka
# GOOD: persist to Kafka immediately so retention is under your control
for event in sf_cdc_events:
    kafka_producer.send("sf-account-changes", event)
# Events now outlive the 3-day Salesforce window, subject to the topic's retention settings
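A slightly fuller sketch of the buffering step, keyed by record ID so that changes to the same record stay ordered within one partition. It assumes a producer object exposing `send(topic, key=..., value=...)` as kafka-python's `KafkaProducer` does; the `RecordingProducer` stand-in exists only so the example runs without a broker.

```python
import json


def buffer_change_event(producer, event: dict, topic: str = "sf-account-changes") -> bytes:
    """Forward one decoded Salesforce change event to Kafka, keyed by record ID."""
    record_ids = event["ChangeEventHeader"]["recordIds"]
    key = record_ids[0].encode("utf-8")
    producer.send(topic, key=key, value=json.dumps(event).encode("utf-8"))
    return key


class RecordingProducer:
    """Test double that records sends instead of talking to a broker."""

    def __init__(self):
        self.sent = []

    def send(self, topic, key=None, value=None):
        self.sent.append((topic, key, value))


producer = RecordingProducer()
event = {
    "ChangeEventHeader": {"entityName": "Account", "changeType": "UPDATE",
                          "recordIds": ["001xx0000000001AAA"]},
    "Name": "Acme",
}
buffer_change_event(producer, event)
print(producer.sent[0][0], producer.sent[0][1])
```

For the buffer to outlast the Salesforce window, the topic's `retention.ms` has to be set longer than 3 days (or to -1 for no time-based deletion).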
Common Pitfalls
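Orphaned replication slots: Deleting a Debezium connector does NOT drop the PG replication slot. Fix: Always run pg_drop_replication_slot() after deleting the connector. [src1]
Initial snapshot overwhelms production: Reads all data and can lock tables (MySQL global read lock). Fix: Run the snapshot off-peak. On PostgreSQL, the lock-free behavior formerly selected with snapshot.mode=exported is what snapshot.mode=initial does in current Debezium releases; incremental snapshots spread the load further. [src1]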
Not testing at production volume: Works at 1K rows, fails at 100M. Fix: Load-test with production-scale data; size Kafka partitions for peak change rate. [src1]
Schema evolution without compatibility: Non-nullable column breaks Avro. Fix: Use Schema Registry BACKWARD compatibility; add columns as nullable. [src1]
GoldenGate non-prod licensing: Licenses apply to dev/staging/prod. Fix: Include all environments in license count or use Debezium for non-prod. [src2]
SAP ODP without SLT for tables: Application table CDC requires SLT proxy. Fix: Deploy SLT for table CDC; use direct ODP for CDS views. [src4]
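The orphaned-slot cleanup from the first pitfall can be sketched as a helper that turns the contents of `pg_replication_slots` into drop statements for review. The function name and the allow-list of slots still in use are assumptions; only inactive slots that are not on the allow-list are proposed for removal.

```python
import re

SLOT_NAME = re.compile(r"[a-z0-9_]+")  # characters PostgreSQL permits in slot names


def drop_statements(slots, in_use) -> list:
    """Build pg_drop_replication_slot() calls for inactive, unexpected slots.

    slots: iterable of (slot_name, active) rows from pg_replication_slots.
    in_use: set of slot names that belong to live connectors.
    """
    statements = []
    for name, active in slots:
        if active or name in in_use:
            continue
        if not SLOT_NAME.fullmatch(name):
            raise ValueError(f"unexpected slot name: {name!r}")
        statements.append(f"SELECT pg_drop_replication_slot('{name}');")
    return statements


rows = [("debezium_erp", True), ("debezium_old", False), ("standby_1", False)]
for statement in drop_statements(rows, in_use={"standby_1"}):
    print(statement)
```

Printing the statements instead of executing them keeps a human in the loop, since dropping a slot that a paused connector still needs forces a new snapshot.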
Diagnostic Commands
# Check all Debezium connector statuses
curl -s http://localhost:8083/connectors | jq -r '.[]' | while read -r c; do
  curl -s "http://localhost:8083/connectors/$c/status" | jq --arg name "$c" '{name: $name, state: .connector.state}'
done
# PostgreSQL replication slot health
psql -c "SELECT slot_name, active, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS lag FROM pg_replication_slots;"
# Kafka consumer group lag
kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group erp-sync-group --describe
# Salesforce CDC event delivery check
curl -s "https://instance.salesforce.com/services/data/v66.0/limits" \
-H "Authorization: Bearer $TOKEN" | jq '.DailyDeliveredPlatformEvents'
# SAP: Transaction ODQMON (delta queues), SLT_DASHBOARD (replication status)
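The connector status check reports only the connector-level state, while failures often show up on individual tasks. A minimal sketch of parsing the same `/connectors/<name>/status` response to surface both; the function name is illustrative and the sample document is hypothetical.

```python
def failed_components(status: dict) -> list:
    """List (component, state) pairs that are not RUNNING in a Kafka Connect status document."""
    failures = []
    if status["connector"]["state"] != "RUNNING":
        failures.append(("connector", status["connector"]["state"]))
    for task in status.get("tasks", []):
        if task["state"] != "RUNNING":
            failures.append((f"task-{task['id']}", task["state"]))
    return failures


# Hypothetical response body: connector is up, but its only task has failed.
sample_status = {
    "name": "erp-postgres",
    "connector": {"state": "RUNNING", "worker_id": "10.0.0.5:8083"},
    "tasks": [{"id": 0, "state": "FAILED", "worker_id": "10.0.0.5:8083"}],
}
print(failed_components(sample_status))
```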
Version History & Compatibility
| Tool | Version | Release | Status | Key Changes |
|---|---|---|---|---|
| Debezium | 3.4.0 | 2025-12 | Current | MariaDB GA, CockroachDB incubating |
| Debezium | 3.3.0 | 2025-10 | Supported | Exactly-once for all core connectors |
| Debezium | 3.0.0 | 2024-10 | Supported | Java 17+, Kafka 3.x baseline |
| Debezium | 2.7.x | 2024-06 | EOL | Last Java 11 compatible |
| GoldenGate | 23ai | 2024 | Current | Microservices architecture, REST API |
| GoldenGate | 21c | 2021 | Supported | Classic + Microservices |
| Salesforce CDC | API v66.0 | 2026-02 | Current | Spring '26; Pub/Sub API preferred |
| ADF SAP CDC | 2025-02 | 2025-02 | GA | ODP framework, CDS views, SLT |
When to Use / When Not to Use
| Use When | Don't Use When | Use Instead |
|---|---|---|
| Real-time replication from on-prem ERP with DB access | SaaS ERP with no DB access | API polling with timestamp filters |
| Need to capture DELETEs | Only need INSERT/UPDATE tracking | Timestamp-based polling |
| High change volume (>10K/day) | Low volume (<100/day) | Simple REST API polling |
| Kafka already deployed | No Kafka and no budget for it | Debezium Server to Kinesis/Pub/Sub |
| Oracle-to-Oracle with Oracle support needed | Heterogeneous targets | Debezium (free, multi-target) |
| Salesforce record change tracking | Need >3 day replay window | Buffer SF CDC to Kafka |
Cross-System Comparison
| Capability | Debezium | GoldenGate | Salesforce CDC | SAP ODP/SLT | Polling |
|---|---|---|---|---|---|
| CDC Method | Log-based | Log-based | Vendor events | App-layer delta | Query-based |
| Latency | Sub-second | Sub-second | Sub-second | Seconds | Minutes |
| Captures Deletes | Yes | Yes | Yes | Yes | No |
| Delivery | Exactly-once (3.3+) | At-least-once | At-most-once | At-least-once | Best-effort |
| DB Impact | Minimal | Minimal | None | Low | High |
| License Cost | Free | $17,500/proc | Included | SAP + SLT | Free |
| Sources | 11 databases | Oracle primary | Salesforce only | SAP only | Any with API |
| Infrastructure | Kafka cluster | GG hub | None | SLT server | None |
| Event Retention | Configurable | Disk-bound | 3 days | Configurable | N/A |
| Multi-Target | Native (Kafka) | Limited | Limited | One per queue | N/A |
Important Caveats
Cloud ERP CDC gap is widening — more SaaS migrations mean fewer systems where log-based CDC is possible. Each vendor has unique event APIs with no standardized interface.
Debezium is not zero-ops — requires Kafka cluster management, Schema Registry, connector monitoring, replication slot cleanup, and snapshot planning.
GoldenGate cost is often underestimated — dual-processor licensing, 22% annual support, and non-production environment requirements can triple expected cost.
Exactly-once is still young — Debezium 3.3+ requires Kafka transactions on both sides. Most sink connectors do not yet support transactional consumption.
CDC is not a backup — streams forward from a point in time. Does not replace database backups, PITR, or disaster recovery strategies.