Canonical Data Model Design for Multi-ERP Integration
How do you design a canonical data model for multi-system ERP integration?
TL;DR
- Bottom line: A canonical data model (CDM) replaces N*(N-1) directed point-to-point translations with 2*N adapter mappings (one to canonical and one from canonical per system) by defining a single, ERP-independent schema that all systems map to and from. It is essential when integrating 3+ systems and overkill for 2.
- Key limit: Start with 6-12 core attributes per entity (Customer, Order, Product) — over-engineering the model upfront is the #1 cause of failed CDM initiatives.
- Watch out for: Basing your canonical model on one ERP's schema (typically SAP or Salesforce) — this creates hidden coupling and forces every other system into that vendor's data model.
- Best for: Multi-ERP environments with 3+ systems exchanging Customer, Product, Order, Invoice, or Payment data through middleware or event-driven architecture.
- Versioning: Use semantic versioning (major.minor.patch) with backward-compatible additive changes; breaking changes require a new major version with a deprecation window.
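The mapping arithmetic behind the bottom line can be checked with a short sketch. It counts undirected system pairs, directed point-to-point translations, and canonical mappings for a given number of systems; the function name and the returned keys are illustrative, not part of any tool.

```python
def mapping_counts(n_systems: int) -> dict:
    """Count integration mappings for n systems under both topologies."""
    links = n_systems * (n_systems - 1) // 2   # undirected system pairs
    directed = n_systems * (n_systems - 1)     # one translation per direction
    canonical = 2 * n_systems                  # to-canonical + from-canonical per system
    return {"links": links, "point_to_point": directed, "canonical": canonical}


for n in (2, 3, 5, 8):
    print(n, mapping_counts(n))
```

Under this counting, 2 systems need fewer direct translations than canonical mappings, 3 systems break even, and the canonical approach pulls ahead from 4 systems onward.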
System Profile
This is an architecture pattern card that applies across all major ERP systems. The canonical data model pattern is ERP-agnostic by design — it sits between systems as a neutral intermediary. The specific system adapters (transformers to/from canonical format) are system-specific, but the canonical schema itself is independent.
This card covers the design, versioning, and governance of canonical schemas for Customer, Product, Order, Invoice, Payment, and Employee entities across SAP S/4HANA, Oracle NetSuite, Dynamics 365, Salesforce, and Oracle ERP Cloud. It does NOT cover master data management (match/merge/survivorship), which is a separate discipline.
| System | Role | API Surface | Direction |
|---|---|---|---|
| SAP S/4HANA | ERP — financial/logistics master | OData v4, RFC/BAPI | Bidirectional |
| Oracle NetSuite | ERP — mid-market financial/inventory | REST, SuiteTalk, SuiteQL | Bidirectional |
| Microsoft Dynamics 365 F&O | ERP — finance/supply chain | OData v4, Web API, DMF | Bidirectional |
| Salesforce | CRM — customer/opportunity master | REST v62.0, Composite, Platform Events | Bidirectional |
| Oracle ERP Cloud | ERP — enterprise financial/procurement | REST, FBDI, Business Events | Bidirectional |
| iPaaS / Middleware | Integration orchestrator | Varies | Orchestrator |
API Surfaces & Capabilities
The canonical model is not an API surface itself — it is a schema contract that sits between system-specific adapters. Each system exposes its own API surfaces that the adapters consume.
| System | Primary API for Read | Primary API for Write | Event/CDC Support | Canonical Adapter Complexity |
|---|---|---|---|---|
| SAP S/4HANA | OData v4 (CDS views) | OData v4, BAPI | Business Events, AIF | High — deep customization, variant fields |
| NetSuite | SuiteQL, REST | SuiteTalk SOAP, REST | User Event Scripts, SuiteScript | Medium — flexible but governance-limited |
| Dynamics 365 | Web API (OData v4) | Web API, DMF | Dataverse webhooks, Dual Write | Medium — OData well-structured |
| Salesforce | REST API, SOQL | REST API, Composite | Platform Events, CDC | Medium — well-documented but governor limits |
| Oracle ERP Cloud | REST API | REST API, FBDI | Business Events, BICC | High — FBDI for bulk, REST for real-time |
Rate Limits & Quotas
Rate limits apply per-system at the adapter layer, not at the canonical model layer. Canonical model design must account for the slowest/most-restrictive system in the chain.
Adapter Throughput Planning
| System | Effective Read Rate | Effective Write Rate | Bulk Import | Bottleneck |
|---|---|---|---|---|
| SAP S/4HANA | ~5,000 records/min (OData) | ~1,000 records/min (BAPI) | IDocs (batch), BAPI mass | Custom code governor limits |
| NetSuite | ~2,000 records/min (SuiteQL) | ~500 records/min (SuiteTalk) | CSV import (500K records) | Governance units (10,000/script) |
| Dynamics 365 | ~10,000 records/min (Web API) | ~5,000 records/min (Web API) | DMF (millions) | Throttling at 6,000 req/5min |
| Salesforce | ~2,000 records/req (SOQL) | 200 records/req (Composite) | Bulk API 2.0 (150MB/file) | 100K API calls/24h (Enterprise) |
| Oracle ERP Cloud | ~5,000 records/min (REST) | ~1,000 records/min (REST) | FBDI (250MB/file) | Fair-use throttling |
Design Implication
When designing canonical transformations, batch size must match the most constrained system. If Salesforce limits you to 200 records per Composite request, your canonical batch processor should chunk at 200 even if SAP can handle 5,000 per call. [src2]
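A minimal sketch of that chunking rule, assuming per-request limits are kept in a simple lookup. The two figures are the ones quoted in the paragraph above; the dictionary and function names are hypothetical.

```python
from typing import Dict, Iterator, List, Sequence

# Per-request write limits taken from the paragraph above; extend per adapter.
WRITE_BATCH_LIMITS: Dict[str, int] = {"sfdc": 200, "sap-s4": 5000}


def chunk_for_targets(
    records: Sequence[dict],
    targets: List[str],
    limits: Dict[str, int] = WRITE_BATCH_LIMITS,
) -> Iterator[Sequence[dict]]:
    """Yield batches sized for the most constrained target system."""
    size = min(limits[target] for target in targets)
    for start in range(0, len(records), size):
        yield records[start:start + size]


batches = list(chunk_for_targets([{"n": i} for i in range(450)], ["sfdc", "sap-s4"]))
print([len(batch) for batch in batches])
```

Sizing by the minimum keeps one canonical batch valid for every target, at the cost of more round trips to the less constrained systems.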
Authentication
Authentication is handled per-system adapter, not at the canonical model layer. Each adapter authenticates independently with its target system.
| System | Recommended Auth | Token Lifetime | Notes |
|---|---|---|---|
| SAP S/4HANA | OAuth 2.0 (SAP BTP) or X.509 | Session-based | Principal propagation for user-context |
| NetSuite | Token-Based Authentication (TBA) | Persistent tokens | OAuth 2.0 available but TBA simpler for S2S |
| Dynamics 365 | OAuth 2.0 Client Credentials | 1h access token | Azure AD app registration required |
| Salesforce | OAuth 2.0 JWT Bearer | 2h session | Connected app with digital certificate |
| Oracle ERP Cloud | OAuth 2.0 or Basic Auth | Token-based | Basic Auth still common for FBDI/ESS jobs |
Authentication Gotchas
- Canonical model transformations running as scheduled jobs must handle token refresh for ALL connected systems — if one token expires mid-batch, the entire canonical sync fails [src2]
- SAP OAuth tokens via BTP have different scopes than direct S/4HANA tokens — ensure the adapter's service account has the correct authorization objects [src5]
- Salesforce JWT bearer flow tokens are org-scoped — the integration user's profile determines which fields the adapter can read/write, potentially causing silent canonical mapping gaps [src2]
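One way to avoid mid-batch expiry is to check token freshness before every call rather than once per job. The sketch below assumes each adapter supplies a hypothetical fetcher callable that returns a token and its lifetime in seconds; the real OAuth or TBA exchange is out of scope here.

```python
import time
from typing import Callable, Dict, Tuple

Fetcher = Callable[[], Tuple[str, float]]  # returns (token, lifetime_seconds)


class TokenCache:
    """Caches one token per system and refreshes shortly before expiry."""

    def __init__(self, fetchers: Dict[str, Fetcher],
                 skew_seconds: float = 60.0,
                 clock: Callable[[], float] = time.time):
        self._fetchers = fetchers
        self._skew = skew_seconds      # refresh this long before expiry
        self._clock = clock            # injectable for tests
        self._tokens: Dict[str, Tuple[str, float]] = {}

    def get(self, system: str) -> str:
        token, expires_at = self._tokens.get(system, ("", 0.0))
        if self._clock() >= expires_at - self._skew:
            token, lifetime = self._fetchers[system]()
            expires_at = self._clock() + lifetime
            self._tokens[system] = (token, expires_at)
        return token
```

Calling `cache.get("d365")` before each request keeps a long batch from failing when a one-hour token lapses partway through.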
Constraints
- Not a merge of source schemas: A canonical model is a NEW model designed from business semantics, not a union of all fields from all systems. Merging schemas produces an unmaintainable "god model." [src2]
- Minimum 3 systems to justify: For 2 systems, direct field mapping has lower overhead and equivalent maintainability. CDM overhead pays off at 3+ systems. [src1, src4]
- Schema versioning is mandatory: Without semantic versioning, a single field rename breaks every adapter simultaneously. Use BACKWARD compatibility mode as default. [src5, src7]
- Additive-only evolution: Normal changes must add optional fields. Removing required fields, changing types, or renaming without aliases are breaking changes requiring a new major version. [src7]
- Domain ownership of adapters: Each system's adapter must be owned by that system's domain team. A central team owning all adapters becomes a bottleneck and single point of failure. [src5]
- Anti-Corruption Layer at boundaries: Do not force canonical formats inside bounded contexts. Use ACL adapters at domain boundaries. [src5]
Integration Pattern Decision Tree
START: User needs to share data across 3+ ERP systems
├── How many systems exchange data?
│   ├── 2 systems only
│   │   └── Direct field mapping: CDM overhead not justified
│   │       (But plan for CDM if 3rd system expected within 12 months)
│   ├── 3-5 systems
│   │   └── Canonical Data Model recommended
│   │       ├── Start with core entities (Customer, Order, Product)
│   │       ├── 6-12 attributes per entity initially
│   │       └── Add entities/attributes incrementally per use case
│   └── 6+ systems
│       └── Canonical Data Model essential
│           ├── Invest in schema registry (Confluent, Apicurio, AWS Glue)
│           ├── Formal governance with domain stewards
│           └── CI/CD compatibility checks on every schema change
├── What approach to canonical ownership?
│   ├── Centralized (one team owns all) → Anti-pattern for >3 systems
│   ├── Federated (domain adapters, governance board) → Recommended
│   └── Hybrid (central canonical, domain adapters) → Good starting point
├── What schema format?
│   ├── JSON Schema → REST APIs, event payloads, iPaaS
│   ├── Avro → Kafka event streaming, schema registry native
│   ├── Protobuf → gRPC, high-performance serialization
│   └── XML Schema (XSD) → SOAP, EDI, legacy middleware
└── What evolution strategy?
    ├── Additive only (minor versions) → default for all changes
    ├── Breaking change (major version) → deprecation window + migration plan
    └── Transform-on-read → store canonical, transform at consumption
Quick Reference
Canonical Entity to System-Specific Mapping
| Canonical Entity | SAP S/4HANA | NetSuite | Dynamics 365 | Salesforce | Oracle ERP Cloud |
|---|---|---|---|---|---|
| Customer | BusinessPartner (BP) | customer | Account (Dataverse) | Account (sObject) | hz_parties / hz_cust_accounts |
| Product | A_Product | item | Product (Released) | Product2 | egp_system_items |
| Order | A_SalesOrder | salesOrder | SalesOrderHeader | Opportunity / Order | doo_order_headers |
| Invoice | BillingDocument | invoice | CustInvoiceJour | Invoice (custom) | ra_customer_trx |
| Payment | FI Document (BKPF/BSEG) | customerPayment | CustPaymJournalTrans | Payment (custom) | ap_checks / ar_cash_receipts |
| Employee | A_Employee | employee | HcmWorker | Contact / User | per_all_people_f |
| Vendor/Supplier | A_Supplier | vendor | VendTable | Account (Supplier type) | poz_suppliers |
| Address | A_BPAddress | address (subrecord) | LogisticsPostalAddress | Address (compound) | hz_locations |
Step-by-Step Integration Guide
1. Inventory systems and identify core entities
Map every system in your integration landscape and identify which business entities flow between them. Start with the highest-volume, highest-value entities. [src3]
Priority Matrix:
| Entity | Systems Sharing It | Volume/Day | Priority |
|-----------|-------------------|------------|----------|
| Customer | 5 (all ERPs + CRM)| 500 creates| P0 |
| Order | 4 (CRM + 3 ERPs) | 2,000 | P0 |
| Product | 4 (PIM + 3 ERPs) | 50 updates | P1 |
| Invoice | 3 (ERP + billing) | 5,000 | P1 |
| Payment | 2 (ERP + bank) | 1,000 | P2 |
Verify: Each entity appears in 3+ systems (otherwise use direct mapping for that entity).
2. Design the canonical schema per entity
For each P0 entity, define the canonical schema with 6-12 core attributes. Use JSON Schema (for REST/event payloads) or Avro (for Kafka streaming). [src3, src5]
# Validate canonical schema using ajv-cli
npm install -g ajv-cli
ajv validate -s canonical/Customer.v1.json -d sample-customer.json
# Expected: valid
Verify: Schema validates against sample payloads from each source system.
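As an illustration of what canonical/Customer.v1.json might contain, here is a sketch expressed as a Python dict. The field names come from this card; the enum values, the currency pattern, and the choice of required fields are assumptions pieced together from the examples elsewhere in the card. The `missing_required` helper is only a stdlib pre-check, not a substitute for a real JSON Schema validator.

```python
import json

CUSTOMER_SCHEMA_V1 = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "title": "CanonicalCustomer",
    "type": "object",
    "required": ["canonicalId", "legalName", "customerType",
                 "currency", "status", "schemaVersion"],
    "properties": {
        "canonicalId": {"type": "string"},
        "legalName": {"type": "string", "minLength": 1},
        "customerType": {"enum": ["organization", "individual"]},  # assumed values
        "primaryEmail": {"type": ["string", "null"]},
        "currency": {"type": "string", "pattern": "^[A-Z]{3}$"},   # ISO 4217 style
        "status": {"enum": ["active", "blocked", "inactive"]},     # assumed values
        "systemReferences": {
            "type": "array",
            "items": {"type": "object", "required": ["system", "externalId"]},
        },
        "schemaVersion": {"type": "string"},
    },
}


def missing_required(payload: dict, schema: dict = CUSTOMER_SCHEMA_V1) -> list:
    """Return required fields absent from the payload (pre-check only)."""
    return [name for name in schema["required"] if name not in payload]


print(json.dumps(CUSTOMER_SCHEMA_V1, indent=2)[:60])
print(missing_required({"legalName": "Acme Corporation"}))
```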
3. Build system-specific adapters (Anti-Corruption Layers)
Each adapter translates between one system's native format and the canonical format. Adapters are owned by domain teams. [src5]
# Helpers (resolve_canonical_id, map_sf_type, normalize_e164, map_sf_status)
# are assumed to be defined elsewhere in the adapter module.
def salesforce_to_canonical(sf_account: dict) -> dict:
    """Transform Salesforce Account to Canonical Customer."""
    return {
        "canonicalId": resolve_canonical_id("sfdc", sf_account["Id"]),
        "legalName": sf_account["Name"],
        "customerType": map_sf_type(sf_account.get("Type", "Prospect")),
        "primaryEmail": sf_account.get("PersonEmail"),
        "primaryPhone": normalize_e164(sf_account.get("Phone")),
        "currency": sf_account.get("CurrencyIsoCode", "USD"),
        "status": map_sf_status(sf_account.get("Account_Status__c", "Active")),
        "systemReferences": [{
            "system": "sfdc",
            "externalId": sf_account["Id"],
            "entityType": "Account",
        }],
        "schemaVersion": "1.0.0",
    }
Verify: Run adapter against 100 real records, validate output against canonical schema.
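The adapter above relies on helpers such as normalize_e164 and map_sf_status. The following is a deliberately naive sketch of how two of them could look. The default country code, the length bounds, and the picklist values in SF_STATUS_MAP are all assumptions; a production adapter would more likely use a dedicated phone-number library and the org's real picklist values.

```python
import re
from typing import Optional

# Hypothetical picklist values for the custom Account_Status__c field.
SF_STATUS_MAP = {"Active": "active", "On Hold": "blocked", "Inactive": "inactive"}


def normalize_e164(raw: Optional[str], default_country_code: str = "1") -> Optional[str]:
    """Best-effort E.164 normalization; returns None for unusable input."""
    if not raw:
        return None
    has_plus = raw.strip().startswith("+")
    digits = re.sub(r"\D", "", raw)
    if not has_plus:
        # Assume a national number and prepend the default country code.
        digits = default_country_code + digits.lstrip("0")
    # E.164 allows at most 15 digits; the lower bound is an arbitrary sanity check.
    if not 8 <= len(digits) <= 15:
        return None
    return "+" + digits


def map_sf_status(value: str) -> str:
    """Map a Salesforce status to the canonical enum, failing loudly on gaps."""
    if value not in SF_STATUS_MAP:
        raise ValueError(f"Unmapped Salesforce status: {value!r}")
    return SF_STATUS_MAP[value]
```

Raising on an unmapped status, instead of defaulting to "active", makes mapping gaps visible during the 100-record verification run.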
4. Register schemas and enforce compatibility
Use a schema registry to store canonical schemas with version history and compatibility enforcement. [src7]
# Register schema in Confluent Schema Registry
curl -X POST http://schema-registry:8081/subjects/canonical-customer-value/versions \
-H "Content-Type: application/vnd.schemaregistry.v1+json" \
-d '{"schemaType":"JSON","schema":"..."}'
# Expected: {"id": 1}
# Set BACKWARD compatibility (new schemas can read old data)
curl -X PUT http://schema-registry:8081/config/canonical-customer-value \
-H "Content-Type: application/vnd.schemaregistry.v1+json" \
-d '{"compatibility":"BACKWARD"}'
Verify: Attempt to register an incompatible schema change — registry should reject it.
5. Implement canonical transformation pipeline
Wire adapters into your integration platform (iPaaS, Kafka, or custom middleware). [src1, src5]
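from jsonschema import validate  # pip install jsonschema

# CUSTOMER_SCHEMA is assumed to hold the canonical Customer JSON Schema, loaded at startup.
ADAPTERS = {
    "sfdc": salesforce_to_canonical,
    "sap-s4": sap_to_canonical,
    "netsuite": netsuite_to_canonical,
    "d365": dynamics_to_canonical,
}

def process_event(source_system: str, payload: dict) -> dict:
    adapter = ADAPTERS.get(source_system)
    if adapter is None:
        # Report ADAPTER_NOT_FOUND explicitly instead of failing on a None call
        raise LookupError(f"ADAPTER_NOT_FOUND: {source_system}")
    canonical = adapter(payload)
    validate(instance=canonical, schema=CUSTOMER_SCHEMA)  # fail fast
    return canonical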
Verify: Process a test event from each source system, confirm output validates.
6. Implement schema evolution workflow
When new fields are needed, follow the additive evolution process. [src7]
# Check compatibility before registering new version
curl -X POST http://schema-registry:8081/compatibility/subjects/canonical-customer-value/versions/latest \
-H "Content-Type: application/vnd.schemaregistry.v1+json" \
-d '{"schemaType":"JSON","schema":"..."}'
# Expected: {"is_compatible": true}
Verify: Old adapter output (without new field) still validates against new schema version.
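Before calling the registry, a quick local check can flag the obvious breaking changes named in the Constraints section: removed fields, changed types, and newly required fields. This sketch only inspects top-level properties of two JSON Schema dicts and is not equivalent to the registry's compatibility check; the function name is hypothetical.

```python
def classify_change(old: dict, new: dict) -> str:
    """Return 'breaking' or 'compatible' for a top-level JSON Schema change."""
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    removed = set(old_props) - set(new_props)
    retyped = {
        name for name in set(old_props) & set(new_props)
        if old_props[name].get("type") != new_props[name].get("type")
    }
    newly_required = set(new.get("required", [])) - set(old.get("required", []))
    if removed or retyped or newly_required:
        return "breaking"
    return "compatible"


v1_0 = {"properties": {"legalName": {"type": "string"}}, "required": ["legalName"]}
v1_1 = {"properties": {"legalName": {"type": "string"},
                       "industryCode": {"type": ["string", "null"]}},
        "required": ["legalName"]}
print(classify_change(v1_0, v1_1))
```

A "breaking" result would signal that the change belongs in a new major version rather than a minor one.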
Code Examples
Python: Bidirectional adapter with canonical validation
# Input: Source system payload (any connected ERP)
# Output: Validated canonical Customer or validated system-specific payload
import json

from jsonschema import ValidationError, validate  # pip install jsonschema


class CanonicalAdapter:
    def __init__(self, schema_path: str, system_name: str):
        with open(schema_path) as f:
            self.schema = json.load(f)
        self.system = system_name

    def validate_canonical(self, canonical: dict) -> bool:
        try:
            validate(instance=canonical, schema=self.schema)
            return True
        except ValidationError as e:
            raise ValueError(f"Validation failed for {self.system}: {e.message}") from e

    def _resolve_id(self, external_id: str) -> str:
        # Look up (or mint) the canonical ID via the cross-reference registry.
        raise NotImplementedError


class SAPCustomerAdapter(CanonicalAdapter):
    SAP_STATUS_MAP = {"1": "active", "2": "blocked", "3": "inactive"}

    def to_canonical(self, bp: dict) -> dict:
        canonical = {
            "canonicalId": self._resolve_id(bp["BusinessPartner"]),
            "legalName": bp["BusinessPartnerFullName"],
            "customerType": "organization" if bp.get("BusinessPartnerCategory") == "2" else "individual",
            "status": self.SAP_STATUS_MAP.get(bp.get("AuthorizationGroup", "1"), "active"),
            "schemaVersion": "1.0.0",
        }
        self.validate_canonical(canonical)
        return canonical
JavaScript/Node.js: Schema version migration utility
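// Input: Canonical payload in schema v1.0.0
// Output: Canonical payload migrated to schema v1.1.0
const Ajv = require("ajv");
const ajv = new Ajv();
// `schemas` is assumed to map version strings to loaded JSON Schemas.

const migrations = {
  "1.0.0->1.1.0": (payload) => ({
    ...payload,
    industryCode: null, // New optional field
    schemaVersion: "1.1.0",
  }),
};

function migratePayload(payload) {
  const key = `${payload.schemaVersion}->1.1.0`;
  const migrate = migrations[key];
  // Fail with a clear message when no migration path is registered
  if (!migrate) throw new Error(`No migration registered for ${key}`);
  const migrated = migrate(payload);
  const validate = ajv.compile(schemas["1.1.0"]);
  if (!validate(migrated)) throw new Error(JSON.stringify(validate.errors));
  return migrated;
}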
Data Mapping
Field Mapping Reference — Canonical Customer to System-Specific
| Canonical Field | SAP S/4HANA | NetSuite | Dynamics 365 | Salesforce | Gotcha |
|---|---|---|---|---|---|
| canonicalId | Z_CANONICAL_ID (custom) | custentity_canonical_id | cr_canonicalid (custom) | Canonical_ID__c | Must be indexed in every system for reverse lookup |
| legalName | BusinessPartnerFullName | companyName | Name (Account) | Name | SAP max 81 chars, NetSuite max 83, SF max 255 |
| customerType | BusinessPartnerCategory | isPerson (boolean) | RelationshipType | RecordTypeId | SAP uses codes, NetSuite boolean, SF record types |
| primaryEmail | EmailAddress (BP contact) | | EMailAddress | PersonEmail | SAP email on contact, not BP directly |
| currency | T001-WAERS (company code) | currency.refName | TransactionCurrencyId | CurrencyIsoCode | SAP company code default vs per-record |
| status | AuthorizationGroup | entityStatus.refName | StateCode (0/1) | Account_Status__c | Each system has different lifecycle states |
| taxId | TaxNumber1 | defaultTaxReg | VATNum | Tax_ID__c (custom) | Format varies by jurisdiction |
| addresses | A_BPAddress (1:N, time-dependent) | addressbook (sublist) | LogisticsPostalAddress | BillingAddress, ShippingAddress | SAP has complex address time-dependency |
Data Type Gotchas
- Datetime timezone mismatch: SAP stores in user timezone, Salesforce UTC, NetSuite company timezone, D365 UTC. All canonical timestamps MUST be UTC. [src2]
- Amount representation: SAP stores in smallest currency unit (cents) for some currencies; Salesforce/NetSuite store as decimal. Canonical uses decimal with explicit currency code. [src5]
- Boolean representation: SAP uses "X"/empty, NetSuite true/false, D365 uses 0/1, Salesforce true/false. Normalize to JSON boolean. [src4]
- Multi-value fields: Salesforce picklists use semicolons, SAP separate table entries, NetSuite arrays. Canonical uses JSON arrays. [src2]
- ID formats: SAP 10-digit zero-padded, Salesforce 18-char alphanumeric, NetSuite integer, D365 GUID. Canonical uses URN with UUID. [src5]
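The normalizations listed above can be sketched as small pure functions that adapters call before emitting canonical payloads. The function names and the "acme" URN namespace are illustrative; the time zone is passed in as a tzinfo object, so a `zoneinfo.ZoneInfo` instance can be supplied where a named zone is needed.

```python
import uuid
from datetime import datetime, timedelta, timezone, tzinfo
from typing import List, Optional


def sap_flag_to_bool(value: Optional[str]) -> bool:
    """SAP flags are 'X' for true and empty for false."""
    return value == "X"


def to_utc_iso(local: datetime, source_tz: tzinfo) -> str:
    """Attach the source zone to naive datetimes, then convert to UTC ISO 8601."""
    if local.tzinfo is None:
        local = local.replace(tzinfo=source_tz)
    return local.astimezone(timezone.utc).isoformat()


def split_multipicklist(value: Optional[str]) -> List[str]:
    """Salesforce multi-select picklists arrive as 'A;B;C'."""
    return [item for item in (value or "").split(";") if item]


def pad_sap_id(number: int) -> str:
    """Render a numeric ID in SAP's 10-digit zero-padded form."""
    return str(number).zfill(10)


def new_canonical_urn(entity: str, namespace: str = "acme") -> str:
    """Mint a canonical identifier as a URN wrapping a random UUID."""
    return f"urn:{namespace}:{entity}:{uuid.uuid4()}"


cet = timezone(timedelta(hours=1))  # fixed offset used only for this example
print(to_utc_iso(datetime(2024, 3, 1, 9, 30), cet))
```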
Error Handling & Failure Points
Common Error Codes
| Error | System | Meaning | Resolution |
|---|---|---|---|
| VALIDATION_FAILED | Canonical | Payload doesn't conform to schema | Check adapter transformation — usually missing required field or wrong type |
| SCHEMA_VERSION_MISMATCH | Canonical | schemaVersion doesn't match expected | Run migration utility to upgrade payload |
| DUPLICATE_CANONICAL_ID | Canonical | Same canonicalId with different systemReferences | Route to MDM for deduplication |
| ADAPTER_NOT_FOUND | Pipeline | No adapter for source system | Register adapter before connecting new system |
| INCOMPATIBLE_SCHEMA | Schema Registry | Change breaks backward compatibility | Revert; create new major version if needed |
| TRANSFORM_TIMEOUT | Adapter | Adapter took >30s to transform | Check source API response time; batch smaller |
| DEAD_LETTER_QUEUE_FULL | Pipeline | DLQ capacity exceeded | Alert ops; investigate high failure rate root cause |
Failure Points in Production
- Schema drift between environments: Dev/staging canonical schemas diverge from production after partial deployment. Fix: CI/CD pipeline that validates schema compatibility before deploying any adapter. [src7]
- Adapter version skew: Adapter A produces v1.1.0 but adapter B only understands v1.0.0. Fix: Under BACKWARD compatibility, upgrade consumers before producers, because BACKWARD only guarantees that the new schema can read old data; use FULL compatibility if old consumers must keep reading new data. [src7]
- Canonical ID resolution failure: Record exists in source but hasn't synced to canonical ID registry. Fix: Provisional ID assignment with reconciliation job within 15 minutes. [src5]
- Silent data loss on optional fields: Required source field mapped to optional canonical field, then target ignores null. Fix: Document field cardinality matrix per system. [src3]
- Timezone corruption: Adapter doesn't convert timezone before canonical write, so dates shift by hours. Fix: UTC-only policy in canonical; timezone validation in adapter tests. [src2]
Anti-Patterns
Wrong: Basing canonical model on one ERP's schema
# BAD — Using Salesforce Account fields as canonical model
canonical_customer = {
"Id": sf_account["Id"], # SF-specific 18-char ID
"Name": sf_account["Name"], # SF field name
"BillingStreet": sf_account["BillingStreet"], # SF compound address
"Type": sf_account["Type"], # SF picklist values
"OwnerId": sf_account["OwnerId"], # SF-specific concept
}
# Every other system must translate TO Salesforce's model.
# When SF changes a field, ALL adapters break.
Correct: Neutral canonical model from business semantics
# GOOD — Business-driven model independent of any system
canonical_customer = {
"canonicalId": "urn:acme:customer:550e8400-e29b-41d4-a716-446655440000",
"legalName": "Acme Corporation",
"customerType": "organization",
"status": "active",
"systemReferences": [
{"system": "sfdc", "externalId": "001xx000003DGbzAAG"},
{"system": "sap-s4", "externalId": "0001000042"},
{"system": "netsuite", "externalId": "12345"},
],
"schemaVersion": "1.0.0"
}
# Each system maps to/from a neutral model.
# Changing one system only affects that system's adapter.
Wrong: Trying to model everything upfront
# BAD — 200+ fields in canonical Customer from day one
canonical_customer_schema = {
"properties": {
"customerId": {}, "legalName": {}, "tradingName": {},
"dbaName": {}, "parentCompany": {}, "ultimateParent": {},
"dunsNumber": {}, "sic_code": {}, "naics_code": {},
# ... 180 more fields ...
}
}
# 6 months of design meetings, no integration delivered.
# 80% of fields are empty in most systems.
Correct: Start with core attributes, extend additively
# GOOD: 10 core fields, extend per use case
canonical_customer_v1 = {
"properties": {
"canonicalId": {}, # Immutable identifier
"legalName": {}, # Required
"customerType": {}, # Required
"primaryEmail": {}, # Optional
"currency": {}, # Required
"status": {}, # Required
"addresses": {}, # Array
"systemReferences": {}, # Cross-reference IDs
"createdAt": {}, # Audit trail
"schemaVersion": {}, # Evolution tracking
}
}
# v1.1 adds: industryCode, taxId (optional, backward compatible)
# v2.0 (breaking): changes addressType enum — requires migration
Wrong: Central team owns all adapters
# BAD — Integration CoE (5 people) owns everything
# Canonical schema + ALL adapters + pipeline + monitoring
# Result: 5 people become bottleneck. Average lead time: 6 weeks.
Correct: Federated ownership with governance board
# GOOD — Domain teams own adapters, governance board owns canonical
# CRM Team: Salesforce adapter
# Finance Team: SAP + NetSuite adapters
# Platform Team: Schema registry + CI/CD + monitoring
# Governance Board: Weekly 30-min schema review
# Result: Average lead time for new field: 1-2 weeks.
Common Pitfalls
- Over-engineering the initial model: Teams spend 6+ months designing a "complete" model before building adapters. Fix: Start with 3 entities, 6-12 fields each. Ship an adapter in 2-4 weeks. Add fields as use cases demand. [src3]
- No schema versioning from day one: First schema change breaks all consumers. Fix: Include schemaVersion as required field from the start. Register schemas with BACKWARD compatibility enforced. [src7]
- Testing against mock data only: Adapters fail on real data with nulls, encoding issues, edge cases. Fix: Test against anonymized production data extracts. Run nightly canonical validation against live data. [src5]
- Missing cross-reference ID registry: Systems can't resolve canonical IDs. Fix: Implement a dedicated cross-reference table (canonical_id, system, external_id) before any data flows. [src2]
- Schema evolution without deprecation windows: Breaking changes break consumers instantly. Fix: 30-day notice, migration guide, 90-day dual-version support, consumer migration verification. [src7]
- Treating canonical model as MDM hub: Teams expect CDM to deduplicate and manage golden records. Fix: CDM is a schema contract, not a data store. For dedup/golden records, use MDM. CDM feeds INTO MDM. [src2]
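The cross-reference table from the pitfalls above can be prototyped with SQLite before committing to a production store. This is a sketch under assumptions: the table name mirrors the one used in the diagnostic queries, the column list is reduced to the three fields named above, and `resolve_canonical_id` here takes an explicit connection, which differs from the two-argument helper used in the adapter example.

```python
import sqlite3
from typing import Optional


def open_registry(path: str = ":memory:") -> sqlite3.Connection:
    """Create the cross-reference table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS canonical_cross_references (
               canonical_id TEXT NOT NULL,
               system       TEXT NOT NULL,
               external_id  TEXT NOT NULL,
               PRIMARY KEY (system, external_id)
           )"""
    )
    return conn


def register(conn: sqlite3.Connection, canonical_id: str,
             system: str, external_id: str) -> None:
    """Link one system-specific ID to a canonical ID (idempotent)."""
    conn.execute(
        "INSERT OR REPLACE INTO canonical_cross_references VALUES (?, ?, ?)",
        (canonical_id, system, external_id),
    )
    conn.commit()


def resolve_canonical_id(conn: sqlite3.Connection,
                         system: str, external_id: str) -> Optional[str]:
    """Return the canonical ID for a system record, or None if unknown."""
    row = conn.execute(
        "SELECT canonical_id FROM canonical_cross_references "
        "WHERE system = ? AND external_id = ?",
        (system, external_id),
    ).fetchone()
    return row[0] if row else None
```

The primary key on (system, external_id) gives the indexed reverse lookup that every adapter needs on the inbound path.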
Diagnostic Commands
# Validate canonical payload against schema
ajv validate -s schemas/canonical/Customer.v1.json -d payload.json --all-errors
# Expected: payload.json valid
# List registered schema versions
curl -s http://schema-registry:8081/subjects/canonical-customer-value/versions | jq .
# Expected: [1, 2, 3]
# Check compatibility of new schema version
curl -s -X POST http://schema-registry:8081/compatibility/subjects/canonical-customer-value/versions/latest \
-H "Content-Type: application/vnd.schemaregistry.v1+json" \
-d '{"schemaType":"JSON","schema":"..."}' | jq .
# Expected: {"is_compatible": true}
-- Query cross-references for a canonical ID (SQL)
SELECT system, external_id, last_synced_at
FROM canonical_cross_references
WHERE canonical_id = 'urn:acme:customer:550e8400-...';
-- Count validation failures in last 24h (SQL)
SELECT source_system, COUNT(*) as failures
FROM canonical_validation_log
WHERE created_at > NOW() - INTERVAL '24 hours'
GROUP BY source_system ORDER BY failures DESC;
Version History & Compatibility
| Era | Approach | Schema Format | Governance | Status |
|---|---|---|---|---|
| 2003-2015 | Enterprise-wide XML canonical | XML Schema (XSD) | Central governance committee | Legacy |
| 2015-2020 | API-first canonical (REST/JSON) | JSON Schema, OpenAPI | API management platform | Mature |
| 2020-2024 | Event-driven canonical (streaming) | Avro, Protobuf, JSON Schema | Schema registry | Current |
| 2024-2026 | Domain-driven canonical (bounded contexts) | JSON Schema + AsyncAPI | Federated governance + CI/CD | Emerging |
Schema versions follow semantic versioning. Minor versions are perpetually backward compatible. Major versions receive minimum 90-day deprecation window with dual-version support. [src7]
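The versioning rule can be turned into a small guard that a consumer runs on the schemaVersion field of each payload. This sketch assumes the BACKWARD guarantee only: a consumer can read payloads from the same major version at an equal or older minor version. The function names are hypothetical.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split 'major.minor.patch' into integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def can_read(consumer_version: str, payload_version: str) -> bool:
    """True when the consumer's schema is guaranteed to read the payload."""
    consumer = parse_version(consumer_version)
    payload = parse_version(payload_version)
    same_major = consumer[0] == payload[0]
    # Patch level is ignored: only major and minor affect the contract.
    return same_major and consumer[:2] >= payload[:2]


print(can_read("1.1.0", "1.0.0"), can_read("1.0.0", "1.1.0"), can_read("2.0.0", "1.4.2"))
```

A False result is a cue to route the payload through a migration utility or to the dual-version path during the deprecation window.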
When to Use / When Not to Use
| Use When | Don't Use When | Use Instead |
|---|---|---|
| Integrating 3+ systems sharing Customer, Order, or Product data | Only 2 systems with stable schemas | Direct field mapping with transformation layer |
| Systems frequently replaced/upgraded (M&A, cloud migration) | All systems from same vendor (e.g., all-SAP) | Vendor-native integration (SAP CPI, Oracle SOA) |
| Multiple teams build integrations independently | Single integration team handles all flows | Point-to-point with shared mapping documentation |
| Event-driven architecture with Kafka/message bus | Simple batch ETL running once nightly | ETL/ELT with source-to-target mapping |
| Long-term strategic integration (3+ year horizon) | Short-term project or POC (< 6 months) | Direct mapping — refactor to CDM later |
Cross-System Comparison
How Each ERP Maps to Canonical Entities
| Capability | SAP S/4HANA | NetSuite | Dynamics 365 | Salesforce | Oracle ERP Cloud |
|---|---|---|---|---|---|
| Customer entity name | BusinessPartner (BP) | customer | Account | Account | HZ_PARTIES |
| Customer ID format | 10-digit zero-padded | Integer | GUID | 18-char alphanumeric | Number |
| Address model | Complex (time-dependent) | Sublist (addressbook) | Separate entity | Compound fields | HZ_LOCATIONS (shared) |
| Order header | A_SalesOrder | salesOrder | SalesOrderHeader | Opportunity or Order | DOO_ORDER_HEADERS |
| Line item model | A_SalesOrderItem | salesOrderItem.item | SalesOrderLine | OpportunityLineItem | DOO_ORDER_LINES |
| Product/Item | A_Product + A_ProductPlant | item | EcoResProduct | Product2 | EGP_SYSTEM_ITEMS |
| Amount precision | Up to 3 decimals | 2 decimals | Up to 10 decimals | 2 decimals | Currency-dependent |
| DateTime format | YYYYMMDD (legacy) / ISO 8601 | ISO 8601 | ISO 8601 | ISO 8601 | ISO 8601 |
| Null handling | Initial values ('' / 0) | null | null | null | null |
| Enum representation | Domain values (coded) | Picklist refs (internalId) | OptionSet (integer) | Picklist (string name) | Lookup codes (varchar) |
Important Caveats
- Canonical model is not MDM: CDM standardizes the schema; MDM deduplicates records, creates golden records, and manages survivorship. They are complementary, not interchangeable.
- Governance overhead is real: Without a governance board, canonical schemas diverge into "canonical" in name only — each team extends independently, producing incompatible versions.
- Performance impact of double transformation: Every exchange goes through two translations (source-to-canonical, canonical-to-target). For latency-sensitive flows (<100ms), consider direct mapping for that specific flow.
- Organizational alignment > technology: The biggest risk is not schema design but organizational buy-in. If domain teams refuse canonical adapters, the model becomes shelfware.
- Not all entities deserve canonical treatment: Low-volume, system-specific entities (Salesforce Campaigns, SAP Profit Centers) may not justify CDM. Apply selectively to high-frequency, cross-system entities.
- Schema registries add operational complexity: For teams without Kafka/event streaming, JSON Schema files in version control with CI/CD validation checks may suffice.