This is an architecture pattern card that applies across all major ERP systems. The canonical data model pattern is ERP-agnostic by design — it sits between systems as a neutral intermediary. The specific system adapters (transformers to/from canonical format) are system-specific, but the canonical schema itself is independent.
This card covers the design, versioning, and governance of canonical schemas for Customer, Product, Order, Invoice, Payment, and Employee entities across SAP S/4HANA, Oracle NetSuite, Dynamics 365, Salesforce, and Oracle ERP Cloud. It does NOT cover master data management (match/merge/survivorship), which is a separate discipline.
| System | Role | API Surface | Direction |
|---|---|---|---|
| SAP S/4HANA | ERP — financial/logistics master | OData v4, RFC/BAPI | Bidirectional |
| Oracle NetSuite | ERP — mid-market financial/inventory | REST, SuiteTalk, SuiteQL | Bidirectional |
| Microsoft Dynamics 365 F&O | ERP — finance/supply chain | OData v4, Web API, DMF | Bidirectional |
| Salesforce | CRM — customer/opportunity master | REST v62.0, Composite, Platform Events | Bidirectional |
| Oracle ERP Cloud | ERP — enterprise financial/procurement | REST, FBDI, Business Events | Bidirectional |
| iPaaS / Middleware | Integration orchestrator | Varies | Orchestrator |
The canonical model is not an API surface itself — it is a schema contract that sits between system-specific adapters. Each system exposes its own API surfaces that the adapters consume.
| System | Primary API for Read | Primary API for Write | Event/CDC Support | Canonical Adapter Complexity |
|---|---|---|---|---|
| SAP S/4HANA | OData v4 (CDS views) | OData v4, BAPI | Business Events, AIF | High — deep customization, variant fields |
| NetSuite | SuiteQL, REST | SuiteTalk SOAP, REST | User Event Scripts, SuiteScript | Medium — flexible but governance-limited |
| Dynamics 365 | Web API (OData v4) | Web API, DMF | Dataverse webhooks, Dual Write | Medium — OData well-structured |
| Salesforce | REST API, SOQL | REST API, Composite | Platform Events, CDC | Medium — well-documented but governor limits |
| Oracle ERP Cloud | REST API | REST API, FBDI | Business Events, BICC | High — FBDI for bulk, REST for real-time |
Rate limits apply per-system at the adapter layer, not at the canonical model layer. Canonical model design must account for the slowest/most-restrictive system in the chain.
| System | Effective Read Rate | Effective Write Rate | Bulk Import | Bottleneck |
|---|---|---|---|---|
| SAP S/4HANA | ~5,000 records/min (OData) | ~1,000 records/min (BAPI) | IDocs (batch), BAPI mass | Custom code governor limits |
| NetSuite | ~2,000 records/min (SuiteQL) | ~500 records/min (SuiteTalk) | CSV import (500K records) | Governance units (10,000/script) |
| Dynamics 365 | ~10,000 records/min (Web API) | ~5,000 records/min (Web API) | DMF (millions) | Throttling at 6,000 req/5min |
| Salesforce | ~2,000 records/req (SOQL) | 200 records/req (Composite) | Bulk API 2.0 (150MB/file) | 100K API calls/24h (Enterprise) |
| Oracle ERP Cloud | ~5,000 records/min (REST) | ~1,000 records/min (REST) | FBDI (250MB/file) | Fair-use throttling |
When designing canonical transformations, batch size must match the most constrained system. If Salesforce limits you to 200 records per Composite request, your canonical batch processor should chunk at 200 even if SAP can handle 5,000 per call. [src2]
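A minimal sketch of that chunking rule, assuming a hypothetical `WRITE_BATCH_LIMITS` lookup seeded with the two figures quoted above (200 for Salesforce, 5,000 for SAP). The function names are illustrative and not part of any vendor SDK.

```python
from itertools import islice
from typing import Iterable, Iterator

# Per-request write limits taken from the paragraph above; extend per system.
WRITE_BATCH_LIMITS = {"sfdc": 200, "sap-s4": 5000}

def effective_batch_size(target_systems: list[str]) -> int:
    """Return the smallest per-request limit among the systems in the chain."""
    return min(WRITE_BATCH_LIMITS[system] for system in target_systems)

def chunked(records: Iterable[dict], size: int) -> Iterator[list[dict]]:
    """Yield successive lists of at most `size` records."""
    iterator = iter(records)
    while batch := list(islice(iterator, size)):
        yield batch

records = [{"n": i} for i in range(450)]
size = effective_batch_size(["sfdc", "sap-s4"])
batch_sizes = [len(batch) for batch in chunked(records, size)]
print(size, batch_sizes)
```

With 450 records flowing to both systems, the processor emits batches of 200, 200, and 50.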
Authentication is handled per-system adapter, not at the canonical model layer. Each adapter authenticates independently with its target system.
| System | Recommended Auth | Token Lifetime | Notes |
|---|---|---|---|
| SAP S/4HANA | OAuth 2.0 (SAP BTP) or X.509 | Session-based | Principal propagation for user-context |
| NetSuite | Token-Based Authentication (TBA) | Persistent tokens | OAuth 2.0 available but TBA simpler for S2S |
| Dynamics 365 | OAuth 2.0 Client Credentials | 1h access token | Azure AD app registration required |
| Salesforce | OAuth 2.0 JWT Bearer | 2h session | Connected app with digital certificate |
| Oracle ERP Cloud | OAuth 2.0 or Basic Auth | Token-based | Basic Auth still common for FBDI/ESS jobs |
START: User needs to share data across multiple ERP systems
├── How many systems exchange data?
│   ├── 2 systems only
│   │   └── Direct field mapping; CDM overhead not justified
│   │       (But plan for CDM if 3rd system expected within 12 months)
│   ├── 3-5 systems
│   │   └── Canonical Data Model recommended
│   │       ├── Start with core entities (Customer, Order, Product)
│   │       ├── 6-12 attributes per entity initially
│   │       └── Add entities/attributes incrementally per use case
│   └── 6+ systems
│       └── Canonical Data Model essential
│           ├── Invest in schema registry (Confluent, Apicurio, AWS Glue)
│           ├── Formal governance with domain stewards
│           └── CI/CD compatibility checks on every schema change
├── What approach to canonical ownership?
│   ├── Centralized (one team owns all) → Anti-pattern for >3 systems
│   ├── Federated (domain adapters, governance board) → Recommended
│   └── Hybrid (central canonical, domain adapters) → Good starting point
├── What schema format?
│   ├── JSON Schema → REST APIs, event payloads, iPaaS
│   ├── Avro → Kafka event streaming, schema registry native
│   ├── Protobuf → gRPC, high-performance serialization
│   └── XML Schema (XSD) → SOAP, EDI, legacy middleware
└── What evolution strategy?
    ├── Additive only (minor versions) → default for all changes
    ├── Breaking change (major version) → deprecation window + migration plan
    └── Transform-on-read → store canonical, transform at consumption
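The first branch of the tree reduces to a threshold check on the number of systems. The sketch below encodes only that branch; the function name and return strings are illustrative assumptions.

```python
def recommend_integration_approach(system_count: int) -> str:
    """Map the number of integrated systems to the approach in the decision tree."""
    if system_count < 2:
        raise ValueError("integration needs at least two systems")
    if system_count == 2:
        return "direct field mapping"
    if system_count <= 5:
        return "canonical data model recommended"
    return "canonical data model essential"

print(recommend_integration_approach(4))
```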
| Canonical Entity | SAP S/4HANA | NetSuite | Dynamics 365 | Salesforce | Oracle ERP Cloud |
|---|---|---|---|---|---|
| Customer | BusinessPartner (BP) | customer | Account (Dataverse) | Account (sObject) | hz_parties / hz_cust_accounts |
| Product | A_Product | item | Product (Released) | Product2 | egp_system_items |
| Order | A_SalesOrder | salesOrder | SalesOrderHeader | Opportunity / Order | doo_order_headers |
| Invoice | BillingDocument | invoice | CustInvoiceJour | Invoice (custom) | ra_customer_trx |
| Payment | FI Document (BKPF/BSEG) | customerPayment | CustPaymJournalTrans | Payment (custom) | ap_checks / ar_cash_receipts |
| Employee | A_Employee | employee | HcmWorker | Contact / User | per_all_people_f |
| Vendor/Supplier | A_Supplier | vendor | VendTable | Account (Supplier type) | poz_suppliers |
| Address | A_BPAddress | address (subrecord) | LogisticsPostalAddress | Address (compound) | hz_locations |
Map every system in your integration landscape and identify which business entities flow between them. Start with the highest-volume, highest-value entities. [src3]
Priority Matrix:
| Entity | Systems Sharing It | Volume/Day | Priority |
|-----------|-------------------|------------|----------|
| Customer | 5 (all ERPs + CRM)| 500 creates| P0 |
| Order | 4 (CRM + 3 ERPs) | 2,000 | P0 |
| Product | 4 (PIM + 3 ERPs) | 50 updates | P1 |
| Invoice | 3 (ERP + billing) | 5,000 | P1 |
| Payment | 2 (ERP + bank) | 1,000 | P2 |
Verify: Each entity appears in 3+ systems (otherwise use direct mapping for that entity).
For each P0 entity, define the canonical schema with 6-12 core attributes. Use JSON Schema (for REST/event payloads) or Avro (for Kafka streaming). [src3, src5]
# Validate canonical schema using ajv-cli
npm install -g ajv-cli
ajv validate -s canonical/Customer.v1.json -d sample-customer.json
# Expected: sample-customer.json valid
Verify: Schema validates against sample payloads from each source system.
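For orientation, here is one way a small Customer schema could look, written as a Python dict in JSON Schema style. The field names are borrowed from this card's adapter examples, but the required set and enum values are assumptions for illustration. `check_required_and_enums` is a hypothetical stand-in that inspects only required keys and enums; a real pipeline would use a full validator such as ajv.

```python
# Illustrative schema only; adjust required fields and enums to your own model.
CUSTOMER_V1 = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "title": "CanonicalCustomer",
    "type": "object",
    "required": ["canonicalId", "legalName", "customerType",
                 "currency", "status", "schemaVersion"],
    "properties": {
        "canonicalId": {"type": "string"},
        "legalName": {"type": "string"},
        "customerType": {"enum": ["organization", "individual"]},
        "primaryEmail": {"type": ["string", "null"]},
        "currency": {"type": "string", "pattern": "^[A-Z]{3}$"},
        "status": {"enum": ["active", "blocked", "inactive"]},
        "schemaVersion": {"type": "string"},
    },
}

def check_required_and_enums(payload: dict, schema: dict) -> list[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    errors = [f"missing required field: {name}"
              for name in schema["required"] if name not in payload]
    for name, rules in schema["properties"].items():
        if name in payload and "enum" in rules and payload[name] not in rules["enum"]:
            errors.append(f"{name}: {payload[name]!r} not in {rules['enum']}")
    return errors

sample = {
    "canonicalId": "urn:acme:customer:550e8400-e29b-41d4-a716-446655440000",
    "legalName": "Acme Corporation",
    "customerType": "organization",
    "currency": "USD",
    "status": "active",
    "schemaVersion": "1.0.0",
}
print(check_required_and_enums(sample, CUSTOMER_V1))
```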
Each adapter translates between one system's native format and the canonical format. Adapters are owned by domain teams. [src5]
# resolve_canonical_id, map_sf_type, map_sf_status and normalize_e164 are
# helper functions assumed to be defined elsewhere in the adapter module.
def salesforce_to_canonical(sf_account: dict) -> dict:
    """Transform Salesforce Account to Canonical Customer."""
    return {
        "canonicalId": resolve_canonical_id("sfdc", sf_account["Id"]),
        "legalName": sf_account["Name"],
        "customerType": map_sf_type(sf_account.get("Type", "Prospect")),
        "primaryEmail": sf_account.get("PersonEmail"),
        "primaryPhone": normalize_e164(sf_account.get("Phone")),
        "currency": sf_account.get("CurrencyIsoCode", "USD"),
        "status": map_sf_status(sf_account.get("Account_Status__c", "Active")),
        "systemReferences": [{
            "system": "sfdc",
            "externalId": sf_account["Id"],
            "entityType": "Account",
        }],
        "schemaVersion": "1.0.0",
    }
Verify: Run adapter against 100 real records, validate output against canonical schema.
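The adapter above calls `normalize_e164` and `map_sf_status` without showing them. The sketch below is one plausible shape for both, under stated assumptions: numbers without a country prefix are treated as belonging to `default_country_code` (an illustrative parameter), and the status values in `SF_STATUS_MAP` are hypothetical because `Account_Status__c` is a custom field. A production adapter would more likely rely on a dedicated phone-number library.

```python
import re
from typing import Optional

def normalize_e164(raw: Optional[str], default_country_code: str = "1") -> Optional[str]:
    """Best-effort E.164 formatting; returns None when no valid number results."""
    if not raw:
        return None
    stripped = raw.strip()
    digits = re.sub(r"\D", "", stripped)
    if not digits:
        return None
    if stripped.startswith("+"):
        candidate = "+" + digits
    elif stripped.startswith("00"):
        candidate = "+" + digits[2:]  # international dialing prefix
    else:
        # Assumption: a bare national number belongs to the default country
        candidate = "+" + default_country_code + digits.lstrip("0")
    # E.164 allows at most 15 digits after the plus sign
    return candidate if re.fullmatch(r"\+[1-9]\d{1,14}", candidate) else None

# Hypothetical picklist values for the custom Account_Status__c field
SF_STATUS_MAP = {"Active": "active", "Inactive": "inactive", "Blocked": "blocked"}

def map_sf_status(value: str) -> str:
    """Translate a Salesforce status into the canonical enum, failing loudly."""
    if value not in SF_STATUS_MAP:
        raise ValueError(f"Unmapped Salesforce status: {value!r}")
    return SF_STATUS_MAP[value]

print(normalize_e164("(415) 555-0132"), map_sf_status("Active"))
```

Raising on unmapped values, rather than defaulting, keeps silent data drift out of the canonical stream.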
Use a schema registry to store canonical schemas with version history and compatibility enforcement. [src7]
# Register schema in Confluent Schema Registry
curl -X POST http://schema-registry:8081/subjects/canonical-customer-value/versions \
-H "Content-Type: application/vnd.schemaregistry.v1+json" \
-d '{"schemaType":"JSON","schema":"..."}'
# Expected: {"id": 1}
# Set BACKWARD compatibility (new schemas can read old data)
curl -X PUT http://schema-registry:8081/config/canonical-customer-value \
-H "Content-Type: application/vnd.schemaregistry.v1+json" \
-d '{"compatibility":"BACKWARD"}'
Verify: Attempt to register an incompatible schema change — registry should reject it.
Wire adapters into your integration platform (iPaaS, Kafka, or custom middleware). [src1, src5]
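from jsonschema import validate  # pip install jsonschema

# Registry of source-system adapters, keyed by system code.
ADAPTERS = {
    "sfdc": salesforce_to_canonical,
    "sap-s4": sap_to_canonical,
    "netsuite": netsuite_to_canonical,
    "d365": dynamics_to_canonical,
}

def process_event(source_system: str, payload: dict) -> dict:
    adapter = ADAPTERS.get(source_system)
    if adapter is None:
        # Report ADAPTER_NOT_FOUND instead of failing with a TypeError on None
        raise LookupError(f"ADAPTER_NOT_FOUND: {source_system}")
    canonical = adapter(payload)
    # CUSTOMER_SCHEMA is the canonical Customer JSON Schema, loaded elsewhere
    validate(instance=canonical, schema=CUSTOMER_SCHEMA)  # fail fast
    return canonical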
Verify: Process a test event from each source system, confirm output validates.
When new fields are needed, follow the additive evolution process. [src7]
# Check compatibility before registering new version
curl -X POST http://schema-registry:8081/compatibility/subjects/canonical-customer-value/versions/latest \
-H "Content-Type: application/vnd.schemaregistry.v1+json" \
-d '{"schemaType":"JSON","schema":"..."}'
# Expected: {"is_compatible": true}
Verify: Old adapter output (without new field) still validates against new schema version.
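Before calling the registry, a local pre-check can catch the most common non-additive changes. The sketch below is a deliberately simplified stand-in and not the registry's own compatibility algorithm: it flags only removed properties, newly required fields, and changed `type` values. The function name and sample schemas are illustrative.

```python
def additive_change_violations(old: dict, new: dict) -> list[str]:
    """List ways in which `new` is not a purely additive change over `old`."""
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    violations = []
    for name in sorted(set(old_props) - set(new_props)):
        violations.append(f"removed property: {name}")
    for name in sorted(set(new.get("required", [])) - set(old.get("required", []))):
        violations.append(f"new required field: {name}")
    for name in sorted(set(old_props) & set(new_props)):
        if old_props[name].get("type") != new_props[name].get("type"):
            violations.append(f"type changed: {name}")
    return violations

v1_0 = {"required": ["legalName"],
        "properties": {"legalName": {"type": "string"}, "status": {"type": "string"}}}
# Adds an optional field: additive
v1_1 = {"required": ["legalName"],
        "properties": {**v1_0["properties"], "industryCode": {"type": "string"}}}
# Drops a field and adds a required one: breaking
v2_0 = {"required": ["legalName", "taxId"],
        "properties": {"legalName": {"type": "string"}, "taxId": {"type": "string"}}}

print(additive_change_violations(v1_0, v1_1))
print(additive_change_violations(v1_0, v2_0))
```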
# Input: Source system payload (any connected ERP)
# Output: Validated canonical Customer or validated system-specific payload
import json

from jsonschema import ValidationError, validate  # pip install jsonschema


class CanonicalAdapter:
    def __init__(self, schema_path: str, system_name: str):
        with open(schema_path) as f:
            self.schema = json.load(f)
        self.system = system_name

    def validate_canonical(self, canonical: dict) -> bool:
        try:
            validate(instance=canonical, schema=self.schema)
            return True
        except ValidationError as e:
            raise ValueError(f"Validation failed for {self.system}: {e.message}") from e

    def _resolve_id(self, external_id: str) -> str:
        # Assumption: the cross-reference lookup is deployment-specific,
        # so each deployment must supply its own implementation.
        raise NotImplementedError


class SAPCustomerAdapter(CanonicalAdapter):
    SAP_STATUS_MAP = {"1": "active", "2": "blocked", "3": "inactive"}

    def to_canonical(self, bp: dict) -> dict:
        canonical = {
            "canonicalId": self._resolve_id(bp["BusinessPartner"]),
            "legalName": bp["BusinessPartnerFullName"],
            "customerType": "organization" if bp.get("BusinessPartnerCategory") == "2" else "individual",
            "status": self.SAP_STATUS_MAP.get(bp.get("AuthorizationGroup", "1"), "active"),
            "schemaVersion": "1.0.0",
        }
        self.validate_canonical(canonical)
        return canonical
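// Input: Canonical payload in schema v1.0.0
// Output: Canonical payload migrated to schema v1.1.0
// Assumes `ajv` (an Ajv instance) and `schemas` (version -> JSON Schema) are defined elsewhere.
const migrations = {
  "1.0.0->1.1.0": (payload) => ({
    ...payload,
    industryCode: null, // New optional field
    schemaVersion: "1.1.0",
  }),
};

function migratePayload(payload, targetVersion = "1.1.0") {
  // Already current: nothing to migrate
  if (payload.schemaVersion === targetVersion) return payload;
  const key = `${payload.schemaVersion}->${targetVersion}`;
  const migrate = migrations[key];
  if (!migrate) throw new Error(`No migration registered for ${key}`);
  const migrated = migrate(payload);
  const validate = ajv.compile(schemas[targetVersion]);
  if (!validate(migrated)) throw new Error(JSON.stringify(validate.errors));
  return migrated;
}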
| Canonical Field | SAP S/4HANA | NetSuite | Dynamics 365 | Salesforce | Gotcha |
|---|---|---|---|---|---|
| canonicalId | Z_CANONICAL_ID (custom) | custentity_canonical_id | cr_canonicalid (custom) | Canonical_ID__c | Must be indexed in every system for reverse lookup |
| legalName | BusinessPartnerFullName | companyName | Name (Account) | Name | SAP max 81 chars, NetSuite max 83, SF max 255 |
| customerType | BusinessPartnerCategory | isPerson (boolean) | RelationshipType | RecordTypeId | SAP uses codes, NetSuite boolean, SF record types |
| primaryEmail | EmailAddress (BP contact) | email | EMailAddress | PersonEmail | SAP email on contact, not BP directly |
| currency | T001-WAERS (company code) | currency.refName | TransactionCurrencyId | CurrencyIsoCode | SAP company code default vs per-record |
| status | AuthorizationGroup | entityStatus.refName | StateCode (0/1) | Account_Status__c | Each system has different lifecycle states |
| taxId | TaxNumber1 | defaultTaxReg | VATNum | Tax_ID__c (custom) | Format varies by jurisdiction |
| addresses | A_BPAddress (1:N, time-dependent) | addressbook (sublist) | LogisticsPostalAddress | BillingAddress, ShippingAddress | SAP has complex address time-dependency |
| Error | System | Meaning | Resolution |
|---|---|---|---|
| VALIDATION_FAILED | Canonical | Payload doesn't conform to schema | Check adapter transformation — usually missing required field or wrong type |
| SCHEMA_VERSION_MISMATCH | Canonical | schemaVersion doesn't match expected | Run migration utility to upgrade payload |
| DUPLICATE_CANONICAL_ID | Canonical | Same canonicalId with different systemReferences | Route to MDM for deduplication |
| ADAPTER_NOT_FOUND | Pipeline | No adapter for source system | Register adapter before connecting new system |
| INCOMPATIBLE_SCHEMA | Schema Registry | Change breaks backward compatibility | Revert; create new major version if needed |
| TRANSFORM_TIMEOUT | Adapter | Adapter took >30s to transform | Check source API response time; batch smaller |
| DEAD_LETTER_QUEUE_FULL | Pipeline | DLQ capacity exceeded | Alert ops; investigate high failure rate root cause |
- CI/CD pipeline that validates schema compatibility before deploying any adapter. [src7]
- BACKWARD compatibility mode ensures consumers on the new schema version can still read data written under the previous version. [src7]
- Provisional ID assignment, with a reconciliation job that runs within 15 minutes. [src5]
- Document a field cardinality matrix per system. [src3]
- UTC-only policy in the canonical model; timezone validation in adapter tests. [src2]

# BAD: Using Salesforce Account fields as canonical model
canonical_customer = {
"Id": sf_account["Id"], # SF-specific 18-char ID
"Name": sf_account["Name"], # SF field name
"BillingStreet": sf_account["BillingStreet"], # SF compound address
"Type": sf_account["Type"], # SF picklist values
"OwnerId": sf_account["OwnerId"], # SF-specific concept
}
# Every other system must translate TO Salesforce's model.
# When SF changes a field, ALL adapters break.
# GOOD — Business-driven model independent of any system
canonical_customer = {
"canonicalId": "urn:acme:customer:550e8400-e29b-41d4-a716-446655440000",
"legalName": "Acme Corporation",
"customerType": "organization",
"status": "active",
"systemReferences": [
{"system": "sfdc", "externalId": "001xx000003DGbzAAG"},
{"system": "sap-s4", "externalId": "0001000042"},
{"system": "netsuite", "externalId": "12345"},
],
"schemaVersion": "1.0.0"
}
# Each system maps to/from a neutral model.
# Changing one system only affects that system's adapter.
# BAD — 200+ fields in canonical Customer from day one
canonical_customer_schema = {
"properties": {
"customerId": {}, "legalName": {}, "tradingName": {},
"dbaName": {}, "parentCompany": {}, "ultimateParent": {},
"dunsNumber": {}, "sic_code": {}, "naics_code": {},
# ... 180 more fields ...
}
}
# 6 months of design meetings, no integration delivered.
# 80% of fields are empty in most systems.
# GOOD — 12 core fields, extend per use case
canonical_customer_v1 = {
"properties": {
"canonicalId": {}, # Immutable identifier
"legalName": {}, # Required
"customerType": {}, # Required
"primaryEmail": {}, # Optional
"currency": {}, # Required
"status": {}, # Required
"addresses": {}, # Array
"systemReferences": {}, # Cross-reference IDs
"createdAt": {}, # Audit trail
"schemaVersion": {}, # Evolution tracking
}
}
# v1.1 adds: industryCode, taxId (optional, backward compatible)
# v2.0 (breaking): changes addressType enum — requires migration
# BAD — Integration CoE (5 people) owns everything
# Canonical schema + ALL adapters + pipeline + monitoring
# Result: 5 people become bottleneck. Average lead time: 6 weeks.
# GOOD — Domain teams own adapters, governance board owns canonical
# CRM Team: Salesforce adapter
# Finance Team: SAP + NetSuite adapters
# Platform Team: Schema registry + CI/CD + monitoring
# Governance Board: Weekly 30-min schema review
# Result: Average lead time for new field: 1-2 weeks.
- Start with 3 entities, 6-12 fields each. Ship an adapter in 2-4 weeks. Add fields as use cases demand. [src3]
- Include schemaVersion as a required field from the start. Register schemas with BACKWARD compatibility enforced. [src7]
- Test against anonymized production data extracts. Run nightly canonical validation against live data. [src5]
- Implement a dedicated cross-reference table (canonical_id, system, external_id) before any data flows. [src2]
- 30-day notice, migration guide, 90-day dual-version support, consumer migration verification. [src7]
- CDM is a schema contract, not a data store. For dedup/golden records, use MDM. CDM feeds INTO MDM. [src2]

# Validate canonical payload against schema
ajv validate -s schemas/canonical/Customer.v1.json -d payload.json --all-errors
# Expected: payload.json valid
# List registered schema versions
curl -s http://schema-registry:8081/subjects/canonical-customer-value/versions | jq .
# Expected: [1, 2, 3]
# Check compatibility of new schema version
curl -s -X POST http://schema-registry:8081/compatibility/subjects/canonical-customer-value/versions/latest \
-H "Content-Type: application/vnd.schemaregistry.v1+json" \
-d '{"schemaType":"JSON","schema":"..."}' | jq .
# Expected: {"is_compatible": true}
-- Query cross-references for a canonical ID
SELECT system, external_id, last_synced_at
FROM canonical_cross_references
WHERE canonical_id = 'urn:acme:customer:550e8400-...';
-- Count validation failures in last 24h
SELECT source_system, COUNT(*) as failures
FROM canonical_validation_log
WHERE created_at > NOW() - INTERVAL '24 hours'
GROUP BY source_system ORDER BY failures DESC;
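The cross-reference query above presupposes a table keyed by system and external ID. A minimal sketch of that table follows, using an in-memory SQLite database purely so the example runs anywhere. The column names follow the query above; the primary key, the index, and the `link` and `resolve_canonical_id` helpers are assumptions.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE canonical_cross_references (
        canonical_id   TEXT NOT NULL,
        system         TEXT NOT NULL,
        external_id    TEXT NOT NULL,
        last_synced_at TEXT,
        PRIMARY KEY (system, external_id)  -- one canonical ID per source record
    )
    """
)
# Reverse lookup: all system records that belong to one canonical entity
conn.execute("CREATE INDEX idx_xref_canonical ON canonical_cross_references (canonical_id)")

def link(canonical_id: str, system: str, external_id: str) -> None:
    """Record that a source-system record maps to a canonical ID."""
    conn.execute(
        "INSERT INTO canonical_cross_references VALUES (?, ?, ?, datetime('now'))",
        (canonical_id, system, external_id),
    )

def resolve_canonical_id(system: str, external_id: str) -> Optional[str]:
    """Return the canonical ID for a source record, or None if it is unknown."""
    row = conn.execute(
        "SELECT canonical_id FROM canonical_cross_references "
        "WHERE system = ? AND external_id = ?",
        (system, external_id),
    ).fetchone()
    return row[0] if row else None

CUSTOMER = "urn:acme:customer:550e8400-e29b-41d4-a716-446655440000"
link(CUSTOMER, "sfdc", "001xx000003DGbzAAG")
link(CUSTOMER, "sap-s4", "0001000042")
print(resolve_canonical_id("sap-s4", "0001000042"))
```

The composite primary key makes a second attempt to link the same source record fail, which is one way to surface duplicate-ID conflicts early.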
| Era | Approach | Schema Format | Governance | Status |
|---|---|---|---|---|
| 2003-2015 | Enterprise-wide XML canonical | XML Schema (XSD) | Central governance committee | Legacy |
| 2015-2020 | API-first canonical (REST/JSON) | JSON Schema, OpenAPI | API management platform | Mature |
| 2020-2024 | Event-driven canonical (streaming) | Avro, Protobuf, JSON Schema | Schema registry | Current |
| 2024-2026 | Domain-driven canonical (bounded contexts) | JSON Schema + AsyncAPI | Federated governance + CI/CD | Emerging |
Schema versions follow semantic versioning. Minor versions are always backward compatible. Major versions receive a minimum 90-day deprecation window with dual-version support. [src7]
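As a sketch of how that policy could be checked in tooling, the helpers below classify a version bump and compute the earliest date the previous major version may be retired. The function names are illustrative; the 90-day constant comes from the policy above.

```python
from datetime import date, timedelta

DUAL_SUPPORT_DAYS = 90  # minimum deprecation window for major versions

def parse_version(version: str) -> tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch

def is_breaking(current: str, proposed: str) -> bool:
    """True when the proposed version bumps the major component."""
    cur, new = parse_version(current), parse_version(proposed)
    if new <= cur:
        raise ValueError(f"{proposed} does not advance {current}")
    return new[0] > cur[0]

def earliest_retirement(release_day: date) -> date:
    """Earliest day the previous major version may be switched off."""
    return release_day + timedelta(days=DUAL_SUPPORT_DAYS)

print(is_breaking("1.1.0", "2.0.0"), earliest_retirement(date(2025, 1, 1)))
```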
| Use When | Don't Use When | Use Instead |
|---|---|---|
| Integrating 3+ systems sharing Customer, Order, or Product data | Only 2 systems with stable schemas | Direct field mapping with transformation layer |
| Systems frequently replaced/upgraded (M&A, cloud migration) | All systems from same vendor (e.g., all-SAP) | Vendor-native integration (SAP CPI, Oracle SOA) |
| Multiple teams build integrations independently | Single integration team handles all flows | Point-to-point with shared mapping documentation |
| Event-driven architecture with Kafka/message bus | Simple batch ETL running once nightly | ETL/ELT with source-to-target mapping |
| Long-term strategic integration (3+ year horizon) | Short-term project or POC (< 6 months) | Direct mapping — refactor to CDM later |
| Capability | SAP S/4HANA | NetSuite | Dynamics 365 | Salesforce | Oracle ERP Cloud |
|---|---|---|---|---|---|
| Customer entity name | BusinessPartner (BP) | customer | Account | Account | HZ_PARTIES |
| Customer ID format | 10-digit zero-padded | Integer | GUID | 18-char alphanumeric | Number |
| Address model | Complex (time-dependent) | Sublist (addressbook) | Separate entity | Compound fields | HZ_LOCATIONS (shared) |
| Order header | A_SalesOrder | salesOrder | SalesOrderHeader | Opportunity or Order | DOO_ORDER_HEADERS |
| Line item model | A_SalesOrderItem | salesOrderItem.item | SalesOrderLine | OpportunityLineItem | DOO_ORDER_LINES |
| Product/Item | A_Product + A_ProductPlant | item | EcoResProduct | Product2 | EGP_SYSTEM_ITEMS |
| Amount precision | Up to 3 decimals | 2 decimals | Up to 10 decimals | 2 decimals | Currency-dependent |
| DateTime format | YYYYMMDD (legacy) / ISO 8601 | ISO 8601 | ISO 8601 | ISO 8601 | ISO 8601 |
| Null handling | Initial values ('' / 0) | null | null | null | null |
| Enum representation | Domain values (coded) | Picklist refs (internalId) | OptionSet (integer) | Picklist (string name) | Lookup codes (varchar) |
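Several rows in this comparison translate directly into adapter-side normalization. The sketch below covers three of them for SAP: legacy `YYYYMMDD` dates, blank initial values, and zero-padded customer IDs, plus rounding an amount to a target precision. All function names are illustrative, and the half-up rounding mode is an assumption rather than a documented rule. Numeric initial values (0) are left alone here because a zero can be a legitimate amount.

```python
from datetime import datetime
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional

def sap_date_to_iso(value: Optional[str]) -> Optional[str]:
    """Convert a legacy SAP YYYYMMDD date to ISO 8601; blank or all-zero dates become None."""
    if not value or value == "00000000":
        return None
    return datetime.strptime(value, "%Y%m%d").date().isoformat()

def sap_blank_to_none(value: Optional[str]) -> Optional[str]:
    """Map SAP's empty-string initial value to None, the convention of the other systems."""
    return value if value and value.strip() else None

def sap_customer_id(number: int) -> str:
    """Format a customer number as SAP's 10-digit zero-padded ID."""
    return str(number).zfill(10)

def quantize_amount(amount: str, decimals: int) -> Decimal:
    """Round an amount to the number of decimals the target system stores."""
    return Decimal(amount).quantize(Decimal(10) ** -decimals, rounding=ROUND_HALF_UP)

print(sap_date_to_iso("20240315"), sap_customer_id(1000042), quantize_amount("10.005", 2))
```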