How do you implement token bucket, sliding window, and backoff strategies for ERP APIs?
TL;DR
Bottom line: Use token bucket for bursty traffic (Salesforce, Oracle ERP Cloud), sliding window for sustained throughput (batch ETL), and exponential backoff with jitter as the universal retry mechanism when any ERP returns HTTP 429 or equivalent.
Key limit: Every ERP enforces different limits — Salesforce uses 24h rolling quotas (100K calls), NetSuite uses concurrent request caps (15 default), D365 F&O uses resource-based throttling with 5-minute sliding windows, and SAP uses fair-use throttling.
Watch out for: Fixed sleep intervals between retries — without jitter, parallel workers retry simultaneously and create thundering herd effects that amplify the original rate limit problem.
Best for: Any integration that must sustain high throughput against ERP APIs without triggering throttling or losing data.
Authentication: Rate limits are per-org (Salesforce), per-account (NetSuite), or per-user-per-server (D365) — your auth identity determines which quota pool you consume from.
System Profile
This card covers client-side rate limit management strategies applicable to all major ERP APIs. It addresses three complementary approaches: token bucket (proactive pacing), sliding window (quota tracking), and exponential backoff with jitter (reactive retry). These are client-side patterns your integration code implements — they do not override or bypass server-side ERP limits.
Client-side only: Token bucket and sliding window are client-side patterns — they pace your outbound requests but cannot increase server-side limits. If the ERP rejects a request, you must still handle the 429.
Shared state for distributed workers: Token bucket and sliding window require a shared counter store (Redis, database) when running multiple integration workers. Without shared state, each worker maintains independent counters and collectively exceeds the limit.
NetSuite concurrency is account-wide: NetSuite's 15-concurrent-request limit is shared across all integrations, all users, and all API surfaces (REST, SOAP, RESTlet). One greedy integration can block all others.
D365 limits are per-web-server: Dynamics 365 F&O tracks limits per user, per app ID, per web server. Your actual capacity is N web servers x 6,000 requests/5min, but you cannot control which web server handles your request.
Salesforce soft limits: Salesforce's daily API limit is a soft limit — temporary bursts above 100K may succeed but sustained overuse triggers hard blocking. Do not plan normal operations above the stated limit.
Retry-After is not universal: Only D365 F&O and SAP (via API Management) reliably return Retry-After headers. For Salesforce and NetSuite, you must implement your own backoff timing.
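As a rough illustration of the proactive pacing described above, here is a minimal single-worker token bucket. The class name `TokenBucket`, its method names, and the injectable `clock` parameter are assumptions made for this sketch, not part of any ERP SDK. It keeps state in memory only, so a multi-worker deployment would still need the counter moved into a shared store.

```python
import time
from typing import Callable


class TokenBucket:
    """In-memory token bucket for a single worker (no shared state)."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self._clock = clock       # injectable so the bucket can be tested without sleeping
        self._tokens = capacity   # start full so an initial burst is allowed
        self._last = clock()

    def _refill(self) -> None:
        """Add tokens for the time elapsed since the last call, up to capacity."""
        now = self._clock()
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now

    def try_acquire(self, tokens: float = 1.0) -> bool:
        """Take tokens if available; return False instead of blocking."""
        self._refill()
        if self._tokens >= tokens:
            self._tokens -= tokens
            return True
        return False

    def wait_time(self, tokens: float = 1.0) -> float:
        """Seconds until `tokens` would be available (0.0 if available now)."""
        self._refill()
        return max(0.0, (tokens - self._tokens) / self.rate)
```

A caller would typically check `try_acquire()` before each request and sleep for `wait_time()` when it returns False. The server can still reject a paced request, so 429 handling remains necessary.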
Integration Pattern Decision Tree
START — Integration hitting rate limits or needs proactive pacing
|
+-- What's your traffic pattern?
| |
| +-- Bursty (spikes followed by idle)
| | +-- Single worker?
| | | +-- YES -> Token Bucket (simple, handles bursts naturally)
| | | +-- NO -> Token Bucket + Redis (shared token store)
| | +-- ERP has daily rolling quota? (Salesforce)
| | +-- YES -> Token Bucket + Sliding Window for quota tracking
| | +-- NO -> Token Bucket alone is sufficient
| |
| +-- Sustained (steady high throughput)
| | +-- ERP has concurrency limit? (NetSuite)
| | | +-- YES -> Sliding Window with concurrency semaphore
| | | +-- NO -> Sliding Window for quota pacing
| | +-- Approaching daily quota?
| | +-- YES -> Calculate requests/second budget from remaining quota
| | +-- NO -> Process at full speed, monitor consumption
| |
| +-- Mixed (batch windows + real-time trickle)
| +-- Prioritize real-time requests
| +-- Batch jobs use token bucket with lower refill rate
| +-- Reserve 20% of quota for real-time operations
|
+-- How do you handle 429 responses?
| |
| +-- Retry-After header present? (D365, SAP APIM)
| | +-- YES -> Wait exact Retry-After seconds + small jitter (0-500ms)
| | +-- NO -> Exponential backoff with full jitter
| |
| +-- Is the request idempotent?
| | +-- YES -> Safe to retry with backoff
| | +-- NO -> Queue to dead letter, do NOT auto-retry
| |
| +-- Max retries exceeded?
| +-- YES -> Route to dead letter queue
| +-- NO -> Retry with backoff
|
+-- Multi-tenant integration?
+-- YES -> Per-tenant token buckets with tenant-level quotas
+-- NO -> Single bucket per ERP connection
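The 429-handling branch of the tree can be sketched as two small functions. The names `retry_delay` and `next_action`, the 1-second base, and the 60-second cap are illustrative assumptions. The 0-500 ms jitter on top of Retry-After and the dead-letter routing follow the tree above.

```python
import random
from typing import Optional


def retry_delay(attempt: int, retry_after: Optional[float] = None,
                base: float = 1.0, cap: float = 60.0,
                rng: Optional[random.Random] = None) -> float:
    """Return how many seconds to wait before the next retry."""
    rng = rng if rng is not None else random.Random()
    if retry_after is not None:
        # Server told us when to come back: honor it, plus 0-500 ms of jitter.
        return retry_after + rng.uniform(0.0, 0.5)
    # No Retry-After: exponential backoff with full jitter, capped at `cap`.
    return rng.uniform(0.0, min(cap, base * 2 ** attempt))


def next_action(attempt: int, max_retries: int, idempotent: bool) -> str:
    """Decide between retrying and routing to the dead letter queue."""
    if not idempotent:
        return "dead_letter"   # never auto-retry a non-idempotent request
    if attempt >= max_retries:
        return "dead_letter"   # retry budget exhausted
    return "retry"
```

Passing a seeded `random.Random` as `rng` makes the delays reproducible in tests. In production the default unseeded generator is what spreads workers apart.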
Quick Reference
Scenario | Recommended Strategy | Configuration | Why
Salesforce REST API (Enterprise) | Token Bucket + quota monitor | Rate: 1.15 req/sec, burst: 50 | Rolling 24h window, soft-limit tolerant of bursts
Salesforce Bulk API | Sliding Window | Track batches/24h, max 15K | Batch jobs are long-running, need quota pacing
NetSuite SuiteTalk/REST | Concurrency semaphore + backoff | Max 15 concurrent, exponential backoff on 429 | Concurrency-based, not quota-based
D365 F&O OData | Backoff with Retry-After | Honor Retry-After header, max 6K/5min | Resource-based, Retry-After header is reliable
SAP S/4HANA OData | Token Bucket + Spike Arrest | Match API Management policy rate | Fair-use, configurable per API product
Oracle ERP Cloud REST | Token Bucket | Match gateway-configured limits | Per-tenant configurable limits
Multi-ERP integration | Per-ERP token buckets | Separate bucket per target ERP | Each ERP has different limit models
iPaaS (MuleSoft) | Gateway Rate Limiting policy | Configure in API Manager, >1 min windows | Built-in, cluster-aware, 429 auto-response
iPaaS (Boomi) | Flow Control shape | Limit parallel threads to ERP concurrency | Thread count maps to ERP concurrency cap
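For the sliding-window entries above (for example, tracking batches per 24h or requests per 5 minutes), one possible client-side tracker is a timestamp log. The class name `SlidingWindowLimiter` and the `clock` parameter are assumptions for this sketch. It stores one timestamp per request, which is fine for limits in the tens of thousands but would need a counter-based approximation at much larger scales.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Sliding-window log: allows at most `limit` events per `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window_seconds
        self._clock = clock
        self._events: Deque[float] = deque()  # timestamps of accepted events

    def _evict(self, now: float) -> None:
        """Drop timestamps that have aged out of the window."""
        while self._events and self._events[0] <= now - self.window:
            self._events.popleft()

    def remaining(self) -> int:
        """How many more events fit in the current window."""
        self._evict(self._clock())
        return self.limit - len(self._events)

    def try_acquire(self) -> bool:
        """Record an event if the window has room; otherwise return False."""
        now = self._clock()
        self._evict(now)
        if len(self._events) < self.limit:
            self._events.append(now)
            return True
        return False
```

Under these assumptions, `SlidingWindowLimiter(limit=6000, window_seconds=300)` would mirror the D365 F&O figure quoted above, and `remaining()` gives an early-warning number to log or alert on.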
Step-by-Step Integration Guide
1. Identify Your ERP's Rate Limit Model
Before writing any code, determine how your target ERP enforces limits. Check the rate limit reference table above and the ERP's official documentation. [src2, src3, src4]
Salesforce's Sforce-Limit-Info header format is api-usage=<used>/<total> — parse both numbers. The header is only present on successful responses, not on 429 responses. [src2]
D365 F&O Retry-After is in seconds (integer), not milliseconds. Verify your HTTP client handles it correctly. [src4]
NetSuite SOAP faults use SSS_REQUEST_LIMIT_EXCEEDED as a status code string, not a numeric HTTP code. Your error handler must check both HTTP status and SOAP fault codes. [src3]
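The header formats above can be parsed with a few lines of code. This is a sketch: the function names are hypothetical, and the 5-second default for a missing or non-integer Retry-After is an assumption, not an ERP-documented value.

```python
import re
from typing import Optional, Tuple

# Matches the api-usage=<used>/<total> portion of Sforce-Limit-Info.
_API_USAGE = re.compile(r"api-usage=(\d+)/(\d+)")


def parse_sforce_limit_info(header: Optional[str]) -> Optional[Tuple[int, int]]:
    """Return (used, total), or None when the header is absent or malformed."""
    if not header:
        return None  # e.g. a 429 response, where the header is not sent
    match = _API_USAGE.search(header)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def parse_retry_after_seconds(header: Optional[str], default: float = 5.0) -> float:
    """Parse an integer-seconds Retry-After value; fall back to `default`."""
    if header is None:
        return default
    try:
        return float(int(header.strip()))
    except ValueError:
        return default  # not an integer (for example an HTTP-date)
```

For example, `parse_sforce_limit_info("api-usage=18/5000")` yields `(18, 5000)`, from which the remaining quota is `total - used`.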
Error Handling & Failure Points
Common Error Codes
Code | ERP | Meaning | Retryable? | Resolution
HTTP 429 | All | Rate limit exceeded | Yes | Exponential backoff with jitter
HTTP 403 + REQUEST_LIMIT_EXCEEDED | Salesforce | Daily quota exhausted | No (wait 24h) | Reduce volume or purchase more API calls
SSS_REQUEST_LIMIT_EXCEEDED | NetSuite | Concurrency cap reached | Yes (immediately) | Brief pause (0.5-1s) then retry
HTTP 429 + Retry-After | D365 F&O | User or resource limit | Yes | Wait exactly Retry-After seconds
HTTP 503 | SAP | Service overloaded | Yes | Backoff with increasing delays
HTTP 429 | Oracle ERP Cloud | Gateway throttle | Yes | Backoff, check gateway config
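One way to encode the error codes above is a single classifier that the retry loop consults. The `Verdict` type, the `classify` function, and the strategy labels are hypothetical names for this sketch. It also treats HTTP 503 as retryable for any ERP, which is a simplification of the SAP-specific entry.

```python
from typing import NamedTuple, Optional


class Verdict(NamedTuple):
    retryable: bool
    strategy: str  # illustrative label consumed by the caller's retry loop


def classify(http_status: Optional[int], error_code: Optional[str] = None,
             has_retry_after: bool = False) -> Verdict:
    """Map an ERP error response to a retry decision."""
    if error_code == "SSS_REQUEST_LIMIT_EXCEEDED":
        # NetSuite concurrency cap; may arrive as a SOAP fault with no 429 status.
        return Verdict(True, "brief_pause")
    if http_status == 403 and error_code == "REQUEST_LIMIT_EXCEEDED":
        # Salesforce daily quota exhausted: retrying right away cannot succeed.
        return Verdict(False, "wait_for_quota")
    if http_status == 429:
        if has_retry_after:
            return Verdict(True, "honor_retry_after")
        return Verdict(True, "backoff_with_jitter")
    if http_status == 503:
        return Verdict(True, "backoff_with_jitter")
    return Verdict(False, "not_rate_limit")
```

Checking the error-code string before the HTTP status keeps the NetSuite SOAP case from falling through to the generic branch.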
Failure Points in Production
Thundering herd after outage recovery: All queued workers retry simultaneously when ERP comes back online. Fix: Add full jitter to initial retry; stagger worker start times by random 0-30s offset. [src1]
Quota exhaustion mid-batch: Salesforce 24h rolling quota resets gradually, not all at once. Fix: Check remaining quota before batch; delay if remaining < batch_size * 1.2. [src2]
NetSuite concurrency consumed by UI users: 15-slot pool is shared with logged-in users. Fix: Limit integration to 10-12 slots; schedule heavy batch jobs outside business hours. [src3]
D365 per-web-server tracking mismatch: Cannot control which server handles requests. Fix: Target 80% of single-server capacity (4,800 req/5min). [src4]
Redis failure breaks distributed rate limiter: Workers lose shared state. Fix: Fall back to per-worker in-memory bucket at 1/N of total rate; implement Redis health check. [src5]
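Two of the fixes above reduce to small guards that can be sketched directly. All three function names are hypothetical. `shared_acquire` and `local_acquire` stand in for whatever shared and in-memory limiters the integration uses. The exception types caught are an assumption, since real Redis clients raise their own exception classes that would need to be passed in via `errors`.

```python
from typing import Callable, Tuple, Type


def fallback_rate(total_rate: float, worker_count: int) -> float:
    """Per-worker rate to use when the shared store is down (1/N of total)."""
    if worker_count < 1:
        raise ValueError("worker_count must be at least 1")
    return total_rate / worker_count


def should_delay_batch(remaining_quota: int, batch_size: int,
                       headroom: float = 1.2) -> bool:
    """True when remaining quota is below batch_size * headroom."""
    return remaining_quota < batch_size * headroom


def acquire_with_fallback(
    shared_acquire: Callable[[], bool],
    local_acquire: Callable[[], bool],
    errors: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> bool:
    """Try the shared limiter first; use the local one only if it is unreachable."""
    try:
        return shared_acquire()
    except errors:
        return local_acquire()
```

Note that a shared limiter answering False is still authoritative. The fallback applies only when the shared store cannot be reached at all.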
Anti-Patterns
Wrong: Fixed Sleep Between Retries
# BAD -- fixed 5-second sleep, all workers retry together
import time

def call_with_fixed_sleep():
    for attempt in range(5):
        response = call_erp_api()
        if response.status_code == 429:
            time.sleep(5)  # Thundering herd
            continue
        return response
Correct: Exponential Backoff with Full Jitter
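# GOOD -- randomized increasing delays prevent thundering herd
import random
import time

def call_with_backoff(max_attempts=5):
    for attempt in range(max_attempts):
        response = call_erp_api()
        if response.status_code == 429:
            max_delay = min(60, 2 ** attempt)  # cap the backoff ceiling at 60s
            time.sleep(random.uniform(0, max_delay))  # full jitter
            continue
        return response
    raise RuntimeError("ERP still throttling after all retry attempts")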
Wrong: Ignoring Retry-After Header
// BAD -- calculating own backoff when server tells you when to retry
if (response.status === 429) {
  await sleep(2000 * Math.pow(2, attempt)); // Ignores Retry-After
}
Correct: Honoring Retry-After with Small Jitter
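// GOOD -- respect server's Retry-After, add small jitter
if (response.status === 429) {
  // Retry-After may be missing or an HTTP-date; fall back to 5 seconds
  const parsed = parseInt(response.headers['retry-after'], 10);
  const retryAfter = Number.isNaN(parsed) ? 5 : parsed;
  await sleep(retryAfter * 1000 + Math.random() * 500);
}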
Wrong: No Quota Monitoring Until Failure
# BAD -- blindly sending until 403 hard block
for record in all_100k_records:
    salesforce_api.update(record)  # Hits 403 at record 95,001
Correct: Pre-flight Quota Check with Early Warning
# GOOD -- check quota before batch, abort early if insufficient
limits = salesforce_api.get_limits()
remaining = limits['DailyApiRequests']['Remaining']
if remaining < len(records) * 1.1:
    raise QuotaInsufficientError(f"Need {len(records)} calls, {remaining} remaining")
for record in records:
    sf_bucket.acquire()
    salesforce_api.update(record)
Wrong: Unlimited Concurrency Against NetSuite
// BAD -- 50 parallel requests against 15-slot cap
const results = await Promise.all(
  records.map(r => netsuiteApi.upsert(r)) // 35 get 429'd
);
Correct: Semaphore-Limited Concurrency
// GOOD -- limit to 12 concurrent (3 slots for other integrations)
const { Semaphore } = require('async-mutex');
const sem = new Semaphore(12);
const results = await Promise.all(
  records.map(async (r) => {
    const [, release] = await sem.acquire();
    try { return await netsuiteApi.upsert(r); }
    finally { release(); }
  })
);
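For integrations written in Python, the same semaphore pattern could look roughly like this with `asyncio`. The helper name `bounded_gather` and its parameters are assumptions for this sketch. `worker` stands in for whatever coroutine performs the NetSuite call.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def bounded_gather(items: Iterable[T],
                         worker: Callable[[T], Awaitable[R]],
                         limit: int = 12) -> List[R]:
    """Run `worker` over `items` with at most `limit` calls in flight."""
    sem = asyncio.Semaphore(limit)

    async def run(item: T) -> R:
        async with sem:  # released automatically, even if the worker raises
            return await worker(item)

    # gather preserves input order in its result list
    return await asyncio.gather(*(run(item) for item in items))
```

The default of 12 mirrors the example above, leaving 3 of the 15 slots for other consumers.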
Common Pitfalls
Testing in sandbox with different limits: Salesforce sandbox has fewer user licenses, resulting in lower total quota. D365 sandbox may have fewer web servers. Fix: Load-test against sandbox with production-equivalent user license count. [src2]
Using deprecated user-based limits in D365: D365 F&O deprecated user-based limits in v10.0.36 — only resource-based limits remain. Fix: Update to handle resource-based throttling; remove per-user quota assumptions. [src4]
Not accounting for other integrations: Your integration is not the only quota consumer. ISV packages, other integrations, and UI calls all share the pool. Fix: Budget 60-70% of total quota for your integration. [src2]
Token bucket rate miscalculation: 100K / 86,400s = 1.15 req/sec but doesn't account for other consumers. Fix: Set rate to 0.7-0.8x theoretical max; use burst capacity for spikes only. [src5]
Backoff without max cap: Exponential backoff at attempt 10 = 1024 seconds. Fix: Cap at 30s for user-facing, 60s for batch, 5 min absolute max. [src1]
Jitter range too small: 100ms jitter with 50 workers is insufficient. Fix: Use full jitter (random 0 to max_delay) — simplest and most effective. [src1]
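The rate and cap arithmetic from the pitfalls above, written out as a sketch. The function names and the 0.75 default share are illustrative choices within the 0.7-0.8 range suggested above.

```python
def paced_rate(daily_quota: int, share: float = 0.75,
               window_seconds: int = 86_400) -> float:
    """Requests per second when using only `share` of a rolling daily quota."""
    return daily_quota / window_seconds * share


def backoff_ceiling(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Upper bound for the jittered delay at a given attempt, capped at `cap`."""
    return min(cap, base * 2 ** attempt)


# Theoretical maximum for 100K calls per 24h is about 1.157 req/sec;
# taking a 0.75 share leaves room for other consumers of the same quota.
salesforce_rate = paced_rate(100_000)

# Uncapped, attempt 10 would allow a wait of up to 1024 seconds;
# a cap keeps it bounded.
batch_ceiling = backoff_ceiling(10, cap=60.0)
user_facing_ceiling = backoff_ceiling(10, cap=30.0)
```

Full jitter then draws the actual delay uniformly between 0 and the ceiling, so the cap bounds the worst case without narrowing the spread between workers.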
Use When | Don't Use When | Use Instead
Integration sustains >50% of ERP's daily API quota | Integration makes < 100 API calls/day | Simple try/catch with fixed 5s retry
Multiple workers/processes call the same ERP | Single-threaded, sequential API calls | Basic exponential backoff without distributed state
Batch jobs approach daily quota limits | Real-time calls with <1s latency SLA | Circuit breaker with fast-fail
Multi-tenant integration serves multiple ERP orgs | Single-tenant with dedicated ERP instance | Per-tenant quotas without shared rate limiting
Integration runs during business hours alongside UI users | Off-hours batch window with exclusive API access | Maximum throughput without client-side limiting
Cross-System Comparison
Capability | Salesforce | NetSuite | D365 F&O | SAP S/4HANA | Oracle ERP Cloud
Limit Model | Rolling daily quota | Concurrency-based | Resource-based | Fair-use / configurable | Per-tenant configurable
Primary Limit | 100K calls/24h | 15 concurrent | 6K req/5min/user/server | Via API Mgmt policy | Via API Gateway
429 Response | Yes | Yes (REST) | Yes + Retry-After | Yes (via APIM) | Yes
Retry-After | No | No | Yes | Yes (via APIM) | Varies
Quota Visibility | Yes (/limits) | Limited | No direct endpoint | APIM dashboard | OCI monitoring
Best Strategy | Token bucket + quota | Concurrency semaphore | Honor Retry-After | Match APIM policy | Match gateway config
Multi-Worker Concern | All share org quota | All share account slots | Per-user per server | Shared API product | Shared tenant quota
Burst Tolerance | Soft limit (temporary OK) | None (hard cap) | Resource-dependent | Policy-dependent | Gateway-dependent
Important Caveats
Rate limit numbers change with ERP releases and edition upgrades — verify against official documentation before each integration deployment cycle. Numbers verified as of March 2026.
Client-side rate limiting is a best-effort strategy — the ERP server is the ultimate arbiter. Always implement 429 handling even when using proactive pacing.
Sandbox environments may have different rate limits than production. Salesforce sandboxes use the same formula but typically have fewer user licenses.
Token bucket and sliding window require shared state (Redis or similar) for multi-worker deployments. Without shared state, workers operate independently and collectively exceed limits.
The "right" strategy depends on your traffic pattern, not the ERP. Token bucket for bursty, sliding window for sustained, backoff for reactive — most production integrations need all three.