DataLoader coalesces individual .load(id) calls within one execution tick into a single bulk query, turning N+1 queries into 2.
Construct it with a batch function: new DataLoader(keys => batchFetchByIds(keys)) (JS) or DataLoader(load_fn=batch_load) (Python Strawberry).
Never disable batching (batch: false) in production — it defeats the entire purpose of DataLoader.
| # | Cause | Likelihood | Signature | Fix |
|---|---|---|---|---|
| 1 | No DataLoader — each resolver calls DB individually | ~50% of cases | N separate SELECT ... WHERE id = ? in query logs | Wrap in DataLoader with WHERE id IN (?) batch query |
| 2 | DataLoader created globally, not per-request | ~15% of cases | Stale/wrong data returned; data leaks between users | Move DataLoader instantiation into context factory |
| 3 | Batch function returns results in wrong order | ~10% of cases | Data assigned to wrong parent objects | Map results to input keys: keys.map(k => resultMap.get(k)) |
| 4 | Batch function returns wrong array length | ~8% of cases | DataLoader must return array of same length error | Return null for missing keys; match array length to keys |
| 5 | Using .loadMany() where .load() suffices | ~5% of cases | Bypasses per-key deduplication within a tick | Use .load() in resolvers; .loadMany() only for explicit bulk |
| 6 | Nested DataLoaders not batching across depth levels | ~5% of cases | Batch size = 1 for deeply nested fields | Ensure async resolvers; check batchScheduleFn timing |
| 7 | Complex key objects not deduplicating | ~4% of cases | Duplicate DB queries despite same logical key | Provide cacheKeyFn that serializes keys to strings |
| 8 | DataLoader cache hiding database updates | ~3% of cases | Mutations not reflected in subsequent queries | Call loader.clear(key) or loader.clearAll() after mutations |
START
|-- Is your GraphQL server in JavaScript/TypeScript?
| |-- YES --> Use `dataloader` npm package. See "JS/TS: Apollo Server" code example.
| +-- NO |
|-- Is your server in Python with Strawberry?
| |-- YES --> Use `strawberry.dataloader.DataLoader`. See "Python: Strawberry" code example.
| +-- NO |
|-- Is your server in Python with Graphene?
| |-- YES --> Use `aiodataloader` package. See "Python: Graphene" code example.
| +-- NO |
|-- Is your server in Go (gqlgen)?
| |-- YES --> Use `graph-gophers/dataloader` or the `dataloaden` code generator.
| +-- NO |
|-- Is your server using Apollo Federation?
| |-- YES --> Apply DataLoader in each subgraph's reference resolver.
| +-- NO |
+-- DEFAULT --> Implement batch-and-cache pattern manually:
collect keys during resolve phase, batch-fetch before returning.
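The DEFAULT branch above can be sketched as a toy loader — a minimal, illustrative version of the batch-and-cache pattern (a simplified take on what the dataloader package does, not production code; `batchFn` is whatever bulk fetch you supply):

```javascript
// Toy batch-and-cache loader: .load() calls made within the same microtask
// tick are queued, then resolved together with ONE call to batchFn.
class TinyLoader {
  constructor(batchFn) {
    this.batchFn = batchFn;   // keys => Promise<values in key order>
    this.cache = new Map();   // key -> Promise<value> (dedup within a request)
    this.queue = [];          // keys waiting for the next flush
  }
  load(key) {
    if (this.cache.has(key)) return this.cache.get(key); // dedupe repeated keys
    const promise = new Promise((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      // Schedule one flush per batching window, after the current tick's
      // resolvers have all enqueued their keys
      if (this.queue.length === 1) queueMicrotask(() => this.flush());
    });
    this.cache.set(key, promise);
    return promise;
  }
  async flush() {
    const batch = this.queue.splice(0);
    try {
      const results = await this.batchFn(batch.map(item => item.key));
      batch.forEach((item, i) => item.resolve(results[i]));
    } catch (err) {
      batch.forEach(item => item.reject(err));
    }
  }
}
```

Three concurrent .load() calls with a repeated key should produce a single batch of two unique keys — the same behavior the real library provides.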
Enable query logging in your database or GraphQL server to see repeated queries. Look for patterns where the same SELECT statement runs N times with different IDs. [src1]
-- You'll see this pattern in logs (N+1):
SELECT * FROM posts; -- 1 query
SELECT * FROM users WHERE id = 1; -- N queries (one per post)
SELECT * FROM users WHERE id = 2;
SELECT * FROM users WHERE id = 3;
-- ... repeated for every post.authorId
Verify: Count queries per GraphQL request in your DB logs. If count > (unique tables queried * 2), you likely have N+1.
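If DB-side logging isn't available, the count can be taken in-process. A hedged sketch: wrap whatever query function your driver exposes (the `queryFn` parameter here is a stand-in, not a real driver API):

```javascript
// Sketch: count SQL statements issued per GraphQL request by wrapping the
// query function. Read getCount() after the request to spot N+1.
function withQueryCounter(queryFn) {
  let count = 0;
  const counted = async (...args) => {
    count += 1;                 // one increment per statement issued
    return queryFn(...args);
  };
  counted.getCount = () => count;       // inspect after the request completes
  counted.reset = () => { count = 0; }; // reset between requests
  return counted;
}
```

If getCount() scales with the number of list items returned, you have N+1; after adding DataLoader it should stay flat.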
Install the appropriate DataLoader package for your language. [src2]
# JavaScript/TypeScript
npm install dataloader
# Python (Strawberry — built-in, no extra install)
pip install strawberry-graphql
# Python (Graphene)
pip install aiodataloader
Verify: npm list dataloader → dataloader@2.x.x (or later)
The batch function receives an array of keys and must return a Promise/awaitable resolving to an array of results in the same order as the keys. [src2]
// JavaScript — batch loading function
async function batchUsers(userIds) {
  // One query fetches ALL requested users
  const { rows } = await db.query(
    'SELECT * FROM users WHERE id = ANY($1)',
    [userIds]
  );
  // CRITICAL: return results in same order as input keys
  const userMap = new Map(rows.map(u => [u.id, u]));
  return userIds.map(id => userMap.get(id) || null);
}
Verify: batchUsers([3, 1, 2]) returns [user3, user1, user2] — same order as input.
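That verify step can be exercised without a database. The in-memory Map below is a hypothetical stand-in for the real users table; the point is the ordering contract, not the fetch:

```javascript
// Self-contained check of the batch function contract: results must align
// with the input key order, with null for missing keys.
const fakeDb = new Map([
  [1, { id: 1, name: 'Ada' }],
  [2, { id: 2, name: 'Bob' }],
  [3, { id: 3, name: 'Cy' }],
]);

async function batchUsers(userIds) {
  // "Fetch" all matching rows at once; a real DB guarantees no row order here
  const users = [...fakeDb.values()].filter(u => userIds.includes(u.id));
  const userMap = new Map(users.map(u => [u.id, u]));
  // Re-align to input key order, null for misses
  return userIds.map(id => userMap.get(id) || null);
}
```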
Attach fresh DataLoader instances to the GraphQL context so every resolver in a single request shares the same loader (batching + dedup) but different requests get isolated caches. [src1] [src3]
// Apollo Server 4 — context factory
const server = new ApolloServer({ typeDefs, resolvers });
const { url } = await startStandaloneServer(server, {
  context: async () => ({
    loaders: {
      user: new DataLoader(batchUsers),
      post: new DataLoader(batchPosts),
    },
  }),
});
Verify: Add console.log('context created') in the factory — it should log once per HTTP request.
Replace direct database calls in resolvers with loader.load(key). [src1]
const resolvers = {
  Query: {
    posts: async () => (await db.query('SELECT * FROM posts')).rows,
  },
  Post: {
    // BEFORE (N+1): db.query('SELECT * FROM users WHERE id = $1', [post.authorId])
    // AFTER (batched):
    author: (post, _args, { loaders }) => loaders.user.load(post.authorId),
  },
};
Verify: Run the query and check DB logs — you should see exactly 1 SELECT * FROM posts + 1 SELECT * FROM users WHERE id = ANY(...) instead of N+1 queries.
After any create/update/delete operation, clear the relevant DataLoader cache to prevent stale reads within the same request. [src2]
const resolvers = {
  Mutation: {
    updateUser: async (_, { id, input }, { loaders }) => {
      const { rows } = await db.query(
        'UPDATE users SET name = $1 WHERE id = $2 RETURNING *',
        [input.name, id]
      );
      // Clear stale cache entry and prime with fresh data
      loaders.user.clear(id);
      loaders.user.prime(id, rows[0]);
      return rows[0];
    },
  },
};
Verify: After mutation, subsequent resolvers in the same request return updated data.
// Input: GraphQL query { posts { id title author { id name } } }
// Output: 2 DB queries total (1 for posts, 1 for all authors) instead of N+1
import { ApolloServer } from '@apollo/server';
import { startStandaloneServer } from '@apollo/server/standalone';
import DataLoader from 'dataloader';

const batchUsers = async (ids: readonly number[]) => {
  const users = await db.query('SELECT * FROM users WHERE id = ANY($1)', [ids]);
  const map = new Map(users.rows.map((u: any) => [u.id, u]));
  return ids.map(id => map.get(id) || null);
};

const resolvers = {
  Post: {
    author: (post: any, _: any, ctx: any) => ctx.loaders.user.load(post.author_id),
  },
};

const server = new ApolloServer({ typeDefs, resolvers });
const { url } = await startStandaloneServer(server, {
  context: async () => ({
    loaders: { user: new DataLoader(batchUsers) },
  }),
});
# Input: GraphQL query { posts { id title author { id name } } }
# Output: 2 DB queries total instead of N+1
import strawberry
from strawberry.dataloader import DataLoader
from typing import List, Optional
@strawberry.type
class User:
    id: int
    name: str

async def load_users_batch(keys: List[int]) -> List[Optional[User]]:
    rows = await db.fetch('SELECT * FROM users WHERE id = ANY($1)', keys)
    user_map = {row['id']: User(id=row['id'], name=row['name']) for row in rows}
    return [user_map.get(key) for key in keys]

@strawberry.type
class Post:
    id: int
    title: str
    author_id: int

    @strawberry.field
    async def author(self, info: strawberry.Info) -> Optional[User]:
        return await info.context["user_loader"].load(self.author_id)

async def get_context():
    return {"user_loader": DataLoader(load_fn=load_users_batch)}
# Input: GraphQL query { posts { id title author { id name } } }
# Output: 2 DB queries total instead of N+1
from aiodataloader import DataLoader
import graphene
class UserLoader(DataLoader):
    async def batch_load_fn(self, user_ids):
        # One ORM query for all ids; map results back to input key order
        users = {u.id: u for u in await User.objects.filter(id__in=user_ids)}
        return [users.get(uid) for uid in user_ids]

class PostType(graphene.ObjectType):
    id = graphene.Int()
    title = graphene.String()
    author = graphene.Field(lambda: UserType)

    async def resolve_author(self, info):
        return await info.context['user_loader'].load(self.author_id)
// BAD — shared across all requests, leaks user A's data to user B
const globalUserLoader = new DataLoader(batchUsers);
const resolvers = {
  Post: {
    author: (post) => globalUserLoader.load(post.authorId),
  },
};

// GOOD — fresh instance per request, isolated cache
const server = new ApolloServer({ typeDefs, resolvers });
startStandaloneServer(server, {
  context: async () => ({
    loaders: { user: new DataLoader(batchUsers) },
  }),
});
// BAD — results don't match key order, causes data corruption
async function batchUsers(ids) {
  const users = await db.query('SELECT * FROM users WHERE id = ANY($1)', [ids]);
  return users.rows; // Order depends on DB, NOT on input key order!
}

// GOOD — explicitly map to input key order
async function batchUsers(ids) {
  const users = await db.query('SELECT * FROM users WHERE id = ANY($1)', [ids]);
  const map = new Map(users.rows.map(u => [u.id, u]));
  return ids.map(id => map.get(id) || null);
}
// BAD — each post triggers a separate DB query (N+1)
const resolvers = {
  Post: {
    author: async (post) => {
      const { rows } = await db.query('SELECT * FROM users WHERE id = $1', [post.authorId]);
      return rows[0];
    },
  },
};

// GOOD — DataLoader batches all author lookups into one query
const resolvers = {
  Post: {
    author: (post, _, { loaders }) => loaders.user.load(post.authorId),
  },
};
// BAD — loadMany for a single key is wasteful and bypasses per-key dedup
const resolvers = {
  Post: {
    author: (post, _, { loaders }) =>
      loaders.user.loadMany([post.authorId]).then(r => r[0]),
  },
};

// GOOD — load() for resolvers, loadMany() only for explicit batch
const resolvers = {
  Query: {
    users: (_, { ids }, { loaders }) => loaders.user.loadMany(ids),
  },
  Post: {
    author: (post, _, { loaders }) => loaders.user.load(post.authorId),
  },
};
Error "DataLoader must be constructed with a function which accepts Array<key> and returns Promise<Array<value>>": the batch function returned the wrong length or type. Fix: ensure you return null for missing keys — keys.map(k => map.get(k) || null). [src2]
Cached rejections: DataLoader memoizes errors, so subsequent .load() calls for that key return the cached error. Fix: call loader.clear(key) in your error handler, or use new DataLoader(fn, { cache: false }). [src2]
Object keys never deduplicate: two structurally equal {id: 1} objects are different references, so DataLoader treats them as different keys. Fix: provide cacheKeyFn: (key) => key.id or serialize to string. [src2]
Batching window missed: if a resolver does other async work and hits an await before calling .load(), the batching tick may have already fired. Fix: collect all .load() calls synchronously within a resolver, or use Promise.all(). [src1]
Federation N+1: __resolveReference runs once per entity key, recreating N+1 at the subgraph boundary. Fix: use DataLoader in __resolveReference resolvers too. [src3]
Stale reads after writes: call loader.clear(key) after writes, optionally loader.prime(key, newValue). [src2]
Oversized batches: very large batches generate IN (...) clauses that exceed DB limits. Fix: set maxBatchSize: 100. [src2]
# Check if DataLoader is installed (Node.js)
npm list dataloader
# Expected: dataloader@2.x.x (or later)
# Check for N+1 in PostgreSQL logs (enable statement logging first)
# In postgresql.conf: log_statement = 'all'
grep 'SELECT.*FROM users WHERE id =' /var/log/postgresql/postgresql.log | wc -l
# If count >> number of unique IDs, you have N+1
# Enable Apollo Server query logging to detect N+1
DEBUG=knex:query node server.js
# Check DataLoader batch sizes at runtime (add to batch function)
# async function batchUsers(ids) {
# console.log(`DataLoader batch size: ${ids.length}`);
# }
# Python: check aiodataloader version
pip show aiodataloader
# Expected: aiodataloader >= 0.2.0
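The batch-size check sketched in the comments above can be packaged as a small wrapper (the helper name is ours, not part of any package); batch sizes stuck at 1 indicate broken batching, as in cause #6 of the table:

```javascript
// Wrap any batch function to log the batch size it receives, without
// changing its behavior. Useful as a temporary diagnostic.
function logBatchSize(name, batchFn) {
  return async keys => {
    console.log(`${name} batch size: ${keys.length}`);
    return batchFn(keys); // pass through untouched
  };
}
```

Usage: new DataLoader(logBatchSize('user', batchUsers)) — remove once batching is confirmed.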
| Version | Status | Breaking Changes | Migration Notes |
|---|---|---|---|
| dataloader 2.2.x (JS) | Current (2024) | None | Stable since 2.0.0 |
| dataloader 2.0.0 (JS) | LTS | Dropped Node < 10; TypeScript rewrite | Update Node.js; types now built-in |
| dataloader 1.x (JS) | EOL | — | API compatible; bump version |
| strawberry.dataloader (Python) | Current (2025) | None | Built into strawberry-graphql >= 0.100.0 |
| aiodataloader 0.2.x (Python) | Current (2024) | None | For Graphene; requires asyncio |
| graphql-batch (Ruby) | Current (2024) | None | Shopify's alternative; different API |
| Use When | Don't Use When | Use Instead |
|---|---|---|
| Nested GraphQL resolvers fetch related entities by ID | Single resolver fetches one record | Direct database query |
| Same entity is requested multiple times in one query | Data changes between resolver calls within one request | Disable cache or clear after mutation |
| Federated subgraph resolves entity references | You need cross-request persistent caching | Redis/Memcached layer |
| Any list-to-detail resolver pattern (posts → authors) | Query complexity is bounded and small (< 5 items) | Direct fetch is fine for small N |
| You want to decouple data fetching from schema structure | ORM already handles eager loading (e.g., Rails includes) | Use ORM's built-in batching |
Disabling the cache (cache: false) means duplicate keys within the same tick will appear multiple times in the batch function's keys array.
DataLoader schedules batches via process.nextTick (Node.js) or microtask scheduling by default. A custom batchScheduleFn can change the timing but may break batching if the window is too short.
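For reference, the option names mentioned in these notes (as documented by the dataloader npm package) combine like this; only cacheKeyFn is exercised below, since it is plain-function logic:

```javascript
// Options object as you'd pass it to: new DataLoader(batchUsers, loaderOptions)
const loaderOptions = {
  maxBatchSize: 100,                     // cap the size of IN (...) clauses
  cacheKeyFn: key => String(key.id),     // dedupe object keys by their id
  // batchScheduleFn: cb => setTimeout(cb, 0), // widen the batching window (use with care)
};

// Two distinct {id: 1} references serialize to the same cache key, so they dedupe:
const sameKey = loaderOptions.cacheKeyFn({ id: 1 }) === loaderOptions.cacheKeyFn({ id: 1 }); // true
```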