# What is the n + 1 problem in GraphQL?

**The n + 1 problem** in GraphQL happens when fetching a list of N parent objects triggers N separate database queries for their children, plus 1 initial query, totaling N + 1 DB round trips.

```typescript
// 10 users = 11 queries: 1 for users list, 10 for their posts
const resolvers = {
  User: {
    posts: (parent) => db.posts.findMany({ where: { userId: parent.id } })
  }
};
```

**Key fix:** DataLoader batches all child queries into one call, regardless of list size.

**The n + 1 problem** in GraphQL happens when fetching a list of N parent objects triggers N separate database queries for their children, plus the 1 initial parent query, totaling N + 1 round trips to the database instead of 2.

## Theory

### TL;DR

- Think of it like calling 10 pizza shops one by one instead of placing a single bulk order at a warehouse.
- A naive resolver for `User.posts` runs once per user in the list, not once per query.
- 10 users = 11 DB queries: 1 for users, 10 for their posts.
- Fix rule: use DataLoader or ORM batching whenever a resolver returns child fields on a list.
- Single-object queries are fine without batching; lists almost always need it.

### Quick example

```graphql
# This looks like one request to the client...
query {
  users {
    id
    name
    posts { title } # ...but triggers N separate DB calls on the server
  }
}
```

```typescript
// Naive resolver: the source of n + 1
const resolvers = {
  Query: {
    users: async () => db.users.findMany(), // 1 query
  },
  User: {
    // called once per user!
    posts: async (parent) => db.posts.findMany({ where: { userId: parent.id } })
  }
};
// 10 users in the list: 1 + 10 = 11 queries total
// 100 users: 101 queries. The number scales linearly with list size.
```

GraphQL runs `User.posts` once for each user object in the list. There is no built-in batching. The result is 11 DB calls where 2 would be enough.

### Why GraphQL resolvers do this

GraphQL.js resolves the parent list first via `Query.users`, then calls `User.posts` once for each object in that list during field resolution. Each call receives only its own `parent` and has no view of its siblings, so each one issues its own database query. Without explicit batching, the ORM (Prisma, Sequelize) gets N separate queries instead of one `WHERE userId IN (1, 2, 3, ...)`.

This is the core gap: GraphQL's schema lets clients request nested data in a single HTTP call, but by default the server makes a separate backend call per list item.

### When it matters

- Single object query: no problem, just 1 child query runs.
- List under 5 items: acceptable in most prototypes, simple code wins.
- Paginated lists: always batch, even with `first: 10` cursor pagination.
- High-traffic API: DataLoader is not optional here.
- Nested lists (posts, then comments per post): each level multiplies the problem.

### How DataLoader fixes it

DataLoader is a batching and caching utility originally developed at Facebook (now Meta). Instead of hitting the DB per parent, it collects all requested child keys within one event loop tick, then fires a single batched query.
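To make the "collect keys within one tick, then fire once" idea concrete, here is a minimal sketch of the mechanism. It is a simplified illustration, not DataLoader's actual implementation: `TinyLoader`, `fakePosts`, and `runDemo` are hypothetical names, and the sketch leaves out caching, key deduplication, and the more careful scheduling the real library does.

```typescript
// Simplified illustration only: NOT the real DataLoader implementation.
type BatchFn<K, V> = (keys: readonly K[]) => Promise<V[]>;

interface Pending<K, V> {
  key: K;
  resolve: (value: V) => void;
  reject: (error: unknown) => void;
}

class TinyLoader<K, V> {
  private queue: Pending<K, V>[] = [];

  constructor(private batchFn: BatchFn<K, V>) {}

  load(key: K): Promise<V> {
    return new Promise<V>((resolve, reject) => {
      // The first load() of a tick schedules exactly one flush for the whole batch
      if (this.queue.length === 0) {
        Promise.resolve().then(() => this.flush());
      }
      this.queue.push({ key, resolve, reject });
    });
  }

  private async flush(): Promise<void> {
    // Take everything queued so far and start a fresh queue for later loads
    const batch = this.queue;
    this.queue = [];
    try {
      const values = await this.batchFn(batch.map(item => item.key));
      batch.forEach((item, i) => item.resolve(values[i]));
    } catch (error) {
      batch.forEach(item => item.reject(error));
    }
  }
}

// In-memory stand-ins for the posts table and a query log (hypothetical data)
interface Post { userId: number; title: string }
const fakePosts: Post[] = [
  { userId: 1, title: 'a' },
  { userId: 2, title: 'b' },
  { userId: 1, title: 'c' },
];
const batchCalls: (readonly number[])[] = [];

const tinyPostLoader = new TinyLoader<number, Post[]>(async (userIds) => {
  batchCalls.push(userIds); // one entry per simulated DB round trip
  return userIds.map(id => fakePosts.filter(p => p.userId === id));
});

async function runDemo() {
  // Three load() calls issued synchronously, as three User.posts resolvers would do
  const results = await Promise.all([1, 2, 3].map(id => tinyPostLoader.load(id)));
  return { results, batchCalls };
}
```

Because all three `load()` calls happen before the scheduled flush runs, the batch function is invoked once with all three keys. The real library adds per-request caching and error handling on top of this pattern; its usage follows.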
```typescript
import DataLoader from 'dataloader';

// Batch function receives all collected IDs at once
const postLoader = new DataLoader(async (userIds: readonly number[]) => {
  const posts = await db.posts.findMany({
    where: { userId: { in: [...userIds] } }
  });
  // Return posts grouped by userId in the same order as input userIds
  return userIds.map(id => posts.filter(p => p.userId === id));
});

// Resolver now delegates to the loader
const resolvers = {
  User: {
    posts: (parent, _, { loaders }) => loaders.postLoader.load(parent.id)
  }
};
// Result: 1 users query + 1 batched posts query = 2 total, regardless of list size
```

DataLoader collects the requested keys (user IDs) into one call to the batch function and memoizes results within the request lifecycle. The batch function must return one result per key, in the same order as the input keys. The loader instance must live on the request context, not as a module-level singleton, so the cache does not leak between different users' requests.

### Common mistakes

**Mistake 1: assuming GraphQL nesting = one DB call**

```typescript
// Wrong assumption: one schema query = one database query
posts: (parent) => db.posts.findMany({ where: { userId: parent.id } })
// Reality: called N times, once per parent object in the list
```

Resolvers run independently per object. The schema feels like one query; the execution is not.

**Mistake 2: batching only the top level**

DataLoader on `users -> posts` does not protect `posts -> author` from n + 1. Each nesting level needs its own loader.

```typescript
// postLoader handles users → posts correctly
// But Post.author still triggers n + 1 without a separate authorLoader
const resolvers = {
  Post: {
    author: (parent, _, { loaders }) => loaders.authorLoader.load(parent.authorId)
  }
};
```

**Mistake 3: no pagination on list fields**

```graphql
users { posts { title } } # 1000 users = 1001 queries, possible out-of-memory
```

Always add `first: N` or cursor-based pagination to list fields in production schemas.
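A `first` argument only helps if the server also caps it. The sketch below shows one way to do that before the value reaches the database; `clampFirst` and both limits are hypothetical and chosen purely for illustration.

```typescript
// Hypothetical limits, not taken from any particular API
const DEFAULT_PAGE_SIZE = 10;
const MAX_PAGE_SIZE = 50;

// Turn a client-supplied `first` argument into a safe page size
function clampFirst(first?: number | null): number {
  // Missing, non-integer, or non-positive values fall back to the default
  if (first == null || !Number.isInteger(first) || first < 1) {
    return DEFAULT_PAGE_SIZE;
  }
  // Oversized requests are capped instead of being passed through to the DB
  return Math.min(first, MAX_PAGE_SIZE);
}
```

A list resolver could then pass `take: clampFirst(args.first)` to its query, so that a request for `first: 100000` never loads more than `MAX_PAGE_SIZE` rows.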
**Mistake 4: creating DataLoader at the wrong scope**

```typescript
// Wrong: module-level singleton shares cache across all incoming requests
const loader = new DataLoader(batchFn);

// Correct: new instance per request inside the context factory
const context = (req) => ({
  loaders: { postLoader: new DataLoader(batchFn) }
});
```

A shared singleton caches data across different users' requests, so one user can be served data that was loaded for another. That is a data leak. Always create loaders inside the context factory function.

**Mistake 5: ignoring nested n + 1**

```graphql
query {
  users {
    posts {
      comments { body } # comments runs once per post: 10 users x 5 posts each = 50 extra queries
    }
  }
}
```

Even with DataLoader on `posts`, the `comments` resolver still n + 1s without its own loader. For deeply nested schemas, Prisma's `include` can load the whole tree in a single ORM call (Prisma runs a small, fixed number of SQL statements for it, independent of list size), at the cost of always fetching all nested data whether the client asked for it or not.

### Real-world usage

- GitHub GraphQL API: DataLoaders batch `repos -> stargazers` calls across the graph.
- Shopify Hydrogen: Prisma `include` flattens `orders -> lineItems` into a single query.
- Hasura: built-in batching via Postgres CTEs, n + 1 handled at the engine level.
- WPGraphQL: custom resolvers with `WP_Query` batching.
- Decision point: ORM `include` for monolithic Prisma schemas; DataLoader for cross-service or microservice graphs.

| Approach | DB calls | Setup cost | Best for |
|---|---|---|---|
| Naive resolvers | N + 1 | None | Prototypes only |
| Prisma `include` | 1 ORM call (fixed number of SQL statements) | Low | Monolith + Prisma |
| DataLoader | 1 per nesting level (2 for users + posts) | Medium | Any GraphQL server |
| Apollo Federation | Varies | High | Distributed graphs |

### Follow-up questions

**Q:** How does DataLoader know when to fire the batch query?

**A:** It collects all `.load(key)` calls made within the same event loop tick, then fires the batch function once that tick's synchronous work is done. This is why it works transparently inside resolver chains without any manual coordination.

**Q:** What is the difference between the n + 1 problem and a waterfall?

**A:** N + 1 means N parallel-ish queries for sibling objects (all post queries for all users at the same level). A waterfall is sequential, level-by-level resolution where each level waits for the previous one (first posts, then comments for each post). Nested n + 1 combines both patterns.

**Q:** How do you detect n + 1 in production?

**A:** Apollo Studio's query tracing shows resolver timing and call counts. Prisma can log each SQL statement when query logging is enabled. If you see 11 nearly identical queries differing only in an ID parameter, that is n + 1.

**Q:** Does Apollo Federation prevent n + 1 between subgraphs?

**A:** The gateway batches top-level entity lookups, but each subgraph can still n + 1 internally. Each service needs its own DataLoader setup.

**Q:** Senior: how would you optimize a GraphQL endpoint serving 1M requests per day with deeply nested queries?

**A:** Denormalize hot paths by embedding child data directly in the parent document. Use persisted queries to avoid repeated parsing overhead. Add CDN caching for public queries. Apply DataLoader at every resolver level that resolves a list. Enforce query complexity limits to block arbitrarily deep nesting before it hits the database.

## Examples

### Basic: the problem with a list of orders

```typescript
// Apollo Server + Prisma, naive implementation
const resolvers = {
  Query: {
    orders: () => prisma.order.findMany({ take: 10 }), // 1 query
  },
  Order: {
    items: (parent) =>
      prisma.orderItem.findMany({ where: { orderId: parent.id } }), // 10 queries!
  },
};
// Total with 10 orders: 11 DB calls, each one a separate round trip to the database
```

With 10 orders the server fires 11 queries. Pagination alone does not fix this: without DataLoader, every page load still costs N + 1 queries.
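The child queries in this example differ only in the `orderId` value, which is what makes the pattern detectable in a query log. As a rough sketch, a helper can strip literal values and count how often each query shape repeats. The function names, the threshold, and the sample log lines below are all hypothetical; a real log from an ORM will look different and may need different normalization rules.

```typescript
// Replace literal values with "?" so queries that differ only in parameters look identical
function normalizeSql(sql: string): string {
  return sql
    .replace(/'[^']*'/g, '?') // quoted string literals
    .replace(/\b\d+\b/g, '?') // standalone numeric literals
    .replace(/\s+/g, ' ')
    .trim();
}

// Return every query shape that appears at least `threshold` times in the log
function findRepeatedQueries(log: readonly string[], threshold = 5): Map<string, number> {
  const counts = new Map<string, number>();
  for (const sql of log) {
    const shape = normalizeSql(sql);
    counts.set(shape, (counts.get(shape) ?? 0) + 1);
  }
  const suspects = new Map<string, number>();
  counts.forEach((count, shape) => {
    if (count >= threshold) suspects.set(shape, count);
  });
  return suspects;
}

// Made-up log shaped like the example above: 1 parent query + 10 child queries
const sampleLog: string[] = [
  'SELECT * FROM orders LIMIT 10',
  ...Array.from({ length: 10 }, (_, i) => `SELECT * FROM order_items WHERE order_id = ${i + 1}`),
];

const suspects = findRepeatedQueries(sampleLog);
```

Here `suspects` holds a single entry, the `order_items` query shape with a count of 10, which is the n + 1 signature. A check like this is a heuristic: legitimately repeated queries will also be flagged, so the threshold needs tuning per application.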
### Intermediate: DataLoader in an Apollo Server context factory

```typescript
// context.ts - loaders created fresh per request
import DataLoader from 'dataloader';
import { prisma } from './db';

export function createLoaders() {
  return {
    orderItemLoader: new DataLoader(async (orderIds: readonly string[]) => {
      const items = await prisma.orderItem.findMany({
        where: { orderId: { in: [...orderIds] } },
      });
      // One array per order ID, in the same order as the input IDs
      return orderIds.map(id => items.filter(i => i.orderId === id));
    }),
  };
}

// server.ts (Apollo Server 2/3 style; Apollo Server 4 takes the context function
// in startStandaloneServer or expressMiddleware instead of the constructor)
import { ApolloServer } from 'apollo-server';
import { prisma } from './db';
import { createLoaders } from './context';
import { typeDefs } from './schema'; // assumed location of the schema definition

const server = new ApolloServer({
  typeDefs,
  resolvers: {
    Query: {
      orders: () => prisma.order.findMany({ take: 10 }),
    },
    Order: {
      items: (parent, _, { loaders }) =>
        loaders.orderItemLoader.load(parent.id), // batched automatically
    },
  },
  context: ({ req }) => ({ loaders: createLoaders() }),
});
// Total: 2 DB queries regardless of how many orders are returned.
```

The key detail: `createLoaders()` is called once per request inside `context`, not once at server startup. Each request gets its own isolated cache.
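The batch function above rescans `items` once per order ID, which is fine for small pages but grows with batch size times row count. A possible alternative is to group rows in a single pass with a `Map`. The helper below is a sketch; `groupByKey` and the sample rows are hypothetical names introduced here, not part of DataLoader or Prisma.

```typescript
// Group rows by key in one pass, then return one array per requested key,
// in the same order as `keys` (the contract a DataLoader batch function must meet)
function groupByKey<K, T>(
  keys: readonly K[],
  rows: readonly T[],
  getKey: (row: T) => K,
): T[][] {
  const buckets = new Map<K, T[]>();
  for (const row of rows) {
    const key = getKey(row);
    const bucket = buckets.get(key);
    if (bucket) {
      bucket.push(row);
    } else {
      buckets.set(key, [row]);
    }
  }
  // Keys with no matching rows get an empty array, never undefined
  return keys.map(key => buckets.get(key) ?? []);
}

// Sample rows standing in for the result of the batched orderItem query
interface OrderItemRow { orderId: string; sku: string }
const sampleItems: OrderItemRow[] = [
  { orderId: 'a', sku: 'x' },
  { orderId: 'b', sku: 'y' },
  { orderId: 'a', sku: 'z' },
];

const grouped = groupByKey(['b', 'c', 'a'], sampleItems, row => row.orderId);
```

Inside a batch function, the final line would become `return groupByKey(orderIds, items, i => i.orderId);`. Returning an empty array for keys without rows matters: the result must have exactly one entry per input key.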
### Senior: nested n + 1 across three resolver levels

```typescript
// Schema: User -> Post -> Comment (3 levels deep)
const resolvers = {
  Query: {
    users: () => db.users.findMany(), // 1 query
  },
  User: {
    posts: (parent, _, { loaders }) =>
      loaders.postLoader.load(parent.id), // batched: 1 query
  },
  Post: {
    comments: (parent, _, { loaders }) =>
      loaders.commentLoader.load(parent.id), // batched: 1 query
  },
};

// For the query: query { users { posts { comments { body } } } }
// With no loaders at all: 1 + N(users) + N(posts) queries total
// With postLoader but no commentLoader: 1 + 1 + N(posts) queries total
// With both loaders: always exactly 3 queries, regardless of data size

// Alternative with Prisma include (simpler but less flexible):
const users = await prisma.user.findMany({
  include: { posts: { include: { comments: true } } }
});
// 1 ORM call (a fixed number of SQL statements), but returns all nested data
// even if the client only asked for post titles
```

The Prisma `include` approach breaks down when the schema grows and different clients request different nesting depths. DataLoader per level gives more control, but it requires discipline: every new resolver that returns a list from a parent object needs its own loader. Skip one level and the problem comes back.