What is the n + 1 problem in GraphQL?


The n + 1 problem in GraphQL happens when fetching a list of N parent objects triggers N separate database queries for their children, plus the 1 initial parent query, totaling N + 1 round trips to the database instead of 2.

Theory

TL;DR

Think of it like calling 10 pizza shops one by one instead of placing a single bulk order at a warehouse.
A naive resolver for User.posts runs once per user in the list, not once per query.
10 users = 11 DB queries: 1 for users, 10 for their posts.
Fix rule: use DataLoader or ORM batching whenever a resolver returns child fields on a list.
Single-object queries are fine without batching; lists almost always need it.

Quick example

graphql

# This looks like one request to the client...
query {
  users {
    id
    name
    posts { title }  # ...but triggers N separate DB calls on the server
  }
}

typescript

// Naive resolver: the source of n + 1
const resolvers = {
  Query: {
    users: async () => db.users.findMany(),             // 1 query
  },
  User: {
    posts: async (parent) =>
      db.posts.findMany({ where: { userId: parent.id } }) // called once per user!
  }
};
// 10 users in the list: 1 + 10 = 11 queries total
// 100 users: 101 queries. The number scales linearly with list size.

GraphQL runs User.posts once for each user object in the list. There is no built-in batching. For 10 users, the result is 11 DB calls where 2 would be enough.

Why GraphQL resolvers do this

GraphQL.js resolves a query field by field. It fetches the parent list first via Query.users, then calls User.posts separately for each parent object during field resolution. Each resolver call receives only its own parent and knows nothing about its siblings, so each one issues its own database query. Without explicit batching, the ORM (Prisma, Sequelize) gets N separate queries instead of one WHERE userId IN (1, 2, 3, ...).

This is the core gap: GraphQL's schema lets clients request nested data in a single HTTP call, but the server executes separate backend calls per list item by default.
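To make the per-item execution concrete, here is a minimal sketch that counts calls against an in-memory stand-in for the database. The names fakeDb, executeNaive, and the sample rows are hypothetical and exist only for illustration; the function mimics what the executor does for users { posts { title } }, it is not GraphQL.js itself.

```typescript
type User = { id: number; name: string };
type Post = { userId: number; title: string };

// Hypothetical in-memory data: 10 users with one post each.
const userRows: User[] = [];
const postRows: Post[] = [];
for (let id = 1; id <= 10; id++) {
  userRows.push({ id, name: `user${id}` });
  postRows.push({ userId: id, title: `post by user${id}` });
}

// Hypothetical stand-in for a database client that counts every call.
let queryCount = 0;
const fakeDb = {
  findUsers(): User[] {
    queryCount++;
    return userRows;
  },
  findPostsByUser(userId: number): Post[] {
    queryCount++;
    return postRows.filter(p => p.userId === userId);
  },
};

// Mimics the executor: resolve the list once, then resolve `posts` per item.
function executeNaive(): { name: string; posts: Post[] }[] {
  const parents = fakeDb.findUsers(); // 1 query
  return parents.map(user => ({
    name: user.name,
    posts: fakeDb.findPostsByUser(user.id), // 1 query per user
  }));
}

const naiveResult = executeNaive();
// queryCount is now 11: 1 for the list plus 10 for the children.
```

The counter grows with the length of the parent list, which is exactly the behavior a batching layer is meant to remove.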

When it matters

Single object query: no problem, just 1 child query runs.
List under 5 items: acceptable in most prototypes, simple code wins.
Paginated lists: always batch, even with first: 10 cursor pagination.
High-traffic API: DataLoader is not optional here.
Nested lists (posts, then comments per post): each level multiplies the problem.
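The multiplication in the last point can be checked with a little arithmetic. The sketch below assumes a uniform fan-out per level (every parent has the same number of children), which real data rarely has; naiveQueryCount and batchedQueryCount are hypothetical helper names.

```typescript
// childrenPerParent[i] is the fan-out at nesting level i.
// Example: [10, 5] means each user has 10 posts and each post has 5 comments.
function naiveQueryCount(rootCount: number, childrenPerParent: number[]): number {
  let total = 1; // the root list query
  let parents = rootCount;
  for (const fanout of childrenPerParent) {
    total += parents; // one child query per parent object at this level
    parents *= fanout; // those children become the parents of the next level
  }
  return total;
}

// With one loader per level, each level costs a single batched query.
function batchedQueryCount(childrenPerParent: number[]): number {
  return 1 + childrenPerParent.length;
}

const naive = naiveQueryCount(10, [10, 5]); // 1 + 10 + 100 = 111
const batched = batchedQueryCount([10, 5]); // 1 + 2 = 3
```

Note that the fan-out of the deepest level does not add queries by itself: the number of comments queries equals the number of posts, not the number of comments.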

How DataLoader fixes it

DataLoader is a batching and caching utility originally developed at Facebook (now Meta). Instead of hitting the DB per parent, it collects all requested child keys within one event loop tick, then fires a single batched query.

typescript

import DataLoader from 'dataloader';

// Batch function receives all collected IDs at once.
// Wrapped in a factory so every request can get its own loader and cache.
const createPostLoader = () =>
  new DataLoader(async (userIds: readonly number[]) => {
    const posts = await db.posts.findMany({
      where: { userId: { in: [...userIds] } }
    });
    // Return posts grouped by userId in the same order as input userIds
    return userIds.map(id => posts.filter(p => p.userId === id));
  });

// Resolver now delegates to the loader found on the request context,
// e.g. context: () => ({ loaders: { postLoader: createPostLoader() } })
const resolvers = {
  User: {
    posts: (parent, _, { loaders }) => loaders.postLoader.load(parent.id)
  }
};
// Result: 1 users query + 1 batched posts query = 2 total, regardless of list size

DataLoader deduplicates identical keys (userIds), passes all collected keys to a single call of the batch function, and memoizes results within the request lifecycle. The loader instance must live on the request context, not as a module-level singleton, so the cache does not leak between different users' requests.
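The collect-then-dispatch idea can be sketched without the library. TinyLoader below is a hypothetical, heavily simplified stand-in and not DataLoader's actual implementation: it has no cache, no key deduplication, and it schedules the dispatch on a plain microtask, whereas the real library schedules more carefully.

```typescript
type BatchFn<K, V> = (keys: readonly K[]) => Promise<V[]>;
type Pending<K, V> = {
  key: K;
  resolve: (value: V) => void;
  reject: (error: unknown) => void;
};

class TinyLoader<K, V> {
  private batchFn: BatchFn<K, V>;
  private queue: Pending<K, V>[] = [];

  constructor(batchFn: BatchFn<K, V>) {
    this.batchFn = batchFn;
  }

  load(key: K): Promise<V> {
    return new Promise<V>((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      // First key of a new batch: schedule one dispatch to run after the
      // current synchronous work, so sibling calls can add their keys first.
      if (this.queue.length === 1) {
        Promise.resolve().then(() => this.dispatch());
      }
    });
  }

  private dispatch(): void {
    const batch = this.queue;
    this.queue = [];
    // One batch call for every key collected so far; results map back by index.
    this.batchFn(batch.map(item => item.key)).then(
      values => batch.forEach((item, i) => item.resolve(values[i])),
      error => batch.forEach(item => item.reject(error)),
    );
  }
}

// Usage: three sibling "resolver" calls in the same tick share one batch.
let batchCalls = 0;
const postTitlesByUser = new TinyLoader<number, string[]>(async userIds => {
  batchCalls++;
  return userIds.map(id => [`post for user ${id}`]);
});

const demo = Promise.all([1, 2, 3].map(id => postTitlesByUser.load(id)));
```

Once demo settles, batchCalls is 1 even though load was called three times. The index-based mapping in dispatch is also why the batch function has to return values in the same order as the keys it received.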

Common mistakes

Mistake 1: assuming GraphQL nesting = one DB call

typescript

// Wrong assumption: one schema query = one database query
posts: (parent) => db.posts.findMany({ where: { userId: parent.id } })
// Reality: called N times, once per parent object in the list

Resolvers run independently per object. The schema feels like one query; the execution is not.

Mistake 2: batching only the top level

DataLoader on users -> posts does not protect posts -> author from n + 1. Each nesting level needs its own loader.

typescript

// postLoader handles users → posts, but it does not cover Post.author.
// Post.author needs its own authorLoader, or it runs one query per post.
Post: {
  author: (parent, _, { loaders }) => loaders.authorLoader.load(parent.authorId)
}

Mistake 3: no pagination on list fields

graphql

users { posts { title } }  # 1000 users = 1001 queries, possible out-of-memory

Always add first: N or cursor-based pagination to list fields in production schemas.

Mistake 4: creating DataLoader at the wrong scope

typescript

// Wrong: module-level singleton shares cache across all incoming requests
const loader = new DataLoader(batchFn);

// Correct: new instance per request inside the context factory
const context = (req) => ({
  loaders: { postLoader: new DataLoader(batchFn) }
});

A shared singleton caches data across different users' requests, which is a data leak. Always create loaders inside the context factory function.
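The leak is easiest to see when authorization is applied at fetch time and the cache is keyed only by id. The following sketch uses a plain Map in place of a loader cache; Viewer, fetchProfile, and loadProfile are hypothetical names, and the redaction rule is an assumption made for the example.

```typescript
type Viewer = { id: number; isAdmin: boolean };
type Profile = { id: number; email: string };

// Hypothetical fetch that redacts the email unless the viewer may see it.
function fetchProfile(viewer: Viewer, id: number): Profile {
  const allowed = viewer.isAdmin || viewer.id === id;
  return { id, email: allowed ? `user${id}@example.com` : 'hidden' };
}

// Memoizing load keyed only by id, the way a loader cache is keyed.
function loadProfile(cache: Map<number, Profile>, viewer: Viewer, id: number): Profile {
  const hit = cache.get(id);
  if (hit) return hit;
  const profile = fetchProfile(viewer, id);
  cache.set(id, profile);
  return profile;
}

const admin: Viewer = { id: 1, isAdmin: true };
const guest: Viewer = { id: 2, isAdmin: false };

// Wrong: one cache shared by both requests.
const sharedCache = new Map<number, Profile>();
loadProfile(sharedCache, admin, 7); // admin's request warms the cache
const leaked = loadProfile(sharedCache, guest, 7); // guest gets the admin's copy

// Correct: a fresh cache per request.
const isolated = loadProfile(new Map<number, Profile>(), guest, 7);
```

Here leaked.email holds the full address while isolated.email is 'hidden', because the shared cache never re-runs the authorization check for the second viewer.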

Mistake 5: ignoring nested n + 1

graphql

query {
  users {
    posts {
      comments { body }  # one query per post: 10 users x 10 posts each = 100 extra queries
    }
  }
}

Even with DataLoader on posts, the comments resolver still runs one query per post unless it has its own loader. For deeply nested schemas, Prisma's include can fetch the whole tree in a single Prisma call (executed as a small, fixed number of SQL statements rather than one per parent), at the cost of always fetching all nested data whether the client asked for it or not.

Real-world usage

GitHub GraphQL API: DataLoaders batch repos -> stargazers calls across the graph.
Shopify Hydrogen: Prisma include flattens orders -> lineItems into a single query.
Hasura: built-in batching via Postgres CTEs, n + 1 handled at the engine level.
WPGraphQL: custom resolvers with WP_Query batching.
Decision point: ORM include for monolithic Prisma schemas; DataLoader for cross-service or microservice graphs.

Approach	DB calls	Setup cost	Best for
Naive resolvers	N + 1	None	Prototypes only
Prisma `include`	1 call (fixed number of SQL statements)	Low	Monolith + Prisma
DataLoader	1 per nesting level	Medium	Any Node.js GraphQL server
Apollo Federation	Varies	High	Distributed graphs

Follow-up questions

Q: How does DataLoader know when to fire the batch query?
A: It collects all .load(key) calls made within the same event loop tick, then fires the batch function on the next tick. This is why it works transparently inside resolver chains without any manual coordination.

Q: What is the difference between the n + 1 problem and a waterfall?
A: N + 1 means N near-identical queries for sibling objects at the same level (one posts query per user), which can run concurrently. A waterfall is sequential resolution across levels: the comments queries cannot start until the posts have resolved. Nested n + 1 combines both patterns.

Q: How do you detect n + 1 in production?
A: Apollo Studio's query tracing shows resolver timing and call counts. Prisma logs each SQL statement. If you see 11 nearly identical queries differing only in an ID parameter, that is n + 1.
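The "nearly identical queries" check can be automated over a captured SQL log. This is a rough sketch, assuming the log is available as an array of SQL strings; findRepeatedQueryShapes and the sample statements are hypothetical, and the regex normalization only handles numeric and single-quoted literals.

```typescript
// Replace literals with placeholders so queries differing only in
// parameters collapse into the same "shape", then count each shape.
function findRepeatedQueryShapes(sqlLog: string[], threshold: number = 5): Map<string, number> {
  const counts = new Map<string, number>();
  for (const sql of sqlLog) {
    const shape = sql
      .replace(/'[^']*'/g, '?') // string literals
      .replace(/\b\d+\b/g, '?') // standalone numeric literals
      .trim();
    counts.set(shape, (counts.get(shape) ?? 0) + 1);
  }
  const repeated = new Map<string, number>();
  counts.forEach((count, shape) => {
    if (count >= threshold) repeated.set(shape, count);
  });
  return repeated;
}

// Hypothetical log captured while serving one GraphQL request.
const sampleLog: string[] = ['SELECT * FROM users LIMIT 10'];
for (let id = 1; id <= 10; id++) {
  sampleLog.push(`SELECT * FROM posts WHERE user_id = ${id}`);
}

const suspects = findRepeatedQueryShapes(sampleLog);
// suspects has one entry: 'SELECT * FROM posts WHERE user_id = ?' with count 10
```

Running this per request rather than over a whole log file keeps unrelated requests from inflating the counts.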

Q: Does Apollo Federation prevent n + 1 between subgraphs?
A: The gateway batches top-level entity lookups, but each subgraph can still n + 1 internally. Each service needs its own DataLoader setup.

Q: Senior: how would you optimize a GraphQL endpoint serving 1M requests per day with deeply nested queries?
A: Denormalize hot paths by embedding child data directly in the parent document. Use persisted queries to avoid repeated parsing overhead. Add CDN caching for public queries. Apply DataLoader at every resolver level that resolves a list. Enforce query complexity limits to block arbitrarily deep nesting before it hits the database.
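As a rough illustration of the depth limit mentioned in that answer, the sketch below counts brace nesting in the raw query string. It is deliberately crude: a production server would walk the parsed query AST in a validation rule, and this version miscounts if a string argument contains braces. maxBraceDepth and isDepthAllowed are hypothetical helpers.

```typescript
// Returns the deepest level of `{ ... }` nesting in the query text.
function maxBraceDepth(query: string): number {
  let depth = 0;
  let max = 0;
  for (let i = 0; i < query.length; i++) {
    const ch = query.charAt(i);
    if (ch === '{') {
      depth++;
      if (depth > max) max = depth;
    } else if (ch === '}') {
      depth--;
    }
  }
  return max;
}

// Reject queries before execution when they nest deeper than the limit.
function isDepthAllowed(query: string, limit: number): boolean {
  return maxBraceDepth(query) <= limit;
}

const nestedQuery = 'query { users { posts { comments { body } } } }';
const nestedDepth = maxBraceDepth(nestedQuery); // 4 selection sets deep
const allowed = isDepthAllowed(nestedQuery, 3); // false
```

The check runs before any resolver is called, so a rejected query costs no database work at all.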

Examples

Basic: the problem with a list of orders

typescript

// Apollo Server + Prisma, naive implementation
const resolvers = {
  Query: {
    orders: () => prisma.order.findMany({ take: 10 }), // 1 query
  },
  Order: {
    items: (parent) =>
      prisma.orderItem.findMany({ where: { orderId: parent.id } }), // 10 queries!
  },
};
// Total with 10 orders: 11 DB calls, each adding its own round trip to the response time

With 10 orders the server fires 11 queries. Add pagination and forget DataLoader and you still pay N + 1 for every page load.

Intermediate: DataLoader in an Apollo Server context factory

typescript

// context.ts - loaders created fresh per request
import DataLoader from 'dataloader';
import { prisma } from './db';

export function createLoaders() {
  return {
    orderItemLoader: new DataLoader(async (orderIds: readonly string[]) => {
      const items = await prisma.orderItem.findMany({
        where: { orderId: { in: [...orderIds] } },
      });
      return orderIds.map(id => items.filter(i => i.orderId === id));
    }),
  };
}

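// server.ts (Apollo Server 3 style; in Apollo Server 4 the context function
// is passed to startStandaloneServer or expressMiddleware instead)
import { ApolloServer } from 'apollo-server';
import { createLoaders } from './context';
import { prisma } from './db';
// typeDefs is assumed to be defined or imported elsewhere

const server = new ApolloServer({
  typeDefs,
  resolvers: {
    Query: {
      orders: () => prisma.order.findMany({ take: 10 }),
    },
    Order: {
      items: (parent, _, { loaders }) =>
        loaders.orderItemLoader.load(parent.id), // batched automatically
    },
  },
  context: ({ req }) => ({ loaders: createLoaders() }), // fresh loaders per request
});
// Total: 2 DB queries regardless of how many orders are returned, so latency no longer grows with list size.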

The key detail: createLoaders() is called once per request inside context, not once at server startup. Each request gets its own isolated cache.
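The batch function above groups rows with a filter inside a map, which rescans every row for each key. For larger batches, a single-pass grouping keeps the same contract (one array per key, in key order, empty when nothing matched). This is a sketch; groupInKeyOrder and the sample rows are hypothetical and not part of DataLoader or Prisma.

```typescript
// Groups rows by key in one pass and returns the groups aligned with `keys`.
function groupInKeyOrder<K, Row>(
  keys: readonly K[],
  rows: readonly Row[],
  getKey: (row: Row) => K,
): Row[][] {
  const buckets = new Map<K, Row[]>();
  for (const key of keys) {
    buckets.set(key, []); // every requested key gets a bucket, even if empty
  }
  for (const row of rows) {
    const bucket = buckets.get(getKey(row));
    if (bucket) bucket.push(row);
  }
  return keys.map(key => buckets.get(key) ?? []);
}

// Usage with hypothetical order items; order 'b' has no items.
type OrderItem = { orderId: string; sku: string };
const sampleItems: OrderItem[] = [
  { orderId: 'a', sku: 'x' },
  { orderId: 'c', sku: 'y' },
  { orderId: 'a', sku: 'z' },
];
const grouped = groupInKeyOrder(['c', 'a', 'b'], sampleItems, item => item.orderId);
// grouped[0] holds the item for 'c', grouped[1] both items for 'a', grouped[2] is empty
```

Inside the loader, the last line of the batch function would become a call to this helper with orderIds, items, and i => i.orderId.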

Senior: nested n + 1 across three resolver levels

typescript

// Schema: User -> Post -> Comment (3 levels deep)
const resolvers = {
  Query: {
    users: () => db.users.findMany(),                             // 1 query
  },
  User: {
    posts: (parent, _, { loaders }) =>
      loaders.postLoader.load(parent.id),                         // batched: 1 query
  },
  Post: {
    comments: (parent, _, { loaders }) =>
      loaders.commentLoader.load(parent.id),                      // batched: 1 query
  },
};

// Without any loaders, this query:
// query { users { posts { comments { body } } } }
// = 1 + N(users) + N(posts) queries total
// With only postLoader: 1 + 1 + N(posts), because comments still run once per post
// With both loaders: always exactly 3 queries, regardless of data size

// Alternative with Prisma include (simpler but less flexible):
const users = await prisma.user.findMany({
  include: { posts: { include: { comments: true } } }
});
// 1 Prisma call with a fixed number of SQL statements, but it returns all nested data even if the client only asked for post titles

The Prisma include approach breaks down when the schema grows and different clients request different nesting depths. DataLoader per level gives more control, but it requires discipline: every new resolver that returns a list from a parent object needs its own loader. Skip one level and the problem comes back.
