**Microservices architecture** is an architectural style in which an application is built as a collection of small, independently deployable services, each responsible for a specific business function and each owning its own database.

## Theory

### TL;DR

- Each service does one thing and deploys without touching the others
- Services communicate over HTTP/REST, gRPC, or message queues like Kafka or RabbitMQ
- Every service owns its own database - sharing databases between services is an antipattern
- The hard part is not building services, it is handling failures in a distributed system
- Conway's Law: your microservices will mirror your org chart, whether you plan it or not

### Quick Example

```javascript
// Order Service - handles only order logic
// Runs on port 3001, has its own PostgreSQL instance
// Assumes an Express `app` with express.json() enabled, plus `OrderDB` and
// `kafka` wrappers defined elsewhere

app.post('/orders', async (req, res) => {
  const { userId, productId, quantity } = req.body;

  // Write to OWN database only
  const order = await OrderDB.create({ userId, productId, quantity });

  // Notify Inventory Service via message queue, not a direct DB call
  await kafka.send('order.created', { orderId: order.id, productId, quantity });

  res.json({ orderId: order.id });
});
```

The Order Service never touches the Inventory database directly. It publishes an event and the Inventory Service reacts to it. That decoupling is the entire idea.
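
The consuming side might look like the sketch below. It uses Node's built-in `EventEmitter` as an in-process stand-in for the broker so that it runs without Kafka. The names `bus`, `stock`, `processedOrders`, and `handleOrderCreated` are hypothetical, and the `Map` stands in for Inventory's own database. Brokers are commonly configured for at-least-once delivery, so the handler skips events it has already applied.

```javascript
const { EventEmitter } = require('node:events');

// In-process stand-in for the message broker (assumption: real code would use a Kafka client).
const bus = new EventEmitter();

// Inventory Service state: its OWN data, never read directly by Order Service.
const stock = new Map([['sku-1', 10]]);
const processedOrders = new Set();

// Inventory Service reacts to 'order.created' events.
function handleOrderCreated({ orderId, productId, quantity }) {
  if (processedOrders.has(orderId)) return; // duplicate delivery: already applied
  processedOrders.add(orderId);
  stock.set(productId, (stock.get(productId) ?? 0) - quantity);
}
bus.on('order.created', handleOrderCreated);

// Order Service side: publish and move on.
bus.emit('order.created', { orderId: 'o-1', productId: 'sku-1', quantity: 2 });
bus.emit('order.created', { orderId: 'o-1', productId: 'sku-1', quantity: 2 }); // redelivery

console.log(stock.get('sku-1')); // 8, the duplicate was ignored
```

Keying the duplicate check on `orderId` is one simple option; a real service would persist that set alongside the stock change so both survive a restart.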

### Monolith vs Microservices

A monolith runs everything in one process. The same codebase handles users, products, orders, and payments. You deploy all of it at once. When the payment module has a bug, you redeploy the entire application. When one team pushes a broken change, everyone is blocked.

Microservices split that monolith along business boundaries. User Service, Order Service, and Payment Service are separate processes, separate deployments, and separate databases. A crash in Payment Service does not have to take down Orders, as long as Orders does not block on it. A deployment in User Service does not require a release meeting with three other teams.

The trade-off is real. You gain isolation and independent scaling, but you add network hops, distributed tracing, and eventual consistency to every interaction that was previously a simple local function call.

### Service Communication

Two main models:

**Synchronous** (REST, gRPC): Service A calls Service B and waits for a response. Simple to understand, but if B is slow or down, A stalls too unless it enforces a timeout.

**Asynchronous** (Kafka, RabbitMQ, SQS): Service A publishes an event and moves on. Service B processes it when ready. Higher resilience, but with eventual consistency - you cannot immediately confirm the downstream result.

gRPC is typically faster than REST for internal service-to-service calls because it sends compact binary Protocol Buffers messages over HTTP/2 instead of text JSON. Many teams use REST for public APIs and gRPC internally.
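
A synchronous caller can at least cap how long it waits. The sketch below is one minimal way to do that with plain promises. `withTimeout`, `slowServiceB`, and `callB` are hypothetical names, and the 200 ms delay simulates a slow dependency.

```javascript
// Reject if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical slow dependency standing in for Service B.
const slowServiceB = () => new Promise((resolve) => setTimeout(() => resolve('ok'), 200));

async function callB() {
  try {
    return await withTimeout(slowServiceB(), 50);
  } catch (err) {
    return 'fallback'; // Service A degrades instead of hanging
  }
}

callB().then((result) => console.log(result)); // "fallback"
```

`Promise.race` only stops waiting; it does not cancel the underlying request. With `fetch`, passing `signal: AbortSignal.timeout(ms)` aborts the request as well.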

### Database Per Service

This is not a suggestion. If two services share a database, they are not truly independent. A schema change in one service can break the other. You cannot scale them separately. You cannot replace one service's database technology without affecting the other team's code.

In practice: Order Service gets its own PostgreSQL, Inventory Service might use MongoDB, Payment Service uses a separate PostgreSQL instance with strict transaction guarantees. Each team owns its schema completely.

The side effect: no ACID transactions across service boundaries. The usual answer is the Saga pattern: each service performs a local transaction and publishes an event, and compensating transactions undo the earlier steps when a later one fails.
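
The compensation order is easier to see in code. The sketch below runs every step from one in-process function purely to make that order visible; in an event-driven saga the same sequence emerges from events between services. `runSaga` and the step names are hypothetical, and each `action` stands in for a local transaction in a different service.

```javascript
// Run steps in order; on failure, run the compensations of completed steps in reverse.
async function runSaga(steps) {
  const completed = [];
  try {
    for (const step of steps) {
      await step.action();
      completed.push(step);
    }
    return { status: 'committed' };
  } catch (err) {
    for (const step of completed.reverse()) {
      await step.compensate();
    }
    return { status: 'rolled_back', reason: err.message };
  }
}

// Hypothetical steps: order, then inventory, then a payment that fails.
const log = [];
const steps = [
  { action: async () => log.push('order created'),  compensate: async () => log.push('order cancelled') },
  { action: async () => log.push('stock reserved'), compensate: async () => log.push('stock released') },
  { action: async () => { throw new Error('card declined'); }, compensate: async () => {} },
];

runSaga(steps).then((result) => {
  console.log(result); // { status: 'rolled_back', reason: 'card declined' }
  console.log(log);    // order created, stock reserved, stock released, order cancelled
});
```

Compensations are new forward actions, not database rollbacks, so each one has to be safe to run after other services have already seen the original step.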

### Key Infrastructure Patterns

**API Gateway:** the single entry point for external clients. It handles routing, authentication, rate limiting, and request aggregation. Kong, AWS API Gateway, and NGINX are common choices.

**Service Discovery:** how does Service A know where Service B is running? Consul or Kubernetes handles this automatically. Services register themselves, and others find them by name rather than hardcoded IP addresses.

**Circuit Breaker:** when Service B keeps failing, the Circuit Breaker stops Service A from hammering it with requests. After a failure threshold is crossed, it opens the circuit and returns a fallback response immediately. After a cool-down it lets a trial request through (the half-open state) and closes again if that request succeeds. Resilience4j is standard for Java; `opossum` is a solid choice for Node.js.
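
A circuit breaker is a small state machine: closed (calls pass through), open (calls fail fast), and half-open (a trial call is allowed after a cool-down). The sketch below is a deliberately simplified illustration, not how any particular library is implemented. `SimpleBreaker` is a hypothetical class that counts consecutive failures, whereas production libraries usually track failure rates over a time window.

```javascript
// Minimal circuit breaker: CLOSED -> OPEN after `threshold` consecutive failures,
// OPEN -> HALF_OPEN once `resetMs` has passed, then CLOSED on success or OPEN on failure.
class SimpleBreaker {
  constructor(fn, { threshold = 3, resetMs = 1000, now = Date.now } = {}) {
    this.fn = fn;
    this.threshold = threshold;
    this.resetMs = resetMs;
    this.now = now;       // injectable clock, which keeps the class easy to test
    this.failures = 0;
    this.state = 'CLOSED';
    this.openedAt = 0;
  }

  async fire(...args) {
    if (this.state === 'OPEN') {
      if (this.now() - this.openedAt < this.resetMs) {
        throw new Error('circuit open'); // fail fast, do not call the dependency
      }
      this.state = 'HALF_OPEN'; // cool-down elapsed: let a trial call through
    }
    try {
      const result = await this.fn(...args);
      this.failures = 0;
      this.state = 'CLOSED';
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.state === 'HALF_OPEN' || this.failures >= this.threshold) {
        this.state = 'OPEN';
        this.openedAt = this.now();
      }
      throw err;
    }
  }
}

// Usage with a hypothetical client function:
// const inventory = new SimpleBreaker(callInventoryService, { threshold: 5, resetMs: 30000 });
```

While the circuit is open, `fire` never invokes the wrapped function, which is what gives the failing service room to recover.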

### When Microservices Make Sense

Honestly: not at the start. Most successful microservices systems were extracted from a working monolith, not designed as microservices from day one. I have seen teams spend more hours chasing network timeouts across twelve services than they would have spent maintaining a well-structured monolith for the same product.

Microservices pay off when:
- Multiple teams need to deploy independently without coordinating releases
- Different services have wildly different scaling needs (user auth vs. video transcoding)
- You want technology flexibility: ML pipeline in Python, core API in Go, frontend in Node
- The monolith has genuinely become hard to change and slow to build

They add overhead when the team is small, the product is still finding its shape, or nobody on the team has experience running distributed systems in production.

### Common Mistakes

**Mistake 1: Sharing a database between services**

```javascript
// WRONG: Two services pointing at the same database
// UserService reads directly from the orders table - OrderService's domain

const orders = await db.query(
  'SELECT * FROM orders WHERE user_id = ?', [userId]
);
// UserService now depends on OrderService's schema
// Any migration in OrderService can break UserService silently
```

**Mistake 2: Long synchronous call chains**

```javascript
// WRONG: the order request blocks on three services, one after another
const user    = await userService.getUser(userId);            // 100ms
const product = await productService.getProduct(productId);   // 100ms
const price   = await pricingService.getPrice(productId);     // 100ms
// If pricingService is down, the entire order request fails
// Latency adds up: 300ms minimum, and that is the happy path
```

Where the caller does not need an immediate answer, publish an event to a message queue instead of chaining synchronous calls.
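
When the lookups must stay synchronous but do not depend on each other, another option is to start them together and decide which ones are allowed to fail. The sketch below assumes hypothetical service clients simulated with 100 ms timers. `loadOrderContext` and the choice to treat a missing price as non-fatal are illustrative, not a rule.

```javascript
// Hypothetical service clients, each simulated with a 100 ms delay.
const delay = (ms, value) => new Promise((resolve) => setTimeout(() => resolve(value), ms));
const userService    = { getUser:    (id) => delay(100, { id, name: 'Ada' }) };
const productService = { getProduct: (id) => delay(100, { id, title: 'Keyboard' }) };
const pricingService = { getPrice:   ()   => delay(100).then(() => { throw new Error('pricing down'); }) };

async function loadOrderContext(userId, productId) {
  // The three lookups are independent, so start them at the same time.
  const [user, product, price] = await Promise.allSettled([
    userService.getUser(userId),
    productService.getProduct(productId),
    pricingService.getPrice(productId),
  ]);

  // User and product are required; a missing price degrades instead of failing.
  if (user.status === 'rejected') throw user.reason;
  if (product.status === 'rejected') throw product.reason;
  return {
    user: user.value,
    product: product.value,
    price: price.status === 'fulfilled' ? price.value : null,
  };
}

loadOrderContext('u-1', 'p-1').then((ctx) => console.log(ctx.price)); // null
```

In this simulation the request waits for the slowest call rather than the sum of all three, and the pricing outage no longer fails the whole request.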

**Mistake 3: No API versioning**

When you update Service B's API without versioning, every caller breaks on the next deploy. Running `/v1/orders` alongside `/v2/orders` lets teams migrate at their own pace without synchronized releases.
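
As a rough sketch of what side-by-side versions mean, the dispatcher below serves two response shapes from the same stored order. It is framework-free so it runs on its own. `orders`, `handlers`, `handle`, and both response shapes are hypothetical.

```javascript
// Hypothetical stored orders, keyed by ID.
const orders = new Map([['o-1', { id: 'o-1', userId: 'u-1', totalCents: 1999 }]]);

// Each API version keeps its own response shape.
const handlers = {
  v1: (o) => ({ orderId: o.id, total: o.totalCents / 100 }),                    // legacy shape
  v2: (o) => ({ id: o.id, total: { amount: o.totalCents, currency: 'USD' } }),  // newer shape
};

// Dispatch on the version prefix of the path, e.g. "/v2/orders/o-1".
function handle(path) {
  const match = /^\/(v\d+)\/orders\/([^/]+)$/.exec(path);
  if (!match) return { status: 404 };
  const [, version, orderId] = match;
  const toResponse = handlers[version];
  const order = orders.get(orderId);
  if (!toResponse || !order) return { status: 404 };
  return { status: 200, body: toResponse(order) };
}

console.log(handle('/v1/orders/o-1').body); // { orderId: 'o-1', total: 19.99 }
console.log(handle('/v2/orders/o-1').body); // { id: 'o-1', total: { amount: 1999, currency: 'USD' } }
```

Existing callers keep using `/v1` unchanged while new callers adopt `/v2`; the old handler is removed only after its traffic drops to zero.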

**Mistake 4: Skipping distributed tracing**

With ten services, debugging one slow request means chasing it across all of them. Set up Jaeger, Zipkin, or Datadog APM from day one. Adding it retroactively after incidents is much more painful.
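
Tracing tools stitch a request together by passing a shared trace ID on every hop. The sketch below shows the propagation step using the W3C Trace Context `traceparent` header format (`version-traceId-parentId-flags`). `nextTraceparent` is a hypothetical helper; in practice a tracing SDK injects and extracts this header for you.

```javascript
const { randomBytes } = require('node:crypto');

// W3C Trace Context header: version-traceId-parentId-flags
const TRACEPARENT = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Build the header for an outgoing call: keep the incoming trace ID, mint a new span ID.
function nextTraceparent(incoming) {
  const match = TRACEPARENT.exec(incoming ?? '');
  const traceId = match ? match[1] : randomBytes(16).toString('hex'); // no valid header: start a new trace
  const flags = match ? match[3] : '01';                               // 01 = sampled
  const spanId = randomBytes(8).toString('hex');
  return `00-${traceId}-${spanId}-${flags}`;
}

// Order Service receives a request and calls Inventory Service within the same trace.
const incoming = '00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01';
const outgoing = nextTraceparent(incoming);
console.log(outgoing.split('-')[1]); // 4bf92f3577b34da6a3ce929d0e0e4736
```

Because every service forwards the same trace ID, the tracing backend can group spans from all services into one timeline for that request.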

### Real-world Usage

- **Netflix:** over 700 microservices. One of the earliest at this scale. Built Hystrix, Eureka, and Ribbon because those tools did not exist yet.
- **Amazon:** early 2000s internal mandate by Jeff Bezos that all teams expose data through APIs. That organizational constraint is widely credited with laying the groundwork for AWS.
- **Uber:** moved from a monolith to microservices during global scaling. Later ran into coordination problems with hundreds of services and shifted toward a domain-oriented model.
- **Kubernetes + Docker:** the standard deployment layer today. Each service runs in a container; Kubernetes handles scaling, health checks, and service discovery.

### Follow-up Questions

**Q:** What is the Saga pattern and when do you use it?
**A:** Saga is a sequence of local transactions where each step publishes an event that triggers the next service. If a step fails, compensating transactions undo the previous steps. You use it when a business operation spans multiple services and needs rollback behavior, because a single ACID transaction cannot span their separate databases.

**Q:** How is microservices different from SOA?
**A:** SOA typically relies on a central Enterprise Service Bus and heavyweight SOAP protocols. Microservices use lightweight HTTP or gRPC APIs, avoid a central broker, and each service is smaller in scope with its own database. SOA services were often large; microservices are meant to be small enough for one team to fully own.

**Q:** Can two services share one database to keep things simple?
**A:** They can, but then they are not truly independent. Any schema change requires a coordinated deployment. You lose the ability to scale storage per service, and a slow query from one service degrades the other. The database-per-service rule exists to prevent this coupling from forming in the first place.

**Q:** How do you handle authentication across services?
**A:** The API Gateway validates JWTs once and passes user identity downstream via request headers. Each service trusts the gateway and does not repeat authentication logic, which is only safe when services cannot be reached except through the gateway. Some teams also run a dedicated Auth Service that issues short-lived tokens for service-to-service calls.

**Q:** A monolith is working fine and the team has 5 developers. Should you migrate to microservices?
**A:** No. The operational costs are real: separate CI/CD pipelines, distributed tracing, eventual consistency, and inter-service failure handling. These costs only pay off when monolith coordination becomes the actual bottleneck, not as a proactive architectural choice.
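
For the authentication answer above, the gateway-side check can be sketched with Node's built-in `crypto` module. This is an illustration of the mechanism for HS256 tokens only. `sign`, `authenticate`, and the `x-user-id` header name are hypothetical, and a real gateway should rely on a vetted JWT library rather than hand-rolled verification.

```javascript
const { createHmac, timingSafeEqual } = require('node:crypto');

const b64url = (text) => Buffer.from(text).toString('base64url');

// Issue an HS256 JWT (included only so the sketch is self-contained).
function sign(payload, secret) {
  const head = b64url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }));
  const body = b64url(JSON.stringify(payload));
  const sig = createHmac('sha256', secret).update(`${head}.${body}`).digest('base64url');
  return `${head}.${body}.${sig}`;
}

// Gateway side: verify signature and expiry, then build the headers forwarded downstream.
function authenticate(token, secret, nowSeconds = Math.floor(Date.now() / 1000)) {
  const [head, body, sig] = token.split('.');
  if (!head || !body || !sig) return null;

  // Check the signature first, so nothing from an unverified token is parsed.
  const expected = createHmac('sha256', secret).update(`${head}.${body}`).digest();
  const actual = Buffer.from(sig, 'base64url');
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) return null;

  const header = JSON.parse(Buffer.from(head, 'base64url').toString());
  if (header.alg !== 'HS256') return null;
  const claims = JSON.parse(Buffer.from(body, 'base64url').toString());
  if (typeof claims.exp === 'number' && claims.exp <= nowSeconds) return null;

  return { 'x-user-id': String(claims.sub) }; // identity passed to internal services
}

const token = sign({ sub: 'u-1', exp: 2000 }, 'gateway-secret');
console.log(authenticate(token, 'gateway-secret', 1000)); // { 'x-user-id': 'u-1' }
console.log(authenticate(token, 'wrong-secret', 1000));   // null
```

Downstream services read the forwarded header instead of parsing the token again, which is why they must not be reachable by clients that bypass the gateway.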

## Examples

### Splitting a Feature Out of a Monolith

```javascript
// BEFORE: Everything in one Express app
app.post('/orders', async (req, res) => {
  const user    = await db.users.findById(req.body.userId);
  const product = await db.products.findById(req.body.productId);
  const order   = await db.orders.create({ userId: user.id, productId: product.id });
  await db.inventory.decrement(req.body.productId);
  res.json(order);
});

// AFTER: Order Service only handles orders
// Inventory lives in a completely separate service
app.post('/orders', async (req, res) => {
  const { userId, productId, quantity } = req.body;

  // Fetch user from User Service via HTTP
  const userRes = await fetch(`http://user-service/users/${userId}`);
  if (!userRes.ok) {
    // fetch() resolves on 4xx/5xx responses, so check the status explicitly
    const status = userRes.status === 404 ? 404 : 502;
    return res.status(status).json({ error: 'Could not load user' });
  }
  const user = await userRes.json();

  // Write to OWN database only
  const order = await orderDB.create({ userId: user.id, productId, quantity });

  // Publish event - Inventory Service handles the decrement
  await kafka.send('order.created', { orderId: order.id, productId, quantity });

  res.json(order);
});
```

After the split, the Order team deploys independently. No coordination with Inventory or Product teams to ship a change. Each service can also be scaled separately: if order volume spikes, you scale Order Service without touching anything else.

### Circuit Breaker with Fallback

```javascript
const CircuitBreaker = require('opossum');

async function callPaymentService(payload) {
  const res = await fetch('http://payment-service/pay', {
    method: 'POST',
    body: JSON.stringify(payload),
    headers: { 'Content-Type': 'application/json' }
  });
  // fetch() does not reject on HTTP errors; throw so the breaker counts them as failures
  if (!res.ok) {
    throw new Error(`Payment Service responded with ${res.status}`);
  }
  return res.json();
}

const paymentBreaker = new CircuitBreaker(callPaymentService, {
  timeout: 3000,                 // fail if response takes more than 3s
  errorThresholdPercentage: 50,  // open circuit after 50% failure rate
  resetTimeout: 30000            // try again after 30s
});

paymentBreaker.fallback(() => ({
  status: 'pending',
  message: 'Payment queued for retry'
}));

app.post('/checkout', async (req, res) => {
  const result = await paymentBreaker.fire(req.body);
  res.json(result);
});
```

When Payment Service goes down, the circuit opens once the failure threshold is crossed. Subsequent requests get the fallback immediately instead of each waiting 3 seconds to time out. Checkout keeps responding, but the fallback shown only reports a pending status; something still has to enqueue the payment so it can be retried later. This is the problem Netflix built Hystrix to solve.

### Saga Pattern for Cross-Service Transactions

```javascript
// Order Service - starts the saga
async function createOrderSaga({ userId, productId, quantity, orderId }) {
  // Step 1: request inventory reservation
  await kafka.send('inventory.reserve', { productId, quantity, orderId });
  // Step 2 fires when 'inventory.reserved' event arrives (Inventory Service)
  // Step 3 fires when 'payment.charged' event arrives (Payment Service)
}

// Inventory Service reacts to reservation request
kafka.on('inventory.reserve', async ({ productId, quantity, orderId }) => {
  const available = await inventoryDB.checkStock(productId, quantity);

  if (available) {
    await inventoryDB.reserve(productId, quantity);
    await kafka.send('inventory.reserved', { orderId });
  } else {
    // Failure event - Order Service compensates by cancelling the order
    await kafka.send('inventory.reserve.failed', { orderId, reason: 'Out of stock' });
  }
});

// Order Service listens for failure and compensates
kafka.on('inventory.reserve.failed', async ({ orderId, reason }) => {
  await orderDB.cancel(orderId, reason);
  await kafka.send('order.cancelled', { orderId });
});
```

Each service does its local work and publishes an event. The saga handles the rollback chain if something fails. No distributed lock, no two-phase commit, no shared transaction coordinator.
