Resilience

Circuit Breaker pattern: a shield against cascading failure

The Circuit Breaker is the electrical breaker of a resilient architecture: a CLOSED/OPEN/HALF-OPEN state machine that stops failures from cascading between services.

✍️ Antoine Coulon
Tags: resilience, circuit-breaker, distributed-systems, fault-tolerance, nodejs

When an external service fails, the real risk isn’t the outage itself; it’s that the outage drags your entire system down with it. A dependency that stops responding means connections piling up, threads blocked, an event loop saturated and, step by step, an entire application collapsing. The Circuit Breaker is the pattern that prevents this domino effect.

This third installment of the resilience series naturally extends the previous two: if your retries (episode 1) and your timeouts (episode 2) keep firing in a loop without ever succeeding, that’s a sign that the service on the other end is probably down, and that it’s time to stop knocking on a closed door.

The principle: a circuit breaker for your services

You know the electrical circuit breaker: when a device fails, it cuts the circuit to protect the installation. The Circuit Breaker applies exactly this idea to software. A faulty device on the electrical grid is the equivalent of a failing third-party service in your distributed architecture: if you don’t isolate it, it puts everything else at risk.

Concretely, the Circuit Breaker sits as a proxy between your system and the external service. It monitors the calls that pass through it and decides, based on their results, whether to let the next ones through or reject them immediately. When a service starts overheating, you cut it off before everything goes up in smoke.

A three-position state machine

The heart of the pattern is a state machine. At any given moment, the breaker is in one of these three states:

State | Behavior
CLOSED | Everything works normally: requests pass through to the service.
OPEN | The service is considered down: requests are blocked immediately, without even attempting the call.
HALF-OPEN | A transition state: a few calls are allowed through, in a controlled and limited way, to test whether the service is back.
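To get a feel for why blocking immediately matters, here is a back-of-the-envelope sketch based on Little’s law (calls in flight = arrival rate × time each call is held). The traffic figures and the pendingCalls helper are hypothetical, chosen only for illustration.

```typescript
// Little's law: items in a system = arrival rate x time each item spends in it.
// Both inputs are assumptions picked for the example, not measurements.
function pendingCalls(requestsPerSecond: number, msHeldPerCall: number): number {
  return (requestsPerSecond * msHeldPerCall) / 1000;
}

// Dependency is down and every call waits for a full 10 s timeout.
const withoutBreaker = pendingCalls(100, 10_000);

// Breaker is OPEN and rejects each call in roughly 1 ms (assumed).
const withOpenBreaker = pendingCalls(100, 1);

console.log(withoutBreaker); // 1000
console.log(withOpenBreaker); // 0.1
```

With these assumed numbers, a dependency that hangs until a 10-second timeout keeps about a thousand calls pending at any moment, while an OPEN breaker keeps that figure well below one.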

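This mechanism brings three direct benefits: it protects your own resources, because calls fail fast instead of waiting on a service that is down; it relieves the struggling service, which stops receiving traffic while the circuit is OPEN; and it allows a gradual recovery, since HALF-OPEN only lets a limited number of calls through.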

Automatic transitions

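The whole point of the pattern is that these state changes are handled for you, based on thresholds you define. From CLOSED, the breaker trips to OPEN once failures cross the configured threshold (for example, a number of consecutive failures). After a configured delay in OPEN, it moves to HALF-OPEN and lets trial calls through. If those succeed, it returns to CLOSED; if they fail, it goes back to OPEN and the delay starts again.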
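To make the transitions concrete, here is a minimal sketch of such a state machine. It is a teaching aid under stated assumptions, not how any particular library implements it: SimpleCircuitBreaker, CircuitOpenError and the injectable clock are hypothetical names, the threshold counts consecutive failures only, and HALF-OPEN does not limit how many trial calls run concurrently.

```typescript
type State = "CLOSED" | "OPEN" | "HALF-OPEN";

// Hypothetical error used to signal a call rejected without reaching the service.
class CircuitOpenError extends Error {
  constructor() {
    super("Circuit is open: call rejected without reaching the service");
    this.name = "CircuitOpenError";
  }
}

class SimpleCircuitBreaker {
  private currentState: State = "CLOSED";
  private consecutiveFailures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly halfOpenAfterMs: number;
  private readonly now: () => number;

  constructor(
    failureThreshold: number,
    halfOpenAfterMs: number,
    now: () => number = () => Date.now(), // injectable clock, handy for tests
  ) {
    this.failureThreshold = failureThreshold;
    this.halfOpenAfterMs = halfOpenAfterMs;
    this.now = now;
  }

  get state(): State {
    return this.currentState;
  }

  async execute<T>(call: () => Promise<T>): Promise<T> {
    if (this.currentState === "OPEN") {
      // OPEN: fail fast until the recovery delay has elapsed.
      if (this.now() - this.openedAt < this.halfOpenAfterMs) {
        throw new CircuitOpenError();
      }
      // OPEN -> HALF-OPEN: let this call through as a trial.
      this.currentState = "HALF-OPEN";
    }

    try {
      const result = await call();
      // Any success resets the failure count and (re)closes the circuit.
      this.consecutiveFailures = 0;
      this.currentState = "CLOSED";
      return result;
    } catch (error) {
      this.consecutiveFailures += 1;
      const trialFailed = this.currentState === "HALF-OPEN";
      // CLOSED -> OPEN on threshold, HALF-OPEN -> OPEN on a failed trial.
      if (trialFailed || this.consecutiveFailures >= this.failureThreshold) {
        this.currentState = "OPEN";
        this.openedAt = this.now();
      }
      throw error;
    }
  }
}
```

A caller would create one breaker per dependency, for instance new SimpleCircuitBreaker(5, 10_000), and route every call to that dependency through execute so the breaker sees all outcomes.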

In practice with Node.js

There’s no need to reimplement this state machine yourself: the pattern is very widely supported by mature libraries. The example below uses cockatiel to protect a database call. We configure a breaker that opens after 5 consecutive failures and stays open for 10 seconds before attempting a recovery.

/**
 * The Circuit Breaker pattern protects our application
 * against the domino effect following the failure of an external service.
 */
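import createKnex from "knex";
import {
  ConsecutiveBreaker,
  ExponentialBackoff,
  retry,
  handleAll,
  circuitBreaker,
  wrap,
} from "cockatiel";

// Assumption: the snippet relies on a `knex` instance whose setup is not shown.
// The client and connection string below are placeholders to adapt.
const knex = createKnex({
  client: "pg",
  connection: process.env.DATABASE_URL,
});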

/**
 * Here we define a Circuit Breaker that stops calling the executed
 * function for 10 seconds if it fails 5 times consecutively.
 * There are other types of Breakers, which let you refine
 * and adjust the resilience strategy.
 */
const breaker = circuitBreaker(handleAll, {
  halfOpenAfter: 10 * 1000,
  breaker: new ConsecutiveBreaker(5),
});

const dbCallWithBreakerAndTimeout = () =>
  breaker.execute(() =>
    knex.select().from("books").timeout(10_000, { cancel: true })
  );

dbCallWithBreakerAndTimeout()
  .then(() => {
    // [...]
  })
  .catch(() => {
    // [...]
  });

/**
 * We can even combine the Circuit Breaker with a Retry + Timeout pattern
 * to make the initial call more resilient before triggering
 * the Circuit Breaker's behavior.
 */
const retryPolicy = retry(handleAll, {
  maxAttempts: 3,
  backoff: new ExponentialBackoff(),
});

const breakerWithRetry = wrap(retryPolicy, breaker);

const fullyResilientDbCall = () =>
  breakerWithRetry.execute(() =>
    knex.select().from("books").timeout(10_000, { cancel: true })
  );

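The value of that last block is worth highlighting: the Circuit Breaker doesn’t oppose the patterns covered earlier, it complements them. Here wrap composes the retry policy around the breaker, while the timeout comes from knex’s own .timeout() on the query. Each call gets several attempts before the caller sees a failure, and every failed attempt counts toward the breaker’s threshold, so the circuit opens only when the service remains durably unavailable: exactly the signal we want to detect. One detail to keep in mind: because the retry policy uses handleAll, it also retries the rejections the breaker emits while OPEN; those attempts fail fast and never reach the database.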
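The effect of composition order can be illustrated without any library. The sketch below assumes, as I understand cockatiel’s wrap to behave, that the first policy listed is the outermost. All names here (compose, simpleRetry, failureCounter) are hypothetical stand-ins: the retry has no backoff, and the inner policy only counts failures instead of opening a circuit.

```typescript
type Task<T> = () => Promise<T>;
type Policy = <T>(task: Task<T>) => Promise<T>;

// The first policy is the outermost wrapper; the last one sits closest to the call.
function compose(...policies: Policy[]): Policy {
  return <T>(task: Task<T>): Promise<T> =>
    policies.reduceRight<Task<T>>((inner, policy) => () => policy(inner), task)();
}

// Hypothetical retry: re-runs the task up to `retries` extra times, without backoff.
function simpleRetry(retries: number): Policy {
  return async <T>(task: Task<T>): Promise<T> => {
    let lastError: unknown;
    for (let attempt = 0; attempt <= retries; attempt++) {
      try {
        return await task();
      } catch (error) {
        lastError = error;
      }
    }
    throw lastError;
  };
}

// Stand-in for a breaker: it only counts the failures that reach it.
function failureCounter(): { policy: Policy; failures: () => number } {
  let seen = 0;
  const policy: Policy = async <T>(task: Task<T>): Promise<T> => {
    try {
      return await task();
    } catch (error) {
      seen += 1;
      throw error;
    }
  };
  return { policy, failures: () => seen };
}

const counter = failureCounter();
const resilientCall = compose(simpleRetry(2), counter.policy);

const alwaysFails: Task<string> = async () => {
  throw new Error("service down");
};

resilientCall(alwaysFails).catch(() => {
  // The caller sees one failure; the inner policy saw every attempt.
  console.log(counter.failures()); // 3
});
```

With the retry on the outside, a single failed logical call already contributes several failures to the inner policy, which is why a breaker placed there can trip after fewer caller-visible errors than its threshold suggests.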

A pattern available everywhere

Good news: this pattern is so widespread that there’s a battle-tested implementation in practically every ecosystem, which makes it easy to adopt.

Conclusion

The Circuit Breaker doesn’t repair a failing service; that’s not its job. It does something more valuable: it prevents that failure from becoming yours. By isolating a failing dependency, it protects your resources, relieves the struggling service, and orchestrates a gradual recovery once the storm has passed.

Combined with the retries and timeouts from the previous episodes, it forms the foundation of a coherent resilience strategy. Take the initiative: better to trip a circuit on purpose than to watch the whole system go up in smoke.