A single runaway feature can be enough to bring your whole application to its knees. An export endpoint consumes the entire connection pool, a batch job monopolizes every available thread, or an external call drags on and leaves a queue of pending requests in its wake. Suddenly the entire application stops responding, even though only one component was actually in trouble. The problem isn’t the local failure itself; it’s that nothing stops the failure from spreading. The Bulkhead pattern exists precisely to put that containment in place.
This fifth and final installment of the resilience series closes out the arsenal. Where the previous patterns reacted to a failure (retrying, bounding a wait, breaking a circuit, throttling a rate), the Bulkhead takes a preventive stance: it partitions resources up front so that, the day something gives out, the damage stays contained.
The principle: watertight compartments
The term comes from the maritime world. On a ship, a bulkhead is a partition that divides the hull into watertight compartments. If one of them is breached and fills with water (or catches fire), the incident stays contained within that compartment; the others stay dry and the ship keeps floating. Without those bulkheads, a single breach could flood the entire hull and sink the vessel.
Transposed to software, the idea is exactly the same: limit the resources allocated to each functional component, so as to cap the maximum impact a degradation or heavy load can cause. Rather than letting every component draw without limit from a shared reservoir of threads, connections, or memory, you give each one its own quota. A component that saturates can then exhaust only its share, never that of the others.
This principle is more familiar than it might seem. You’ll find it, often unnamed, in mechanisms that are used every day:
- thread pools: a bounded number of threads dedicated to a category of tasks.
- connection pools: a quota of connections reserved for a given database or service.
- semaphores: a primitive that caps the number of concurrent operations.
- bounded-capacity queues: a queue that absorbs spikes up to a threshold, then rejects the overflow rather than collapsing.
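To show how little machinery these primitives need, here is a minimal sketch of a counting semaphore in plain TypeScript. The `Semaphore` class and its `run` method are hypothetical names chosen for illustration; they do not come from any library, and this version lets waiters queue without limit.

```typescript
// Minimal counting semaphore: at most `permits` tasks run at once.
// Hypothetical helper for illustration, not taken from any library.
class Semaphore {
  private available: number;
  private readonly waiters: Array<() => void> = [];

  constructor(permits: number) {
    this.available = permits;
  }

  private async acquire(): Promise<void> {
    if (this.available > 0) {
      this.available -= 1;
      return;
    }
    // No permit left: park until release() hands one over.
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  private release(): void {
    const next = this.waiters.shift();
    if (next) {
      next(); // hand the permit directly to the oldest waiter (FIFO)
    } else {
      this.available += 1;
    }
  }

  // Run `task` once a permit is free, and always give the permit back.
  async run<T>(task: () => Promise<T>): Promise<T> {
    await this.acquire();
    try {
      return await task();
    } finally {
      this.release();
    }
  }
}

// Usage: five jobs, at most two in flight at any time.
const limiter = new Semaphore(2);
const jobs = [1, 2, 3, 4, 5].map((id) => limiter.run(async () => id * 2));
Promise.all(jobs).then((results) => console.log(results)); // resolves to [2, 4, 6, 8, 10]
```

A thread pool or connection pool follows the same shape: a fixed budget, an acquire step, and a guaranteed release.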
What the Bulkhead gives you
Partitioning isn’t just “putting a limit on things.” The pattern produces several concrete effects that reinforce one another.
Failure isolation
This is the founding benefit. By partitioning functional components, you guarantee that a local failure stays contained. If one of them goes down or degrades, the problem doesn’t spill over onto the rest of the system: the application as a whole keeps working despite one of its parts failing.
Controlled resource contention
By reserving resources dedicated to specific operations, you ensure that each one always has the minimum it needs. This is the safeguard against the classic scenario: a failing component that keeps grabbing connections and threads, request after request, until it starves all the others. With a dedicated quota, its appetite is bounded by construction.
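The quota idea can be made concrete with a small sketch. It assumes two hypothetical features, `export` and `checkout`, that split 30 slots between them; the `Compartment` class and the numbers are illustrative assumptions, not part of any library.

```typescript
// One compartment = one fixed budget of concurrent slots.
class Compartment {
  private inUse = 0;
  private readonly quota: number;

  constructor(quota: number) {
    this.quota = quota;
  }

  // Try to take a slot; returns false when this compartment is full.
  tryAcquire(): boolean {
    if (this.inUse >= this.quota) return false;
    this.inUse += 1;
    return true;
  }

  release(): void {
    if (this.inUse > 0) this.inUse -= 1;
  }

  get free(): number {
    return this.quota - this.inUse;
  }
}

// Hypothetical split of 30 slots between two features.
const compartments = {
  export: new Compartment(10),
  checkout: new Compartment(20),
};

// Simulate a runaway export feature asking for 100 slots at once.
let exportAdmitted = 0;
for (let i = 0; i < 100; i++) {
  if (compartments.export.tryAcquire()) exportAdmitted += 1;
}

console.log(exportAdmitted);             // 10: capped by its own quota
console.log(compartments.checkout.free); // 20: untouched by the export surge
```

With a single shared pool of 30, the same surge would have taken every slot and left checkout with nothing.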
Concurrency under control
The heart of the pattern is capping the number of simultaneous operations per component. That cap naturally prevents global overloads: even under heavy traffic, the system stays responsive because no single component can, on its own, blow up the total load.
Better fault tolerance
By compartmentalizing the application, you increase its ability to absorb errors without collapsing. The risk of a cascading effect, that chain reaction where one failure triggers another, drops markedly, and the system’s overall resilience improves as a result.
Above all, the Bulkhead doesn’t conflict with any of the patterns seen previously: it combines with all of them. Retry, Timeout, Circuit Breaker, Rate Limiting: each protects against a particular failure mode, and the Bulkhead adds a cross-cutting layer of isolation on top.
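As a rough illustration of that layering, the sketch below composes a retry policy around a bulkhead policy. Everything here (`Policy`, `retryPolicy`, `bulkheadPolicy`, `wrap`) is a hypothetical, dependency-free stand-in written for this example; a library such as cockatiel offers its own composition helpers, which are not reproduced here.

```typescript
type Task<T> = () => Promise<T>;
// A policy takes a task and runs it under some protection.
type Policy = <T>(task: Task<T>) => Promise<T>;

// Hypothetical retry policy: up to `attempts` tries, no backoff for brevity.
function retryPolicy(attempts: number): Policy {
  return async function <T>(task: Task<T>): Promise<T> {
    let lastError: unknown = new Error("no attempt was made");
    for (let i = 0; i < attempts; i++) {
      try {
        return await task();
      } catch (error) {
        lastError = error;
      }
    }
    throw lastError;
  };
}

// Hypothetical bulkhead policy: rejects once `limit` tasks are in flight.
function bulkheadPolicy(limit: number): Policy {
  let active = 0;
  return async function <T>(task: Task<T>): Promise<T> {
    if (active >= limit) throw new Error("bulkhead full");
    active += 1;
    try {
      return await task();
    } finally {
      active -= 1; // the slot is freed on success and on failure
    }
  };
}

// Compose policies left to right: the first one is the outermost wrapper.
function wrap(...policies: Policy[]): Policy {
  return function <T>(task: Task<T>): Promise<T> {
    const composed = policies.reduceRight<Task<T>>(
      (inner, policy) => () => policy(inner),
      task
    );
    return composed();
  };
}

// Retry sits outside the bulkhead, so each attempt takes and frees its own slot.
const resilient = wrap(retryPolicy(3), bulkheadPolicy(2));

let calls = 0;
const flaky = async (): Promise<string> => {
  calls += 1;
  if (calls < 3) throw new Error("transient failure");
  return "ok";
};

resilient(flaky).then((value) => console.log(value, calls)); // logs: ok 3
```

The ordering is a design choice: with the bulkhead on the inside, a retried call never holds a slot while it waits for its next attempt.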
In practice: partitioning with cockatiel
As with the Circuit Breaker, there’s no need to reimplement the mechanics by hand. The example below uses cockatiel to partition a networking service: we limit the number of simultaneous requests to 20, with a queue of 10 pending requests beyond that. Any call that would exceed both of those capacities is immediately rejected rather than queued indefinitely.
/**
 * The Bulkhead pattern lets you allocate specific resources
 * to certain operations, so that an overload of one operation
 * (feature) doesn't affect the others.
 */
import { bulkhead, BulkheadPolicy, BulkheadRejectedError } from "cockatiel";

/**
 * Here we define the component to partition. This is a simplistic case where
 * interactions with the component are limited to a single service, but the
 * Bulkhead could be shared across a module that groups together, for example,
 * several services that use the same component.
 */
class SomeNetworkingService {
  bulkheadPolicy: BulkheadPolicy;

  constructor() {
    /**
     * Creating a Bulkhead that limits the number of
     * simultaneous requests to 20, with 10 requests that can be
     * queued.
     */
    this.bulkheadPolicy = bulkhead(20, 10);
  }

  async call() {
    // Remaining capacity at the time of the call
    console.log("bulkhead state", {
      executionSlots: this.bulkheadPolicy.executionSlots,
      waitingSlots: this.bulkheadPolicy.queueSlots,
    });
    try {
      return await this.bulkheadPolicy.execute(() => {
        // network call
      });
    } catch (exception) {
      if (exception instanceof BulkheadRejectedError) {
        // The Bulkhead limit was exceeded, the call is rejected:
        // return a degraded response or trigger a fallback here
        return undefined;
      }
      // Any other error comes from the network call itself: let it propagate
      throw exception;
    }
  }
}
The value of this explicit rejection is worth highlighting: rather than letting requests pile up endlessly in an already saturated system, you fail fast and cleanly with a BulkheadRejectedError. It’s an actionable signal: you can return a degraded response, trigger a fallback, or simply let the client retry later.
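To illustrate what acting on that signal can look like, here is a dependency-free sketch. `SimpleBulkhead`, `BulkheadFullError`, and `handleExport` are hypothetical stand-ins written for this example so that it runs without any library; they only approximate the behavior described above (bounded executions, bounded queue, explicit rejection).

```typescript
// Hypothetical stand-in for a library bulkhead.
class BulkheadFullError extends Error {
  constructor() {
    super("bulkhead capacity and queue exceeded");
    this.name = "BulkheadFullError";
  }
}

class SimpleBulkhead {
  private running = 0;
  private readonly waiting: Array<() => void> = [];
  private readonly capacity: number;
  private readonly queueSize: number;

  constructor(capacity: number, queueSize: number) {
    this.capacity = capacity;
    this.queueSize = queueSize;
  }

  async execute<T>(task: () => Promise<T>): Promise<T> {
    if (this.running >= this.capacity) {
      // Fail fast when both the execution slots and the queue are full.
      if (this.waiting.length >= this.queueSize) throw new BulkheadFullError();
      await new Promise<void>((resolve) => this.waiting.push(resolve));
    } else {
      this.running += 1;
    }
    try {
      return await task();
    } finally {
      const next = this.waiting.shift();
      if (next) next(); // pass the slot straight to the oldest waiter
      else this.running -= 1;
    }
  }
}

// Checking the name keeps the test robust across compilation targets.
const isBulkheadFull = (error: unknown): boolean =>
  error instanceof Error && error.name === "BulkheadFullError";

// Illustrative sizing: one export at a time, no queue.
const exportBulkhead = new SimpleBulkhead(1, 0);

// Hypothetical handler: serve a degraded response instead of piling up work.
async function handleExport(generate: () => Promise<string>): Promise<string> {
  try {
    return await exportBulkhead.execute(generate);
  } catch (error) {
    if (isBulkheadFull(error)) return "export busy, please retry shortly";
    throw error; // anything else is a real failure and must surface
  }
}
```

The caller gets an answer immediately in both cases; only the quality of that answer degrades when the compartment is full.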
A variant with a semaphore and Effect
The same principle can be expressed at a lower level, starting from a semaphore, the synchronization primitive that caps the number of concurrent operations on a shared resource. The following example builds a small homemade Bulkhead with Effect, exposing several execution strategies: automatic queuing, immediate rejection if capacity is exceeded, or execution bounded by a timeout.
/**
 * The Bulkhead pattern (with Effect) can be implemented using a
 * Semaphore, which is a synchronization primitive that limits
 * the number of concurrent operations performed on shared resources.
 */
import { Duration, Effect, Option, pipe, Queue } from "effect";

const makeBulkhead = (config: { capacity: number; waitCapacity: number }) =>
  Effect.gen(function* () {
    const semaphore = yield* Effect.makeSemaphore(config.capacity);
    // We can use an in-memory "Queue" to manage operations
    // waiting when the Bulkhead capacity is reached.
    // Note: the queue is only declared here; the strategies below do not
    // use it yet, so waitCapacity is not enforced in this example.
    const queue = yield* Queue.bounded<Effect.Effect<any, any, any>>(
      config.waitCapacity
    );
    return {
      // with "withPermits", Effect automatically manages a wait queue
      // (FIFO) for tasks that are waiting to be executed
      // (maximum bulkhead capacity reached).
      executeAsap: <A, E, R>(task: Effect.Effect<A, E, R>) =>
        pipe(task, semaphore.withPermits(1)),
      // immediate rejection when no permit is available
      executeOrFail: <A, E, R>(task: Effect.Effect<A, E, R>) =>
        pipe(
          task,
          semaphore.withPermitsIfAvailable(1),
          Effect.flatMap(
            Option.match({
              onNone: () =>
                Effect.fail(new Error("Bulkhead capacity exceeded")),
              onSome: Effect.succeed,
            })
          )
        ),
      // the timeout covers both the wait for a permit and the execution
      executeAsapWithTimeout: <A, E, R>(
        task: Effect.Effect<A, E, R>,
        timeoutDuration: Duration.Duration
      ) => pipe(task, semaphore.withPermits(1), Effect.timeout(timeoutDuration)),
    };
  });

// Placeholder task (an assumption for this example): it simply sleeps
// for the given duration to simulate work.
const task = (duration: Duration.Duration) => Effect.sleep(duration);

const program = Effect.gen(function* () {
  const bulkhead = yield* makeBulkhead({ capacity: 10, waitCapacity: 20 });
  const task1 = bulkhead.executeOrFail(task(Duration.seconds(1)));
  const task2 = bulkhead.executeAsapWithTimeout(
    task(Duration.seconds(5)),
    Duration.seconds(2)
  );
  const task3 = bulkhead.executeAsap(task(Duration.seconds(5)));
  // "either" mode collects each outcome instead of failing on the first error
  yield* Effect.all([task1, task2, task3], {
    concurrency: "unbounded",
    mode: "either",
  });
});

Effect.runFork(program);
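The two approaches illustrate the same idea from two angles: a dedicated library that encapsulates all the mechanics, or a low-level primitive that you assemble yourself to fit your needs. In both cases, the target contract is the same: a bounded number of simultaneous executions, a capped wait queue, and explicit behavior when the limit is reached. One caveat on the Effect version: the bounded Queue is declared but not wired into the execution strategies, so callers waiting on withPermits are not capped; enforcing waitCapacity remains to be added before the two examples are truly equivalent.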
At distributed-system scale
So far, the partitioning has played out in-process, within a single application. But the Bulkhead changes scale without changing logic: it applies just as well to a distributed system, where the compartments become units of infrastructure:
- per-service access quotas: for example a cap of 50 req/s toward Service B, which prevents a single consumer from saturating that service.
- dedicated resource pools per destination, so that a slow service doesn’t drain the connections meant for the others.
- independent containers or pods in an orchestrator, which isolate failures down to the runtime level.
Yesterday’s software bulkhead becomes an infrastructure boundary, but the intent doesn’t move: contain the impact of a failure within its perimeter.
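The dedicated-pool-per-destination idea from the list above can be sketched on the client side as follows. This is a minimal illustration under stated assumptions: `DestinationPools` and the service names are hypothetical, and a real deployment would more likely rely on the connection pool settings of its HTTP client or on the orchestrator itself.

```typescript
// Hypothetical per-destination bulkhead: every downstream service gets
// its own budget of in-flight calls, so a slow one cannot drain the rest.
class DestinationPools {
  private readonly inFlight = new Map<string, number>();
  private readonly limitPerDestination: number;

  constructor(limitPerDestination: number) {
    this.limitPerDestination = limitPerDestination;
  }

  async call<T>(destination: string, request: () => Promise<T>): Promise<T> {
    const current = this.inFlight.get(destination) ?? 0;
    if (current >= this.limitPerDestination) {
      // Reject instead of queuing: the caller decides how to degrade.
      throw new Error(`no free slot for ${destination}`);
    }
    this.inFlight.set(destination, current + 1);
    try {
      return await request();
    } finally {
      this.inFlight.set(destination, (this.inFlight.get(destination) ?? 1) - 1);
    }
  }

  // Number of slots still available for a destination.
  free(destination: string): number {
    return this.limitPerDestination - (this.inFlight.get(destination) ?? 0);
  }
}

// Illustrative names: "service-b" is slow, "service-c" is healthy.
const pools = new DestinationPools(2);
const never = new Promise<string>(() => {}); // simulates a call that hangs

pools.call("service-b", () => never).catch(() => {});
pools.call("service-b", () => never).catch(() => {});

console.log(pools.free("service-b")); // 0: service-b's pool is exhausted
console.log(pools.free("service-c")); // 2: service-c's pool is untouched
```

Even with every call to the slow service stuck, requests to the healthy one still find free slots.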
Conclusion
The Bulkhead doesn’t try to prevent failures: it accepts that they happen and simply makes sure they don’t spread. By giving each component its own resource quota, it turns a potential global failure into a local, circumscribed incident. It’s ultimately the same wisdom as a ship’s bulkheads: you don’t promise that nothing will ever breach the hull, you guarantee that the breach won’t sink the whole boat.
This pattern closes out a series of five. Retry to try again, Timeout to bound the wait, Circuit Breaker to break the domino effect, Rate Limiting to protect the server, and Bulkhead to partition resources: none is sufficient on its own, but combined thoughtfully they give your applications the best chance of staying available, reliable, and accessible to your users, even in the worst situations. Resilience isn’t a patch you bolt on after the fact; it’s an intention you weave into the architecture, one pattern at a time.