Bulkhead Pattern Medium

The bulkhead pattern is inspired by the sectioned partitions (bulkheads) of a ship’s hull. If one section is breached, only that section floods — the rest of the ship stays afloat. In software, the pattern isolates elements of an application into pools so that if one fails, the others continue to function.

By partitioning resource access (e.g. connection pools, goroutine pools, or semaphores), the bulkhead pattern prevents a single failing component from consuming all resources and cascading into a system-wide outage.

Implementation

Below is a Bulkhead that limits concurrent access to a downstream service using a buffered channel as a semaphore.

package bulkhead

import (
	"errors"
	"time"
)

var (
	ErrBulkheadFull = errors.New("bulkhead capacity full")
)

// Bulkhead limits the number of concurrent calls to a function.
type Bulkhead struct {
	sem     chan struct{}
	timeout time.Duration
}

// New creates a Bulkhead with the given maximum concurrent capacity and a
// timeout for acquiring a slot.
func New(capacity int, timeout time.Duration) *Bulkhead {
	return &Bulkhead{
		sem:     make(chan struct{}, capacity),
		timeout: timeout,
	}
}

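// Execute runs fn if a slot becomes available within the configured timeout.
// If no slot frees up in time it returns ErrBulkheadFull without executing fn.
func (b *Bulkhead) Execute(fn func() error) error {
	// An explicit timer can be stopped as soon as a slot is acquired;
	// before Go 1.23, time.After kept its timer alive until it fired.
	timer := time.NewTimer(b.timeout)
	defer timer.Stop()

	select {
	case b.sem <- struct{}{}: // acquire a slot
		defer func() { <-b.sem }() // release it when fn returns
		return fn()
	case <-timer.C:
		return ErrBulkheadFull
	}
}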
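Callers that already carry a context.Context may prefer to bound the wait by the context's deadline or cancellation rather than by a fixed timeout. The following is a minimal sketch of that idea, not part of the package above: the names CtxBulkhead, NewCtx, ExecuteContext, and rejectedWhenFull are hypothetical and exist only for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// ErrBulkheadFull mirrors the sentinel error from the package above.
var ErrBulkheadFull = errors.New("bulkhead capacity full")

// CtxBulkhead is a hypothetical context-aware variant of Bulkhead.
type CtxBulkhead struct {
	sem chan struct{}
}

// NewCtx creates a CtxBulkhead with the given maximum concurrent capacity.
func NewCtx(capacity int) *CtxBulkhead {
	return &CtxBulkhead{sem: make(chan struct{}, capacity)}
}

// ExecuteContext runs fn once a slot is free. If ctx is cancelled or its
// deadline passes first, it returns ErrBulkheadFull wrapped with ctx.Err().
func (b *CtxBulkhead) ExecuteContext(ctx context.Context, fn func(context.Context) error) error {
	select {
	case b.sem <- struct{}{}:
		defer func() { <-b.sem }()
		return fn(ctx)
	case <-ctx.Done():
		return fmt.Errorf("%w: %v", ErrBulkheadFull, ctx.Err())
	}
}

// rejectedWhenFull occupies the only slot, then reports whether a second
// call with a short deadline is rejected with ErrBulkheadFull.
func rejectedWhenFull() bool {
	b := NewCtx(1)
	started := make(chan struct{})
	release := make(chan struct{})
	done := make(chan struct{})
	go func() {
		defer close(done)
		_ = b.ExecuteContext(context.Background(), func(context.Context) error {
			close(started)
			<-release // hold the slot until the check below has run
			return nil
		})
	}()
	<-started

	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancel()
	err := b.ExecuteContext(ctx, func(context.Context) error { return nil })

	close(release)
	<-done
	return errors.Is(err, ErrBulkheadFull)
}

func main() {
	fmt.Println("rejected when full:", rejectedWhenFull())
}
```

Because select chooses at random among ready cases, a context that is already cancelled does not guarantee rejection when a slot happens to be free; a caller that needs that guarantee could check ctx.Err() before calling.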

Usage

orderBulkhead := bulkhead.New(10, 1*time.Second)
paymentBulkhead := bulkhead.New(5, 1*time.Second)

// The order service is isolated from the payment service.
// If payments exhaust their 5 slots, orders can still proceed
// with their independent pool of 10.
err := orderBulkhead.Execute(func() error {
	return orderService.Place(order)
})
if errors.Is(err, bulkhead.ErrBulkheadFull) {
	log.Println("order service is at capacity, try again later")
}

err = paymentBulkhead.Execute(func() error {
	return paymentService.Charge(order)
})
if errors.Is(err, bulkhead.ErrBulkheadFull) {
	log.Println("payment service is at capacity, try again later")
}
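To make the isolation claim concrete, the sketch below saturates a payment bulkhead with calls that hang and then shows that an order call still goes through. It is a self-contained illustration under stated assumptions: the Bulkhead type is a compact stand-in for the one in the Implementation section, the 50 ms timeouts are arbitrary, and isolationDemo is a hypothetical helper that simulates the downstream services with channels.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var ErrBulkheadFull = errors.New("bulkhead capacity full")

// Bulkhead is a compact stand-in for the type from the Implementation section.
type Bulkhead struct {
	sem     chan struct{}
	timeout time.Duration
}

func New(capacity int, timeout time.Duration) *Bulkhead {
	return &Bulkhead{sem: make(chan struct{}, capacity), timeout: timeout}
}

func (b *Bulkhead) Execute(fn func() error) error {
	timer := time.NewTimer(b.timeout)
	defer timer.Stop()
	select {
	case b.sem <- struct{}{}:
		defer func() { <-b.sem }()
		return fn()
	case <-timer.C:
		return ErrBulkheadFull
	}
}

// isolationDemo fills all five payment slots with stuck calls, then issues
// one more payment call and one order call, and returns both errors.
func isolationDemo() (orderErr, paymentErr error) {
	orders := New(10, 50*time.Millisecond)
	payments := New(5, 50*time.Millisecond)

	release := make(chan struct{})
	started := make(chan struct{}, 5)
	finished := make(chan struct{}, 5)
	for i := 0; i < 5; i++ {
		go func() {
			_ = payments.Execute(func() error {
				started <- struct{}{}
				<-release // simulate a hung downstream
				return nil
			})
			finished <- struct{}{}
		}()
	}
	for i := 0; i < 5; i++ {
		<-started // wait until all five slots are held
	}

	paymentErr = payments.Execute(func() error { return nil }) // no slot free
	orderErr = orders.Execute(func() error { return nil })     // separate pool

	close(release)
	for i := 0; i < 5; i++ {
		<-finished
	}
	return orderErr, paymentErr
}

func main() {
	orderErr, paymentErr := isolationDemo()
	fmt.Println("order:", orderErr)
	fmt.Println("payment:", paymentErr)
}
```

The sixth payment call waits out its timeout and is rejected, while the order call acquires a slot from its own pool immediately.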

Rules of Thumb

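  • Size each bulkhead based on the downstream service’s capacity and expected latency. A reasonable starting point is Little’s Law: expected calls per second multiplied by expected latency in seconds gives the average number of calls in flight.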
  • Combine with the circuit breaker pattern: a bulkhead limits concurrency while a circuit breaker stops calls to an already-failing service.
  • Monitor bulkhead rejection rates — a consistently full bulkhead indicates the pool is undersized or the downstream is too slow.
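The sizing and monitoring rules can be sketched in code. The example below is illustrative only: SuggestCapacity applies Little's Law (average concurrency = arrival rate × latency) as a starting point rather than a definitive sizing method, and Metered is a hypothetical bulkhead variant that counts rejections with atomic counters. None of these names exist in the package above.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

var ErrBulkheadFull = errors.New("bulkhead capacity full")

// SuggestCapacity estimates the average number of in-flight calls as
// callsPerSecond * latency, rounded up. Integer math keeps the result exact.
func SuggestCapacity(callsPerSecond int, latency time.Duration) int {
	second := int64(time.Second)
	return int((int64(callsPerSecond)*int64(latency) + second - 1) / second)
}

// Metered is a hypothetical bulkhead that tracks total and rejected calls.
type Metered struct {
	sem      chan struct{}
	timeout  time.Duration
	calls    atomic.Int64
	rejected atomic.Int64
}

func NewMetered(capacity int, timeout time.Duration) *Metered {
	return &Metered{sem: make(chan struct{}, capacity), timeout: timeout}
}

// Execute behaves like Bulkhead.Execute and records the outcome.
func (m *Metered) Execute(fn func() error) error {
	m.calls.Add(1)
	timer := time.NewTimer(m.timeout)
	defer timer.Stop()
	select {
	case m.sem <- struct{}{}:
		defer func() { <-m.sem }()
		return fn()
	case <-timer.C:
		m.rejected.Add(1)
		return ErrBulkheadFull
	}
}

// RejectionRate returns rejected calls divided by total calls (0 if none).
func (m *Metered) RejectionRate() float64 {
	calls := m.calls.Load()
	if calls == 0 {
		return 0
	}
	return float64(m.rejected.Load()) / float64(calls)
}

// demoRejectionRate holds the single slot, lets one call be rejected, and
// returns the resulting rate: one rejection out of two calls.
func demoRejectionRate() float64 {
	m := NewMetered(1, 10*time.Millisecond)
	started := make(chan struct{})
	release := make(chan struct{})
	done := make(chan struct{})
	go func() {
		defer close(done)
		_ = m.Execute(func() error {
			close(started)
			<-release
			return nil
		})
	}()
	<-started
	_ = m.Execute(func() error { return nil }) // rejected: the slot is held
	close(release)
	<-done
	return m.RejectionRate()
}

func main() {
	// 50 calls/s at 200 ms each means about 10 calls in flight on average.
	fmt.Println("suggested capacity:", SuggestCapacity(50, 200*time.Millisecond))
	fmt.Println("rejection rate:", demoRejectionRate())
}
```

In practice the counters would feed whatever metrics system is already in use, and the suggested capacity would be adjusted for bursts and for what the downstream service can actually sustain.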