Graceful Degradation: Architectural Resilience

Graceful Degradation is the ability of a system to maintain its core mission-critical functions even when one or more non-essential components fail or are under extreme load.

1. The Fallback Hierarchy

When a dependency fails, the system must choose a secondary behavior:

Cached Fallback: Return the last known good value from a local cache (e.g., Redis).
- Example: If the Personalization service is down, show a cached list of "Popular Items" instead of a blank page.
Static Fallback: Return a hardcoded default.
- Example: Default to a standard $10.00 shipping fee if the Dynamic-Shipping-Estimator times out.
Stubbed Fallback: Return an empty set or a "Service Temporarily Unavailable" message for that specific UI widget.

2. Circuit Breakers (Fail-Fast)

To prevent a slow dependency from exhausting the entire system's thread pool, use a Circuit Breaker (e.g., Resilience4j or Istio).

Closed: Normal operation.
Open: After a threshold of errors (e.g., 50% failures), the circuit "opens." Subsequent calls fail immediately without hitting the backend.
Half-Open: After a cooldown (e.g., 30s), the system allows a few test requests to see if the dependency has recovered.

3. Load Shedding and Throttling

When throughput exceeds capacity, the system must prioritize traffic.

Priority Queues: Ensure "Checkout" requests are processed before "Add to Wishlist" requests.
Throttling: Return 429 Too Many Requests to non-critical clients (e.g., internal analytics crawlers) to save capacity for end-users.

4. Implementation with Feature Flags

Use Feature Flags (e.g., LaunchDarkly) to manually or automatically "kill" expensive features during a traffic spike.

Concrete Scenario: During a Black Friday surge, disable the "Related Products" recommendation engine to reduce database load by 30%, keeping the "Buy Now" button functional.

See Also:

Game Day Exercises — Testing degradation in production.
Incident Management — Managing the fallout of component loss.
Auto Scaling Strategies — Scaling to avoid degradation.