LoadBalancingStrategies
Load balancing is the distribution of network traffic across a pool of backend resources. It is implemented at either the Transport Layer (L4) or the Application Layer (L7).
L4 vs. L7 Load Balancing
| Feature | L4 (Transport) | L7 (Application) |
|---|---|---|
| **OSI Layer** | Layer 4 (TCP/UDP) | Layer 7 (HTTP/gRPC/TLS) |
| **Visibility** | IP, Port, Protocol | Path, Headers, Cookies, Body |
| **Performance** | High (No packet inspection) | Lower (Parsing overhead) |
| **Features** | Simple routing, NAT | Content-based routing, TLS termination |
| **Example** | AWS NLB, IPVS, HAProxy (TCP) | AWS ALB, Nginx, Envoy |
Selection Algorithms
1. **Round Robin:** Sequential assignment. Best for homogeneous backend capacity.
2. **Least Connections:** Assigns to the node with the fewest active sessions. Best for long-lived connections (e.g., WebSockets).
3. **Consistent Hashing:** Maps requests to nodes using a hash ring.
- **Math:** A request with key $k$ is assigned to node $n = \text{argmin}_{i} (\text{hash}(n_i) \geq \text{hash}(k))$.
- **Benefit:** Minimizes cache invalidation when nodes join/leave; only $1/N$ of keys are remapped.
Health Check Mechanics
A load balancer must proactively prune unhealthy nodes from its rotation.
- **Liveness:** Is the process running? (e.g., TCP port open).
- **Readiness:** Is the application ready to serve? (e.g., `/healthz` returns 200 after cache warm-up).
**Failure Thresholds:**
- `interval`: Time between probes (e.g., 5s).
- `unhealthy_threshold`: Consecutive failures before removal (e.g., 3).
- `healthy_threshold`: Consecutive successes before re-entry (e.g., 2).
Advanced Patterns
1. TLS Termination vs. Passthrough
- **Termination:** Decrypt at the LB. Reduces CPU load on backends and centralizes certificate management.
- **Passthrough:** Forward encrypted packets. Required for end-to-end encryption (e.g., HIPAA compliance).
2. Draining (Graceful Shutdown)
When a node is marked for removal, the LB stops sending *new* requests but allows *in-flight* requests to complete before closing the connection. Mandatory for zero-downtime deployments.
3. Sticky Sessions (Session Affinity)
Ensures a client is routed to the same backend for the duration of a session.
- **Mechanism:** Injected cookies (L7) or Client IP hashing (L4).
- **Anti-pattern:** Use shared state (Redis/DB) instead of sticky sessions to enable better horizontal scaling.
Common Failure Modes
- **Aggressive Probing:** Health checks consuming significant backend CPU/bandwidth.
- **Zombie Backends:** No health checks configured, leading to black-holed traffic when nodes crash.
- **Herding Effect:** When a node returns to the pool, the "Least Connections" algorithm floods it with traffic, causing an immediate re-failure. Use **Slow Start** ramps.