Eventual Consistency
In an eventually consistent system, if writes stop, all replicas eventually return the same value. While writes are still propagating, replicas may diverge.
Eventual consistency trades consistency guarantees for availability and performance. Used well, it powers high-scale systems. Used poorly, it leads to subtle bugs.
What "eventual" means
Eventual consistency does NOT mean:
- Replicas converge quickly
- Reads return the latest write
- Consistent ordering of writes
- Any specific bound on inconsistency
It means: if you stop writing, eventually replicas will agree.
In practice, "eventually" might be milliseconds, seconds, or longer, depending on the system, its load, and the state of the network.
Why eventual consistency
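CAP theorem: a system cannot guarantee consistency, availability, and partition tolerance all at once. It is often summarized as "choose two".
Distributed systems must tolerate partitions, so the real choice, while a partition lasts, is between availability and consistency.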
Eventually consistent systems choose A: stay available during partitions, reconcile after.
Strongly consistent systems choose C: become unavailable rather than diverge.
Both have valid use cases.
The consistency spectrum
Strong → weak:
Linearizability
Strongest. Operations appear to take effect at a single instant between invocation and completion.
If client A's write of X=1 completes before client B's read of X begins, and no other write intervenes, B sees X=1.
Sequential consistency
Operations appear in some sequential order, consistent with each client's order.
Less strict than linearizability: the order need not match real time.
Causal consistency
Causally related operations are seen in the same order by everyone; concurrent operations may be seen in different orders.
If A writes X then writes Y, all replicas see X before Y. But A's write of X and B's concurrent write of Z can be ordered differently on different replicas.
Read-your-writes
A client always sees its own writes.
Monotonic reads
Once a client reads value V, it never sees an earlier version.
Eventual consistency
Replicas converge if writes stop. No order or timing guarantees.
Session guarantees
Often combined with eventual consistency:
Read-your-writes
Within a session, see your own writes.
Monotonic reads
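Within a session, never go backward to an older version.
Monotonic writes
Within a session, your writes apply in the order you issued them.
Writes-follow-reads
A write issued after a read is ordered after the write that produced the value read, on every replica.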
These improve UX without strong consistency.
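A minimal sketch of how read-your-writes and monotonic reads can be layered on top of lagging replicas. It assumes every write carries an increasing version number and that the client keeps the highest version it has seen; the Replica and Session classes are hypothetical, not any particular database's API.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical replica: a value plus the version it has applied."""
    version: int = 0
    value: str = ""

    def apply(self, version: int, value: str) -> None:
        # Replication delivers writes; anything not newer than current state is ignored.
        if version > self.version:
            self.version, self.value = version, value


@dataclass
class Session:
    """Tracks the highest version this client has written or read."""
    min_version: int = 0

    def write(self, primary: Replica, value: str) -> None:
        primary.apply(primary.version + 1, value)
        # Read-your-writes: remember the version just written.
        self.min_version = max(self.min_version, primary.version)

    def read(self, replicas: list[Replica]) -> str:
        # Use the first replica that is at least as fresh as this session.
        for replica in replicas:
            if replica.version >= self.min_version:
                # Monotonic reads: later reads may not go below this version.
                self.min_version = replica.version
                return replica.value
        raise RuntimeError("no replica is fresh enough for this session")


primary, follower = Replica(), Replica()
session = Session()
session.write(primary, "new-bio")

# The follower has not replicated the write yet, so the session skips it.
value = session.read([follower, primary])
```

The cost is availability: if no replica has caught up, the read fails or must wait, which is the usual price of a session guarantee.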
Implementation patterns
Master-slave replication
One master accepts writes; slaves replicate asynchronously.
Reads from slaves are eventually consistent.
Multi-master replication
Multiple nodes accept writes. Conflict resolution required:
- Last-write-wins
- Vector clocks
- CRDTs
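Vector clocks do not resolve conflicts by themselves; they detect whether two versions are ordered or concurrent. A minimal sketch of that comparison, assuming clocks are dictionaries from node id to counter (the node names are made up):

```python
def compare(a: dict[str, int], b: dict[str, int]) -> str:
    """Compare two vector clocks keyed by node id."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a happened before b; b supersedes a
    if b_le_a:
        return "after"       # b happened before a; a supersedes b
    return "concurrent"      # neither dominates: a real conflict


base = {"n1": 1}
left = {"n1": 2}             # n1 wrote again on top of base
right = {"n1": 1, "n2": 1}   # n2 wrote on top of base

results = (compare(base, left), compare(left, right))
```

Only the "concurrent" case needs a resolution policy (merge, keep both as siblings, or fall back to last-write-wins).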
Quorum reads / writes
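R + W > N: read and write quorums overlap, so a read contacts at least one replica holding the latest acknowledged write. This alone is not linearizability; concurrent writes, sloppy quorums, and partial write failures can still expose stale or conflicting values.
R + W ≤ N: quorums may miss each other; reads are eventually consistent.
Tunable per request in Cassandra and Riak. DynamoDB offers a simpler choice between eventually consistent and strongly consistent reads.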
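The R + W > N condition is a statement about set overlap, which can be checked by brute force for small N. A sketch (the function name is illustrative):

```python
from itertools import combinations


def quorums_always_overlap(n: int, r: int, w: int) -> bool:
    """True if every read set of size r shares a replica with every write set of size w."""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(replicas, r)
        for write_set in combinations(replicas, w)
    )


overlap_322 = quorums_always_overlap(3, 2, 2)   # R + W = 4 > 3
overlap_311 = quorums_always_overlap(3, 1, 1)   # R + W = 2 <= 3
```

Overlap only guarantees that some contacted replica has the latest acknowledged write; the reader still needs versions or timestamps to pick it out.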
Anti-entropy
Replicas exchange states periodically; resolve differences.
Background sync ensures convergence.
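One anti-entropy round between two replicas might look like the sketch below. It assumes each key stores a (version, value) pair and that the higher version wins, which is only one possible resolution rule; real systems often compare Merkle trees first to avoid shipping the full state.

```python
Store = dict[str, tuple[int, str]]  # key -> (version, value)


def anti_entropy(a: Store, b: Store) -> None:
    """One sync round: both replicas end up with the newer entry for every key."""
    for key in set(a) | set(b):
        # Tuples compare by version first; the value only breaks ties deterministically.
        newest = max(a.get(key, (0, "")), b.get(key, (0, "")))
        a[key] = newest
        b[key] = newest


replica_a: Store = {"x": (2, "blue"), "y": (1, "cat")}
replica_b: Store = {"x": (1, "red"), "z": (1, "sun")}
anti_entropy(replica_a, replica_b)
```

After the round both stores hold the same three keys, with the version-2 value for "x".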
Gossip protocols
Information spreads through random pairwise exchanges. Used for membership, state propagation.
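A toy simulation of push gossip, to show why it spreads quickly. It assumes synchronous rounds in which every informed node contacts one uniformly random node (possibly itself or one already informed); real protocols add fan-out, pull, and failure detection.

```python
import random


def gossip_rounds(node_count: int, seed: int = 0) -> int:
    """Rounds until every node has heard a rumor that starts at node 0."""
    rng = random.Random(seed)
    informed = {0}
    rounds = 0
    while len(informed) < node_count:
        rounds += 1
        # Snapshot the informed set: nodes told this round start pushing next round.
        for _ in list(informed):
            informed.add(rng.randrange(node_count))
    return rounds


rounds_for_64 = gossip_rounds(64)
```

The informed set can at most double per round, so 64 nodes need at least 6 rounds; in expectation the total grows roughly logarithmically with the node count.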
What can go wrong
Read-after-write divergence
User updates profile, then reads it. May see old version.
UX nightmare without read-your-writes.
Lost updates
Two writers update concurrently; one overwrites the other.
The loss often goes unnoticed until much later.
Inconsistent views
Different users see different states. Tickets show "available" to one, "sold out" to another.
Reordering
Events appear in different orders across replicas. Causally related events may invert.
Stale reads at scale
In a system with many replicas, stale reads are common in normal operation.
Application design
Embrace it
Some operations are naturally eventually consistent: counters of likes, caches, search indices.
Mitigate it
For operations needing stronger guarantees:
- Pin reads to master
- Wait for the replica to catch up to your write before reading (read-your-writes)
- Use stronger consistency mode
- Use CRDTs to avoid conflicts
Idempotent operations
Operations that can be replayed safely. Critical for retry safety.
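A common way to make a non-idempotent operation safe to retry is to attach a client-chosen request id and apply each id at most once. A sketch with a hypothetical in-memory Account; a real system would persist the set of applied ids together with the balance.

```python
class Account:
    def __init__(self) -> None:
        self.balance = 0
        self._applied: set[str] = set()

    def deposit(self, request_id: str, amount: int) -> bool:
        """Apply the deposit once per request_id; return False on a replay."""
        if request_id in self._applied:
            return False
        self._applied.add(request_id)
        self.balance += amount
        return True


account = Account()
first = account.deposit("req-1", 50)
retry = account.deposit("req-1", 50)   # same request delivered again
```

The retry is acknowledged but has no effect, so the client can resend after a timeout without double-crediting.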
Compensation
Detect inconsistency post-hoc; reconcile. Common in financial systems.
Examples in real systems
DNS
Eventually consistent with TTLs. Updates take time to propagate.
Works because clients tolerate a stale record for the length of its TTL.
Cassandra
Tunable consistency. The replication factor (N) is set per keyspace; the consistency level (effectively R and W) is chosen per query.
DynamoDB
Eventually consistent reads by default; strongly consistent reads optional (more expensive).
MongoDB
Replica sets with primary-secondary replication. Reads can be tuned with read preference and read concern.
S3
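Originally read-after-write consistent only for new objects, and eventually consistent for overwrites and deletes. Since December 2020, strongly consistent for reads after all writes, overwrites, and deletes, and for list operations.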
Git
Distributed version control. Each clone is a replica. Merge required to reconcile.
CDNs
Cache propagation. Content updates take time to spread.
Search engines
Indices are eventually consistent with the source. Recent changes don't appear immediately.
When eventual consistency works
Append-only data
Logs, events, immutable data. No update conflicts.
Counters
If precision isn't critical (likes, views).
Caches
Stale acceptable; refresh periodically.
Search
Slight delay in indexing acceptable.
Notifications
Message ordering may be relaxed.
Replication for availability
Reading from replica acceptable; replica may be slightly stale.
When it doesn't work
Money
Financial transactions need strong consistency for balances.
Inventory
Selling more than you have is a real problem.
Authentication
Login state must be consistent, and revocation most of all: a logout, password change, or removed permission has to take effect on every node.
Coordination
Distributed locks, leader election require consensus.
Critical configuration
Nodes acting on different versions of a config value can make conflicting decisions.
Common failure patterns
Pretending it's strong consistency
Building the application as if reads were immediately consistent. The bugs stay hidden until load or lag grows.
Hidden eventual consistency
Library or framework behavior not understood. Surprises in production.
Insufficient session guarantees
User experience suffers without read-your-writes.
Using LWW where CRDT needed
Last-write-wins is simple but silently discards all but one of any set of concurrent updates.
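The difference shows up with something as small as a like counter. The sketch below merges two concurrent increments first as a last-write-wins register, then as a G-Counter (a grow-only counter CRDT that keeps one slot per replica). Timestamps and replica names are made up.

```python
def lww_merge(a: tuple[float, int], b: tuple[float, int]) -> tuple[float, int]:
    """Last-write-wins register: keep the (timestamp, value) pair with the later timestamp."""
    return max(a, b)


def gcounter_merge(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
    """G-Counter: per-replica counts, merged by element-wise max."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


# Both replicas start from a count of 10 and each records one new like.
lww_a = (100.0, 11)
lww_b = (100.5, 11)
lww_total = lww_merge(lww_a, lww_b)[1]        # one of the two likes is lost

g_a = {"a": 7, "b": 4}                        # replica a bumped its own slot from 6
g_b = {"a": 6, "b": 5}                        # replica b bumped its own slot from 4
g_total = sum(gcounter_merge(g_a, g_b).values())
```

The register ends at 11, the G-Counter at 12. The counter works because each replica only ever increments its own slot, so the element-wise max never throws away an increment.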
Ignoring monitoring
Replication lag is a key metric. Without monitoring, problems compound.
Wrong consistency model for the operation
Some operations need strong; others tolerate eventual. Mix appropriately.
Operating eventually consistent systems
Monitor replication lag
How far behind are replicas? Spike means trouble.
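A sketch of a lag check, assuming each node reports its position in the replication log as an integer (an offset or sequence number). The names and the threshold are illustrative; many systems report lag in seconds instead.

```python
def lagging_replicas(
    primary_position: int,
    replica_positions: dict[str, int],
    max_lag: int,
) -> dict[str, int]:
    """Return the replicas whose lag, in log entries, exceeds max_lag."""
    lags = {name: primary_position - pos for name, pos in replica_positions.items()}
    return {name: lag for name, lag in lags.items() if lag > max_lag}


alerts = lagging_replicas(
    primary_position=1_000,
    replica_positions={"replica-1": 998, "replica-2": 640},
    max_lag=100,
)
```

Here only replica-2 is flagged, with a lag of 360 entries.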
Test inconsistency
Inject delays in test environment. Verify application handles staleness.
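One way to inject delay deterministically is a test double that applies each write only after a fixed number of logical ticks. The DelayedReplica class below is a hypothetical sketch, not part of any test framework.

```python
from typing import Optional


class DelayedReplica:
    """Test double: a replicated write becomes visible only after `delay` ticks."""

    def __init__(self, delay: int) -> None:
        self.delay = delay
        self.now = 0
        self.value: Optional[str] = None
        self._pending: list[tuple[int, str]] = []   # (due_tick, value), in arrival order

    def replicate(self, value: str) -> None:
        self._pending.append((self.now + self.delay, value))

    def tick(self, steps: int = 1) -> None:
        self.now += steps
        # Apply every write whose due tick has been reached, oldest first.
        while self._pending and self._pending[0][0] <= self.now:
            self.value = self._pending.pop(0)[1]


replica = DelayedReplica(delay=3)
replica.replicate("v1")
replica.tick(2)
stale = replica.value     # still None: the write has not arrived
replica.tick(1)
fresh = replica.value     # the write is now visible
```

Application tests can then assert on behavior during the stale window, for example that the UI shows a pending state instead of the old value.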
Observability
Distinguish "missing" from "stale" in logs and dashboards.
Backups
A backup taken from a lagging replica captures stale or divergent state. Record which replica each backup came from and how far behind it was.
Practical advice
For application developers:
1. Understand which operations need strong consistency
2. Use stronger modes selectively for those
3. Design rest of app for eventual consistency
4. Test with replication delays
5. Add session guarantees where UX matters
For architects:
1. Identify consistency requirements per use case
2. Choose systems matching requirements
3. Don't over-promise consistency
4. Plan for divergence and reconciliation
Further Reading
- [CrdtDataStructures](CrdtDataStructures) — Tools for managing
- [ByzantineFaultTolerance](ByzantineFaultTolerance) — Different reliability problem
- [Distributed Systems Hub](DistributedSystemsHub) — Cluster index