Logical Time and Causal Ordering
In distributed systems, physical time is unreliable due to clock drift and network latency. To reason about the sequence of events, we use **logical time**, which focuses on causality—the "happened-before" relation—rather than wall-clock time.
1. The "Happened-Before" Relation ($\to$)
The relation$\to$defines a **strict partial order** on events in a system. Formally, for two events$a$and$b$:
1. **Process Order:** If$a$and$b$occur in the same process and$a$occurs before$b$, then$a \to b$.
2. **Communication Order:** If$a$is the sending of a message and$b$is the receipt of that message, then$a \to b$.
3. **Transitivity:** If$a \to b$and$b \to c$, then$a \to c$.
If$a \not\to b$and$b \not\to a$, then$a$and$b$are **concurrent** ($a \parallel b$).
2. Lamport Clocks (Scalar Clocks)
A Lamport clock is a simple monotonically increasing counter maintained by each process.
2.1 The Update Algorithm
Each process$P_i$maintains a local counter$L_i$.
1. **Before an event** (internal, send, or receive),$P_i$increments its clock:$$L_i = L_i + 1$$2. **When sending a message**,$P_i$includes its current$L_i$in the message payload.
3. **When receiving a message** with timestamp$L_{msg}$,$P_i$updates its clock:$$L_i = \max(L_i, L_{msg}) + 1$$### 2.2 Mathematical Property
The Lamport clock satisfies the **Clock Consistency Condition**:$$a \to b \implies L(a) < L(b)$$**Note:** The converse is NOT true. If$L(a) < L(b)$, we cannot conclude$a \to b$. They could be concurrent.
3. Vector Clocks
To detect concurrency (i.e., to make the clock condition a bidirectional implication), we use **Vector Clocks**.
3.1 The Vector Update Algorithm
In a system with$N$processes, each process$P_i$maintains a vector$V_i$of size$N$, where$V_i[j]$is$P_i$'s knowledge of the clock of process$P_j$.
1. **Before an internal event**,$P_i$increments its own component:$$V_i[i] = V_i[i] + 1$$2. **When sending a message**,$P_i$includes its entire vector$V_i$in the message.
3. **When receiving a message** with vector$V_{msg}$,$P_i$updates every element of its vector:$$V_i[j] = \max(V_i[j], V_{msg}[j]) \quad \text{for all } j$$And then increments its own component:$$V_i[i] = V_i[i] + 1$$### 3.2 Comparison and Concurrency Detection
For two vector timestamps$u$and$v$:
* **$u \le v$** if$u[i] \le v[i]$for all$i$.
* **$u < v$** if$u \le v$and there exists at least one$j$such that$u[j] < v[j]$.
* **$a \to b \iff V(a) < V(b)$**
**Concurrency Detection:**
Events$a$and$b$are concurrent ($a \parallel b$) if and only if$V(a) \not\le V(b)$and$V(b) \not\le V(a)$. In other words, the vectors are **incomparable**.
4. Practical Implications
4.1 Conflict Resolution in Databases
Dynamo-style databases (e.g., Riak, Cassandra) use vector clocks (or the optimized "Dotted Version Vectors") to detect concurrent writes to the same key. If two versions of an object have incomparable vector clocks, the system knows a conflict has occurred and can trigger "Sibling" resolution or manual reconciliation.
4.2 Causal Consistency
By attaching vector clocks to data, a system can ensure that a user never sees an "effect" before its "cause." For example, if a comment is a reply to a post, the reply will only be shown if the post is already visible in the local view.
5. Summary
* **Lamport Clocks:**$O(1)$space. Good for total ordering but cannot detect concurrency.
* **Vector Clocks:**$O(N)$ space. Mandatory for conflict detection and strong causal consistency.