Conflict Detection
Vector Clocks (VCs) are a logical clock mechanism used to determine the partial ordering of events and detect causal violations in distributed systems where no global synchronized clock exists.
Formal Structure
In a system of $n$ processes $\{P_1, P_2, \dots, P_n\}$, a Vector Clock $V$ is a vector of $n$ integers.
- **Initial State:** Each process $P_i$ starts with $V_i = [0, 0, \dots, 0]$.
- **Local Event:** Upon an internal event, $P_i$ increments its own component: $V_i[i] \leftarrow V_i[i] + 1$.
- **Message Send:** $P_i$ attaches its current $V_i$ to every outgoing message.
- **Message Receive:** Upon receiving $V_{msg}$, $P_j$ updates its clock: $V_j[k] \leftarrow \max(V_j[k], V_{msg}[k])$ for all $k$, then increments $V_j[j]$.
Mathematical Relationship
For two vector timestamps $V_A$ and $V_B$:
1. **Causality ($V_A \to V_B$):** $V_A$ happened-before $V_B$ if $V_A[i] \leq V_B[i]$ for all $i$, and there is at least one $j$ such that $V_A[j] < V_B[j]$.
2. **Concurrency ($V_A \parallel V_B$):** $V_A$ and $V_B$ are concurrent if neither $V_A \leq V_B$ nor $V_B \leq V_A$. This indicates a conflict that requires resolution.
Implementation: Sparse Vector Clocks
Storing a full vector of size $N$ is inefficient for large clusters. **Sparse Vector Clocks** use a Map structure to store only non-zero entries.
**Concrete Example (Pseudo-code):**
```python
def merge_clocks(local_map, remote_map):
merged = local_map.copy()
for node_id, version in remote_map.items():
merged[node_id] = max(merged.get(node_id, 0), version)
return merged
def is_conflict(v1, v2):
return not (v1.less_than_or_equal(v2) or v2.less_than_or_equal(v1))
```
Comparison: Lamport vs. Vector Clocks
| Metric | [Lamport Clocks](LamportClocks) | Vector Clocks |
|---|---|---|
| **Structure** | Single Integer | Vector of Integers |
| **Ordering** | Total Order | Partial Order |
| **Causality** | $e_1 \to e_2 \implies L(e_1) < L(e_2)$ | $V(e_1) < V(e_2) \iff e_1 \to e_2$ |
| **Conflict Detection** | Impossible | Possible |
| **Overhead** | $O(1)$ | $O(N)$ (Size of Cluster) |
Use Cases
1. **Conflict Detection:** DynamoDB and Riak use VCs to identify concurrent writes to the same key, allowing the application to resolve them (e.g., merging shopping carts).
2. **Causal Consistency:** Ensuring that a reply to a message is never seen before the original message.
3. **Distributed Snapshots:** Used in algorithms to capture a consistent global state of an asynchronous system.
Operational Limitations
- **Scalability:** The clock size grows linearly with the number of participating nodes.
- **Pruning:** In dynamic clusters, VCs can grow indefinitely as nodes join and leave. Systems often implement "VClock Pruning" (removing the oldest entries), which trades perfect causality tracking for bounded storage, introducing a risk of false concurrency detection.