File: vector-clocks/vector_clock.py

Date: 2026-05-29

Time: 09:04

vector-clocks/vector_clock.py

Purpose

This file implements vector clocks — the classic mechanism for tracking causal ordering in distributed systems (DDIA Chapter 5). It owns two responsibilities: (1) the vector clock data structure itself, which captures happens-before relationships between events across nodes, and (2) a VersionedKVStore that uses vector clocks to detect and manage write conflicts in a Dynamo-style key-value store.

Key Components

VectorClock

An immutable mapping of {node_id: counter}. Every mutation method returns a new instance rather than modifying in place.

The constructor strips zero-valued entries ({k: v for ... if v > 0}), so an empty dict and {"A": 0} produce identical clocks. This keeps equality and hashing consistent.

VersionedValue

A simple (value, vector_clock) pair — a value tagged with the causal context in which it was written.

VersionedKVStore

A per-node key-value store that maintains sibling lists — multiple concurrent versions of the same key.

find_conflicts(versions)

A utility that checks whether any pair of versions in a list are concurrent — i.e., whether the client needs to reconcile.

Patterns

Immutable value objects. VectorClock never mutates _clock; every operation returns a fresh instance. This avoids aliasing bugs that would be devastating in a distributed system where clocks are shared across messages and stored in multiple data structures.

Dynamo-style sibling resolution. The store doesn't pick a winner on conflict — it keeps all concurrent versions and pushes resolution to the caller via reconcile(). This matches Amazon Dynamo's design (and Riak's) where the application decides how to merge (e.g., shopping cart union).

History tracking. HistoryEntry records every mutation with its action type (write, siblingcreated, reconciled). This is purely for observability/testing — the store's correctness doesn't depend on it.

Dependencies

Imports: Only stdlib — dataclasses and typing. No external dependencies.

Imported by: testvectorclock.py and testertestvector_clock.py — the test suite and its meta-tests.

Flow

A typical lifecycle:

1. Client reads store.get("cart") → gets [VersionedValue(value="eggs", vc={A:1})]

2. Client writes store.put("cart", "eggs,milk", context=vc{A:1}) → store increments its clock to {A:2}, merges with context, drops the old version (dominated), stores new version.

3. Concurrent write on node B (no context from A's write) → storeb.put("cart", "eggs,bread") with vc={B:1}. When replicated to A, receive_replica finds {A:2} and {B:1} are concurrent → both survive as siblings.

4. Client reads → gets two siblings. find_conflicts() returns True.

5. Client reconcilesstore.reconcile("cart", "eggs,milk,bread", [vca, vcb]) → merges both clocks, increments, replaces siblings with one resolved value.

Invariants

Error Handling

Essentially none. The code trusts its callers:

Topics to Explore

Beliefs