File: leaderless-replication/dynamo.py

Date: 2026-05-29

Time: 12:55

Purpose

dynamo.py implements a Dynamo-style leaderless replication system — the core idea from Amazon's 2007 Dynamo paper and Chapter 5 of DDIA. It models a cluster of replica nodes where any node can accept reads and writes (no leader), and consistency is maintained through quorum protocols, read repair, anti-entropy, and hinted handoff. This is a teaching implementation: the "network" is simulated in-process via direct method calls, and node failure is modeled by flipping an availability flag.

Key Components

VersionedValue (dataclass)

A value tagged with a monotonic version number and the node that stored it. This is the unit of storage inside each replica.

ReadResult (dataclass)

Returned by DynamoCluster.get(). Carries the resolved value, its version, whether a conflict was detected (multiple distinct values at the same version), and how many replicas were repaired during the read.

ReplicaNode

A single node in the cluster. Owns:

Key contract on write(): a write is accepted only if the incoming version >= the current version for that key. This means last-writer-wins by version, not by timestamp.

DynamoCluster

The coordinator. Holds the N, W, R quorum parameters, a map of nodes, and a global per-key version counter (versioncounters). This counter is the single source of version assignment — it lives on the coordinator, not on individual replicas.

Patterns

Quorum reads/writes (W + R > N). The classic Dynamo invariant. The cluster is parameterized by (n, w, r) and the caller controls the tradeoff between availability and consistency.

Read repair. Every get() call opportunistically pushes the latest value to any replica that's behind. This is piggy-backed on the read path — no separate background process needed.

Anti-entropy repair. antientropyrepair() is a full-cluster background sweep: for each key, find the highest version across all available nodes and push it everywhere. This handles keys that haven't been read recently and thus haven't been repaired by read repair.

Hinted handoff. When sloppyquorum=True and a write succeeds but some nodes are down, the first available node stores a hint. deliverhints() replays these when the target comes back.

Simulated failure. Node availability is a boolean flag. The node's store is preserved when "down," so bringing it back up models a crash-recovery scenario (stale data, not empty data).

Dependencies

Minimal — only dataclasses and typing from the stdlib. No external libraries, no inter-module dependencies within the repo. Imported by the two test files (testdynamo.py, testdynamo_tester.py).

Flow

Write path (put)

1. Increment the global version counter for the key.

2. Fan out to all nodes — every available node gets the write.

3. Count acknowledgments. If ack_count < W, roll back the version counter and raise QuorumNotMet.

4. If sloppy quorum is enabled and some nodes are down, store hints on available_nodes[0].

5. Return the assigned version.

Read path (get)

1. Fan out reads to all nodes, collecting VersionedValue responses from available ones.

2. Check if the number of available nodes meets R. If not, raise QuorumNotMet.

3. Find the max version. Check for conflicts: if multiple distinct values exist at the max version, flag is_conflict=True and return all distinct values as a list.

4. Read repair: push the winning value to any replica with a stale or missing entry.

5. Return ReadResult.

Repair paths

Invariants

1. Version monotonicity. ReplicaNode.write() only accepts version >= current.version. Combined with the global counter in DynamoCluster, versions are strictly increasing per key across the cluster.

2. Quorum gating. Writes that don't reach W acks are rolled back (version counter decremented) and raise. Reads that don't have R available nodes raise. This enforces the quorum contract at the coordinator level.

3. Write fan-out is total. put() sends to every available node, not just W of them. The quorum check is on the ack count, not the send count. This maximizes consistency.

4. Read repair is eager. Every successful get() repairs all stale replicas it can reach — not just enough to meet a quorum.

5. Hints are single-homed. All hints for unavailable nodes go to available_nodes[0], regardless of how many nodes are available. This simplifies delivery but creates a hotspot.

6. Conflict detection is value-based. Two responses at the same max version with different values trigger is_conflict. The conflict is surfaced to the caller as a list, not silently resolved.

Error Handling

Exactly one exception type: QuorumNotMet. Raised by both put() and get() when the quorum constraint can't be satisfied. On a failed write, the version counter is rolled back to prevent version gaps. No other errors are raised — unavailable nodes silently return False/None and are skipped.

Topics to Explore

Beliefs