Function: apply_remote_change in multi-leader-replication/multi_leader.py

Date: 2026-05-29

Time: 14:16

apply_remote_change: Multi-Leader Conflict Resolution

Purpose

This is the inbound replication handler for a multi-leader (multi-master) system. When a replica receives a write that originated on another node, this method decides whether the incoming change conflicts with local state and, if so, resolves the conflict using a pluggable strategy (last-write-wins or custom merge). It's the core of the convergence mechanism — without it, replicas would diverge permanently after concurrent writes.

Contract

Preconditions:

Postconditions:

Invariant: Calling apply_remote_change twice with the same change dict is a no-op after the first application. The _seen set ensures that each (key, timestamp, node_id) triple is applied at most once, which makes replays idempotent.
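The invariant can be illustrated with a minimal sketch of a dedup index. SeenIndex and first_time are hypothetical names introduced here for illustration, and the dict field names key, timestamp, and node_id are assumed spellings; the real method keeps this state on the node itself.

```python
from collections import defaultdict


class SeenIndex:
    """Hypothetical dedup index: key -> set of (timestamp, node_id) pairs."""

    def __init__(self):
        self._seen = defaultdict(set)

    def first_time(self, change: dict) -> bool:
        """Return True the first time a change is offered, False on replays."""
        marker = (change["timestamp"], change["node_id"])
        if marker in self._seen[change["key"]]:
            return False
        self._seen[change["key"]].add(marker)
        return True


index = SeenIndex()
change = {"key": "user:1", "timestamp": 7, "node_id": "node-a", "value": "x"}
first = index.first_time(change)   # new triple, so the change would be applied
replay = index.first_time(change)  # same triple again, so it would be skipped
```

Keying the index by key first keeps each lookup local to one record's history, at the cost of memory that grows with the number of distinct writes seen.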

Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| change | dict | Replication log entry from another node. Must have key, timestamp, node_id, value; is_tombstone defaults to False. |
| strategy | ConflictStrategy | How to resolve conflicts. LAST_WRITE_WINS picks the higher (timestamp, node_id) tuple. CUSTOM_MERGE delegates to merge_fn. |
| merge_fn | Optional[Callable] | Required when strategy is CUSTOM_MERGE. Receives (key, local_value, remote_value, local_ts, remote_ts) and returns the merged value. Ignored otherwise. |
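As a hedged illustration of the two strategies, the sketch below shows a possible merge_fn that follows the five-argument signature from the table, plus the tuple comparison that last-write-wins is described as using. Both union_merge and lww_winner are hypothetical helpers, not part of the module.

```python
from typing import Any


def union_merge(key: str, local_value: Any, remote_value: Any,
                local_ts: int, remote_ts: int) -> Any:
    """Hypothetical merge_fn: union two list-like values, treating None as empty."""
    local_items = set(local_value or ())
    remote_items = set(remote_value or ())
    # Sort so that every replica produces the same merged value.
    return sorted(local_items | remote_items)


def lww_winner(local: tuple, remote: tuple) -> str:
    """Compare (timestamp, node_id) tuples; the larger tuple wins."""
    return "remote" if remote > local else "local"


merged = union_merge("cart:42", ["apple"], ["pear", "apple"], 5, 6)
tie_break = lww_winner((5, "node-a"), (5, "node-b"))  # equal timestamps, node ID decides
```

A merge function used this way should be deterministic and independent of argument order, since different replicas may see the two values as local and remote in opposite roles.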

Edge cases on change: Tombstoned values arrive with is_tombstone: True and value: None. The method stores the sentinel _TOMBSTONE object internally but reports None in ConflictRecord.remote_value for tombstones, so the caller sees a deletion as None, not the sentinel.

Return Value

Algorithm

1. Dedup check. Look up (remote_ts, remote_node) in _seen[key]. If present, this is a replay, so return None immediately. This is critical for ring topologies where changes circulate through every node.

2. Advance Lamport clock. Set _clock = max(_clock, remote_ts) + 1. This ensures causal ordering: any subsequent local write will have a timestamp strictly greater than the remote one.

3. Record seen. Mark this (ts, node) pair as processed for future dedup.

4. No local entry. If the key doesn't exist in _store, accept unconditionally. Store the value (or _TOMBSTONE), queue for propagation, return None.

5. Same origin, same timestamp. If local and remote have identical (origin, timestamp), this is the same write arriving via a different path, so skip it.

6. Same origin, different timestamp. The remote node sent a newer version of its own write. This is a sequential update, not a conflict. Accept it if the remote (ts, node_id) tuple is lexicographically greater. Always propagate.

7. Different origins — conflict. Two nodes wrote to the same key concurrently. Resolve:

8. Log and return. Append the ConflictRecord to _conflict_log. For LWW, forward the original change. For custom merge, the merged change was already appended in step 7. Return the record.
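The steps above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the module's actual code: SketchNode, the ConflictRecord fields, the store layout (value, timestamp, origin), and the dict field spellings are all assumed, and only the last-write-wins branch of step 7 is shown because the custom-merge details are not given here.

```python
from dataclasses import dataclass
from typing import Any

_TOMBSTONE = object()  # assumed sentinel for deletions


@dataclass
class ConflictRecord:
    """Hypothetical record shape; the real one may carry more fields."""
    key: str
    local_value: Any
    remote_value: Any
    resolved_value: Any


def _reported(value):
    """Show deletions to callers as None instead of the sentinel."""
    return None if value is _TOMBSTONE else value


class SketchNode:
    """Illustrative stand-in for a replica, last-write-wins only."""

    def __init__(self, node_id: str):
        self.node_id = node_id
        self._clock = 0
        self._store = {}         # key -> (value, timestamp, origin)
        self._seen = {}          # key -> {(timestamp, origin), ...}
        self._pending = []       # changes queued for onward propagation
        self._conflict_log = []

    def apply_remote_change(self, change: dict):
        key, ts, origin = change["key"], change["timestamp"], change["node_id"]
        value = _TOMBSTONE if change.get("is_tombstone", False) else change["value"]

        # Step 1: a replayed (timestamp, origin) pair is dropped.
        seen = self._seen.setdefault(key, set())
        if (ts, origin) in seen:
            return None
        # Step 2: move the Lamport clock past the remote timestamp.
        self._clock = max(self._clock, ts) + 1
        # Step 3: remember the pair for future dedup.
        seen.add((ts, origin))

        # Step 4: no local entry, so accept unconditionally.
        if key not in self._store:
            self._store[key] = (value, ts, origin)
            self._pending.append(change)
            return None

        local_value, local_ts, local_origin = self._store[key]
        # Step 5: the same write arrived by another path.
        if (local_origin, local_ts) == (origin, ts):
            return None
        # Step 6: sequential update from the same origin; always propagate.
        if local_origin == origin:
            if (ts, origin) > (local_ts, local_origin):
                self._store[key] = (value, ts, origin)
            self._pending.append(change)
            return None

        # Step 7: different origins; the higher (timestamp, node_id) tuple wins.
        if (ts, origin) > (local_ts, local_origin):
            self._store[key] = (value, ts, origin)
        resolved = self._store[key][0]

        # Step 8: log the conflict, forward the original change, return the record.
        record = ConflictRecord(key, _reported(local_value),
                                _reported(value), _reported(resolved))
        self._conflict_log.append(record)
        self._pending.append(change)
        return record


node = SketchNode("node-c")
from_a = {"key": "k", "timestamp": 1, "node_id": "node-a", "value": "A"}
from_b = {"key": "k", "timestamp": 1, "node_id": "node-b", "value": "B"}
first = node.apply_remote_change(from_a)     # step 4: accepted, no conflict
conflict = node.apply_remote_change(from_b)  # step 7: "node-b" breaks the timestamp tie
replay = node.apply_remote_change(from_b)    # step 1: dropped as a replay
```

In this run the two writes carry the same timestamp, so the node ID decides the winner, and every replica that sees both changes picks the same value regardless of arrival order.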

Side Effects

Error Handling

Usage Patterns

Typical caller is MultiLeaderCluster.sync(), which collects pending changes from all nodes, then applies them to destination nodes based on topology:


```python
# All-to-all: every change goes to every other node
for change in pending_by_node[src_id]:
    for dst_id in self._node_order:
        if dst_id != src_id:
            self._nodes[dst_id].apply_remote_change(change, strategy, merge_fn)
```

Caller obligations:

Dependencies

Assumptions Not Enforced by Types

1. Timestamps are positive integers from Lamport clocks, not arbitrary values. Negative or zero timestamps would break the max + 1 advancement.

2. Node IDs are comparable strings. They are used in the tuple comparison (ts, node_id) for tiebreaking and in max(local_origin, remote_node) for canonical origin selection.

3. change dicts are not mutated after being passed in. The same dict may be forwarded to _pending and later applied to other nodes — mutation would corrupt the replication log.

4. Single-threaded execution. There is no synchronization on _store, _pending, _seen, or _clock.
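Since none of these assumptions is enforced by types, a caller could check the first three at the boundary before handing a change to a node. The helper below is a hypothetical sketch: checked_copy and REQUIRED_FIELDS are names introduced here, and the field spellings are assumed. It does not address the single-threaded assumption.

```python
import copy

REQUIRED_FIELDS = ("key", "timestamp", "node_id", "value")


def checked_copy(change: dict) -> dict:
    """Validate a change against the unenforced assumptions and return a private copy."""
    missing = [name for name in REQUIRED_FIELDS if name not in change]
    if missing:
        raise ValueError(f"change is missing fields: {missing}")
    ts = change["timestamp"]
    # bool is a subclass of int, so it is rejected explicitly.
    if isinstance(ts, bool) or not isinstance(ts, int) or ts <= 0:
        raise ValueError(f"timestamp must be a positive integer, got {ts!r}")
    if not isinstance(change["node_id"], str):
        raise TypeError("node_id must be a string")
    # A deep copy shields the replication log from later mutation by the caller.
    return copy.deepcopy(change)


original = {"key": "k", "timestamp": 4, "node_id": "node-a", "value": {"n": 1}}
safe = checked_copy(original)
original["value"]["n"] = 99  # later mutation by the caller does not reach the copy
```

Copying on entry trades some allocation for safety; a caller that can guarantee immutability could skip the copy and keep only the checks.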