File: write-skew-detection/ssi_database.py

Date: 2026-05-29

Time: 10:08

Purpose

This file implements Serializable Snapshot Isolation (SSI) — the concurrency control algorithm described in DDIA Chapter 7 that upgrades snapshot isolation to full serializability by detecting write skew at commit time. It owns the entire SSI lifecycle: MVCC versioned storage, transaction read/write tracking, and the three-pronged conflict detection (write-write, read-write, and phantom) that makes SSI work.

The key insight SSI addresses: plain snapshot isolation prevents dirty reads and lost updates but allows write skew, where two transactions read overlapping data, make decisions based on it, then write to *different* keys — each individually valid, but together violating an invariant. The classic example is two on-call doctors each checking "are there at least two doctors on call?" then removing themselves, leaving zero.

Key Components

SSITransaction

A transaction's tracking state. Holds:

The properties (readset, writeset, predicate_locks) return defensive copies, preventing external mutation of internal state.

SSIDatabase

The database engine. Core state:

DeletedSentinel / DELETED

Module-level singleton used as a tombstone value in the MVCC store. Compared by identity (is _DELETED), not equality.

Patterns

Optimistic Concurrency Control (OCC): Transactions buffer all writes locally in writes/deletes. No locks are taken during execution. Conflict detection happens entirely in commit(), and the transaction is either fully installed or fully aborted — no partial state.

MVCC (Multi-Version Concurrency Control): Each key stores a list of (timestamp, value, txid) tuples. visible_value scans this list to find the latest version at or before a snapshot timestamp. This gives each transaction a consistent point-in-time view without blocking writers.

Predicate Locks for Phantom Detection: Rather than locking individual rows, read_predicate stores the predicate function itself along with its result set. At commit time, the predicate is re-evaluated against current state. If the matching keys or values differ, a phantom conflict is raised. This is the mechanism that catches write skew scenarios involving range queries.

Dual Mode (Optimistic/Pessimistic): The pessimistic flag switches between SSI's standard optimistic approach (detect at commit) and eager detection (abort at read time via checkreadconflict). This lets the same codebase demonstrate both strategies.

Dependencies

Imports: None — this is a zero-dependency, self-contained module.

Imported by:

Flow

Transaction lifecycle

1. begintransaction() — allocates a unique txid and start_timestamp, returns an SSITransaction.

2. Readsread(tx, key) checks buffered writes/deletes first (read-your-own-writes), then falls back to visiblevalue for the MVCC snapshot. Tracks the key in readset and records a causal dependency on the writing transaction. In pessimistic mode, immediately checks for concurrent writes to this key.

3. Predicate readsread_predicate(tx, predicate) builds a full snapshot, merges in buffered writes, evaluates the predicate over all key-value pairs, stores the predicate + results for later phantom detection.

4. Writeswrite(tx, key, value) and delete(tx, key) buffer changes locally. writes and deletes are kept mutually exclusive per key.

5. commit(tx) — the critical path. Runs four checks in order:

If any conflicts are found, the transaction is aborted with a structured conflict list. Otherwise, writes are installed in _store with a new commit timestamp.

Conflict priority in conflictreason

When multiple conflict types exist simultaneously, the reason message follows a priority: write-write > phantom > read-write > constraint. This reflects severity — a write-write conflict is the most direct, while a constraint violation is a semantic check layered on top.

Invariants

Error Handling

Topics to Explore

Beliefs