Date: 2026-05-29
Time: 08:25
In database buffer management, the STEAL policy allows the buffer manager to evict dirty pages from uncommitted transactions to disk — "stealing" the page frame for another use. This is great for memory efficiency (you don't need to pin every page a long-running transaction touches), but it creates a problem: if the system crashes, the on-disk data file contains partial writes from transactions that never committed. Those writes must be undone.
ARIES handles this with before-images in the log: every modification records what the data looked like *before* the change, so recovery can reverse uncommitted work. Undo logging, together with the undo pass that consumes it during recovery, accounts for a large share of ARIES's complexity.
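To make the before-image idea concrete, here is a minimal sketch of undo under STEAL. It is not taken from this codebase or from any ARIES implementation: UndoableRecord, undo_uncommitted, and the dict standing in for the on-disk data file are all hypothetical, and real ARIES additionally writes compensation log records while undoing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UndoableRecord:
    """Hypothetical log record that carries a before-image."""
    txid: int
    key: str
    old_value: Optional[str]  # before-image; None means the key did not exist
    new_value: Optional[str]  # after-image; None means the key was deleted


def undo_uncommitted(pages, log, committed):
    """Walk the log backwards and restore before-images of uncommitted writes."""
    for rec in reversed(log):
        if rec.txid in committed:
            continue  # committed work is left alone
        if rec.old_value is None:
            pages.pop(rec.key, None)  # the key did not exist before this write
        else:
            pages[rec.key] = rec.old_value
    return pages


# Transaction 1 committed. Transaction 2 was still running when its dirty
# pages were stolen (flushed to disk) and the system crashed.
log = [
    UndoableRecord(txid=1, key="a", old_value=None, new_value="1"),
    UndoableRecord(txid=2, key="a", old_value="1", new_value="2"),
    UndoableRecord(txid=2, key="b", old_value=None, new_value="9"),
]
on_disk = {"a": "2", "b": "9"}  # contains transaction 2's uncommitted writes
recovered = undo_uncommitted(on_disk, log, committed={1})
```

After the backward pass only transaction 1's write survives. Without the old_value field there would be nothing to restore "a" to, which is why a log that stores only new values cannot support STEAL.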
This codebase sidesteps the entire problem by never writing uncommitted data to the persistent data store.
Look at the operation types in write-ahead-log/wal.py:10-13:
OP_PUT = 1
OP_DELETE = 2
OP_COMMIT = 3
OP_CHECKPOINT = 4
There is no OP_UNDO, no OP_COMPENSATE, and no before-image field. Each WALRecord (wal.py:17-22) stores key and value, and value is the new value only. There is no old_value field. The log is structurally incapable of undoing anything.
Recovery works by replaying forward: append_batch (wal.py:148-163) writes all operations atomically with a COMMIT marker, and replay (referenced in tests at test_wal.py:22-25) only returns records that form complete, committed groups. Uncommitted operations simply aren't replayed.
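The replay rule described above can be sketched in a few lines. This is an illustration under assumptions, not the code in wal.py: Record is a simplified stand-in for WALRecord, and replay_committed and apply_ops are hypothetical names. Only the OP_* constants are taken from the source.

```python
from dataclasses import dataclass
from typing import Optional

OP_PUT = 1
OP_DELETE = 2
OP_COMMIT = 3


@dataclass
class Record:
    """Simplified stand-in for WALRecord: new value only, no before-image."""
    op: int
    key: Optional[str] = None
    value: Optional[str] = None


def replay_committed(records):
    """Return only the operations that are followed by a COMMIT marker."""
    committed = []
    pending = []
    for rec in records:
        if rec.op == OP_COMMIT:
            committed.extend(pending)  # the group is complete, keep it
            pending = []
        else:
            pending.append(rec)
    # Anything still in `pending` never saw its COMMIT marker and is dropped.
    return committed


def apply_ops(records):
    """Rebuild state by applying committed operations in log order."""
    state = {}
    for rec in records:
        if rec.op == OP_PUT:
            state[rec.key] = rec.value
        elif rec.op == OP_DELETE:
            state.pop(rec.key, None)
    return state


log = [
    Record(OP_PUT, "a", "1"),
    Record(OP_PUT, "b", "2"),
    Record(OP_COMMIT),
    Record(OP_DELETE, "a"),
    Record(OP_PUT, "c", "3"),  # crash happened before this group's COMMIT
]
state = apply_ops(replay_committed(log))
```

The trailing delete and put are never applied, so the rebuilt state holds only the first batch. Nothing had to be undone because the incomplete group was never applied in the first place.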
The snapshot isolation implementation in snapshot-isolation/mvcc_database.py is the clearest example. The entire version store is in-memory (mvcc_database.py:53):
self._versions = {} # key -> list[Version]
When a transaction writes, it appends a Version object to an in-memory list (mvcc_database.py:144-150). If the transaction aborts, its txid goes into self._aborted (mvcc_database.py:55), and the visibility check at mvcc_database.py:80-81 filters it out:
if created_by in self._aborted:
    return False
There is no disk page to undo because uncommitted data never reached disk. This is a NO-STEAL policy by construction — the "buffer pool" is the entire in-memory dict, and it's never partially flushed.
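A heavily simplified sketch of why abort needs no undo work follows. TinyMVCC and its methods are hypothetical and omit the snapshot timestamps a real snapshot isolation implementation needs. Only the _versions and _aborted attributes and the created_by field mirror names quoted above.

```python
from dataclasses import dataclass


@dataclass
class Version:
    value: str
    created_by: int  # txid of the writing transaction


class TinyMVCC:
    """Hypothetical stand-in for the MVCC store; everything lives in memory."""

    def __init__(self):
        self._versions = {}      # key -> list[Version]
        self._committed = set()  # txids that committed
        self._aborted = set()    # txids that aborted

    def write(self, txid, key, value):
        self._versions.setdefault(key, []).append(Version(value, txid))

    def commit(self, txid):
        self._committed.add(txid)

    def abort(self, txid):
        # No undo work: the versions stay in the list but become invisible.
        self._aborted.add(txid)

    def visible(self, version, reader_txid):
        if version.created_by in self._aborted:
            return False
        # A reader sees its own writes and writes of committed transactions.
        return (version.created_by == reader_txid
                or version.created_by in self._committed)

    def read(self, reader_txid, key):
        # Newest version first; return the first one the reader may see.
        for version in reversed(self._versions.get(key, [])):
            if self.visible(version, reader_txid):
                return version.value
        return None


db = TinyMVCC()
db.write(1, "k", "old")
db.commit(1)
db.write(2, "k", "new")
db.abort(2)
seen_by_later_reader = db.read(3, "k")
```

The aborted version is still physically present in the list, yet a later reader gets "old". Abort is a set insertion, not a rollback.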
The SSI implementation (write-skew-detection/ssi_database.py:12) keeps transaction writes in a per-transaction buffer:
self._writes = {} # key -> value (buffered writes)
An abort (ssi_database.py:298-300) just marks the transaction status — the buffered writes are discarded with the transaction object. The underlying data store never saw them.
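The buffering pattern can be sketched as follows. This is an assumption-laden illustration rather than the code in ssi_database.py: BufferedTransaction, its status strings, and the plain dict used as the underlying store are hypothetical. Only the _writes buffer mirrors the line quoted above.

```python
class BufferedTransaction:
    """Hypothetical transaction that buffers writes until commit."""

    def __init__(self, store):
        self._store = store   # shared underlying data store (a plain dict here)
        self._writes = {}     # key -> value (buffered writes)
        self.status = "active"

    def put(self, key, value):
        self._writes[key] = value  # the store is not touched

    def get(self, key):
        # Read your own buffered write first, then fall back to the store.
        if key in self._writes:
            return self._writes[key]
        return self._store.get(key)

    def commit(self):
        if self.status != "active":
            raise RuntimeError("transaction is " + self.status)
        self._store.update(self._writes)  # writes reach the store only here
        self.status = "committed"

    def abort(self):
        # Only the status changes; the buffer dies with the object.
        self.status = "aborted"


store = {"x": 1}

t1 = BufferedTransaction(store)
t1.put("x", 2)
t1.abort()
after_abort = dict(store)

t2 = BufferedTransaction(store)
t2.put("x", 3)
t2.commit()
after_commit = dict(store)
```

After t1 aborts, the store still holds its original value. The value written by t2 appears only once commit copies the buffer over.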
The B-tree's WAL (b-tree-storage-engine/btree.py:119-170) serves a different purpose than ARIES: it provides atomic multi-page writes (e.g., during a node split), not transaction rollback. The recover method (btree.py:150-168) replays page writes forward and then truncates the log. There's no concept of undoing a page write — either the whole operation committed (pages synced, WAL truncated via commit at btree.py:139-145) or it didn't (WAL replayed on restart).
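A redo-only page log of this kind might look like the sketch below. It is not the code in btree.py: PageWAL and its methods are hypothetical, a list and a dict stand in for the log file and the data file, and the explicit commit marker inside the log is an assumption made to keep the example self-contained.

```python
class PageWAL:
    """Hypothetical redo-only page log; no fsync, no real files."""

    def __init__(self):
        self.log = []    # entries: ("page", page_id, data) or ("commit",)
        self.pages = {}  # page_id -> bytes, standing in for the data file

    def log_pages(self, page_writes):
        """Record every page of a multi-page operation, then a commit marker."""
        for page_id, data in page_writes.items():
            self.log.append(("page", page_id, data))
        self.log.append(("commit",))

    def apply_and_truncate(self, page_writes):
        """Normal path: write the pages to the data file, then drop the log."""
        self.pages.update(page_writes)
        self.log.clear()

    def recover(self):
        """Replay complete groups forward, then truncate the log."""
        pending = {}
        for entry in self.log:
            if entry[0] == "page":
                pending[entry[1]] = entry[2]
            else:  # commit marker: the group is complete, so redo it
                self.pages.update(pending)
                pending = {}
        # A trailing group without a commit marker is ignored, never undone.
        self.log.clear()


wal = PageWAL()
wal.pages = {1: b"old-root"}

# A node split rewrites the root and creates two children.
split = {1: b"new-root", 2: b"left", 3: b"right"}
wal.log_pages(split)
wal.pages[1] = split[1]  # crash after only the first page reached the data file

wal.recover()
```

Recovery rewrites all three pages from the log, so the half-finished split becomes a finished one. There is no code path that restores b"old-root", which is the sense in which the log is redo-only.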
The NO-STEAL approach is simpler, but it has a cost: all dirty data from active transactions must fit in memory. A long-running transaction that modifies millions of rows must keep all those modifications buffered. ARIES's STEAL policy lets the buffer manager evict those pages to disk under memory pressure, at the cost of needing undo logging for crash recovery.
For these reference implementations — which are teaching tools with in-memory data stores — that trade-off is irrelevant. For a production database handling terabyte-scale transactions, it's the reason ARIES exists.
snapshot-isolation/mvcc_database.py:isvisible - The visibility rules that make abort "free": understanding why filtering by aborted is sufficient without disk undo
b-tree-storage-engine/btree.py:recover - Compare this redo-only page recovery with ARIES's three-phase recovery (analysis, redo, undo)
write-ahead-log/wal.py:append_batch - How atomic commit markers eliminate the need for undo by making uncommitted batches invisible to replay
aries-three-pass-recovery - ARIES's analysis/redo/undo phases and why the undo phase disappears under NO-STEAL
write-skew-detection/ssi_database.py - How per-transaction write buffering interacts with abort semantics when no data reaches disk
wal-has-no-before-images - WALRecord contains only the new value (key, value) with no old_value field, making undo structurally impossible from the log alone
mvcc-uncommitted-data-memory-only - In MVCCDatabase, uncommitted transaction writes exist only as in-memory Version objects in _versions and never reach a persistent data store
abort-is-status-change-not-disk-rollback - Aborting a transaction in both MVCC and SSI implementations sets a status flag; no disk writes are reversed
btree-wal-is-redo-only - The B-tree WAL's recover method replays logged page writes forward and truncates; it has no mechanism to reverse a page write