Original source: topic-wal-checkpoint-protocol

Topic: DDIA's discussion of how checkpoints bound recovery time and enable log truncation (Chapter 3, SSTables and LSM-Trees)

Date: 2026-05-29

Time: 11:17

Topics to Explore

[function] log-structured-merge-tree/lsm.py:_flush — Trace how the frozen memtable becomes an SSTable and how WAL truncation is ordered relative to the SSTable fsync (the WAL must not be truncated until the SSTable is durable, or a crash in between loses data)
[function] write-ahead-log/wal.py:truncate — The partial truncation logic that rewrites WAL files in place — study how it handles the current file vs. older rotated files
[function] write-ahead-log/wal.py:replay — How replay interacts with OP_COMMIT records to ensure only committed batches are applied (partial batches without COMMIT should be discarded)
[general] checkpoint-frequency-tradeoffs — DDIA discusses how checkpoint interval trades write amplification against recovery time; experiment with different memtable_threshold values and measure recovery duration
[file] write-ahead-log/test_wal.py — The truncation test (line 60) and crash recovery test (line 29) demonstrate the checkpoint-then-truncate protocol in isolation
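
The write ordering named in the first topic can be sketched as follows. This is a minimal illustration, not the code in lsm.py: the function name flush_memtable, the JSON-lines SSTable format, and the file names are all assumptions made for the example. The only point it demonstrates is the order of operations: make the SSTable durable first, publish it atomically, and truncate the WAL last.

```python
import json
import os
import tempfile


def flush_memtable(memtable: dict, sstable_path: str, wal_path: str) -> None:
    """Persist the memtable as a sorted file, then (and only then) empty the WAL.

    Hypothetical helper for illustration; the real _flush may differ.
    """
    tmp_path = sstable_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        for key in sorted(memtable):  # SSTables store keys in sorted order
            f.write(json.dumps([key, memtable[key]]) + "\n")
        f.flush()
        os.fsync(f.fileno())  # step 1: SSTable bytes reach stable storage

    os.replace(tmp_path, sstable_path)  # step 2: atomically publish the file

    # Step 3: the WAL is now redundant, so it is safe to erase it.
    # Doing this before step 1 would lose data if a crash hit in between.
    with open(wal_path, "w", encoding="utf-8") as wal:  # "w" truncates to 0 bytes
        wal.flush()
        os.fsync(wal.fileno())


# Small demonstration in a scratch directory.
with tempfile.TemporaryDirectory() as d:
    wal_path = os.path.join(d, "wal.log")
    sst_path = os.path.join(d, "sstable-0001.jsonl")
    with open(wal_path, "w", encoding="utf-8") as wal:
        wal.write("put b 2\nput a 1\n")

    flush_memtable({"b": "2", "a": "1"}, sst_path, wal_path)

    wal_size = os.path.getsize(wal_path)
    with open(sst_path, encoding="utf-8") as f:
        sstable_keys = [json.loads(line)[0] for line in f]
```

A production implementation would typically also fsync the containing directory after the rename so that the new directory entry itself survives a crash; that step is omitted here to keep the sketch portable.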

lsm-wal-truncation-is-total — The LSM tree's WAL truncation (lsm.py:56) erases the entire log file rather than truncating to a sequence number, because every flush checkpoints all in-flight data
wal-checkpoint-is-explicit-record-type — The standalone WAL module uses a dedicated OP_CHECKPOINT opcode (value 4) written as a real record in the log stream, not out-of-band metadata
memtable-threshold-bounds-recovery — The memtable_threshold parameter (lsm.py:202) directly caps the maximum number of WAL entries that must be replayed on crash recovery, because the WAL is truncated on every flush
wal-truncate-preserves-records-above-seq — WriteAheadLog.truncate(upto_seq) (wal.py:179) keeps records with seqnum > upto_seq and deletes only those at or below, enabling partial log reclamation tied to checkpoint boundaries
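
The checkpoint record, the partial truncation, and the commit-aware replay described in these notes can be combined in one in-memory sketch. It is a toy model under stated assumptions, not the code in wal.py: the class SketchWAL, the Record type, and the opcode values for PUT, DELETE, and COMMIT are invented for the example (only OP_CHECKPOINT = 4 comes from the notes), and checkpoints are assumed to be written at batch boundaries.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

OP_PUT = 1         # assumed value, for illustration only
OP_DELETE = 2      # assumed value, for illustration only
OP_COMMIT = 3      # assumed value, for illustration only
OP_CHECKPOINT = 4  # value stated in the notes above


@dataclass
class Record:
    seq: int
    op: int
    key: Optional[str] = None
    value: Any = None


class SketchWAL:
    """In-memory stand-in for a write-ahead log (hypothetical)."""

    def __init__(self) -> None:
        self.records: List[Record] = []
        self.next_seq = 1

    def append(self, op: int, key: Optional[str] = None, value: Any = None) -> int:
        rec = Record(self.next_seq, op, key, value)
        self.records.append(rec)
        self.next_seq += 1
        return rec.seq

    def checkpoint(self) -> int:
        # The checkpoint is an ordinary record in the log stream.
        return self.append(OP_CHECKPOINT)

    def truncate(self, upto_seq: int) -> None:
        # Keep only records strictly above upto_seq.
        self.records = [r for r in self.records if r.seq > upto_seq]

    def replay(self) -> Dict[str, Any]:
        state: Dict[str, Any] = {}
        pending: List[Record] = []
        for rec in self.records:
            if rec.op in (OP_PUT, OP_DELETE):
                pending.append(rec)  # buffered until a COMMIT is seen
            elif rec.op == OP_COMMIT:
                for p in pending:
                    if p.op == OP_PUT:
                        state[p.key] = p.value
                    else:
                        state.pop(p.key, None)
                pending.clear()
            # OP_CHECKPOINT carries no data to apply; it is only a marker.
        return state  # anything still pending had no COMMIT and is dropped


wal = SketchWAL()
wal.append(OP_PUT, "a", 1)
wal.append(OP_COMMIT)
cp = wal.checkpoint()          # everything at or below cp is durable elsewhere
wal.append(OP_PUT, "b", 2)
wal.append(OP_COMMIT)
wal.append(OP_PUT, "c", 3)     # simulated crash before COMMIT

wal.truncate(cp)
remaining = [r.seq for r in wal.records]
recovered = wal.replay()
```

After truncation only the records above the checkpoint remain, so replay work is bounded by what was written since the last checkpoint, and the uncommitted write to "c" never reaches the recovered state.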