Original source: topic-wal-batch-atomicity-gap

Topic: The gap between the docstring's claim ("skips uncommitted batches") and the implementation (returns all PUT/DELETE regardless of COMMIT presence)

Date: 2026-05-29

Time: 07:55


Topics to Explore

[function] write-ahead-log/wal.py:append_batch — How the single-write trick provides probabilistic atomicity for small batches, and where it breaks down for large ones
[function] write-ahead-log/wal.py:readall_records — The corruption-stopping iterator that replay delegates to. Its early termination interacts with the atomicity gap: corruption mid-batch stops iteration, so no later records are returned, but the uncommitted batch records before the corruption point still are
[function] write-ahead-log/wal.py:iterate — The raw record stream that preserves COMMIT markers, which a production-grade replay could use to implement proper batch boundary tracking
[file] log-structured-merge-tree/lsm.py — The LSM tree's replay method (line 28) consumes WAL replay during crash recovery; check whether it would be affected by partial batch replay
[general] commit-aware-replay-design — Designing a correct replay that groups records by batch using sequence number ranges between COMMIT markers, handling edge cases like interleaved individual writes and nested batches
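A minimal sketch of the commit-aware replay idea from the last topic. Everything here is an assumption for illustration: the Record dataclass, its batch_id field (None meaning a standalone write), and the function name commit_aware_replay are hypothetical and do not reflect the actual record format in wal.py. The sketch buffers batch records until the matching COMMIT appears and drops any batch that never commits.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical WAL record; not the real wal.py layout."""
    seq: int
    kind: str                        # "PUT", "DELETE", "COMMIT", or "CHECKPOINT"
    key: Optional[str] = None
    value: Optional[str] = None
    batch_id: Optional[int] = None   # assumption: None marks a standalone write


def commit_aware_replay(records: Iterable[Record]) -> Iterator[Record]:
    """Yield standalone writes at once; yield batch writes only after their COMMIT."""
    pending: dict[int, list[Record]] = {}
    for rec in records:
        if rec.kind in ("PUT", "DELETE"):
            if rec.batch_id is None:
                yield rec
            else:
                # Hold batch members back until the batch is known to be committed.
                pending.setdefault(rec.batch_id, []).append(rec)
        elif rec.kind == "COMMIT" and rec.batch_id is not None:
            # The batch is complete: release its records in their original order.
            yield from pending.pop(rec.batch_id, [])
        # CHECKPOINT and any other markers carry no data and are skipped.
    # Whatever remains in `pending` never saw a COMMIT and is dropped.
```

One design choice worth noting: batch records are emitted at the position of their COMMIT, so a standalone write interleaved inside a batch is replayed before that batch. Whether that ordering is acceptable depends on how the real log assigns sequence numbers.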

replay-does-not-enforce-batch-atomicity — Despite the docstring claiming replay "skips uncommitted batches," the implementation returns all PUT/DELETE records regardless of whether a matching COMMIT record exists; a crash mid-batch will replay partial batch records
wal-docstring-describes-intent-not-behavior — The replay() docstring describes the conceptual DDIA behavior (committed-only replay), while the inline comments and implementation reflect the actual simplified behavior (all-data-ops replay)
iterate-enables-correct-replay — iterate() preserves COMMIT and CHECKPOINT markers in the record stream, providing the raw material needed to build a commit-aware replay externally
batch-atomicity-is-fsync-level-only — append_batch's atomicity guarantee operates at the fsync level (the whole batch is issued as a single write() call, which is all-or-nothing only on a best-effort basis), not at the replay-filtering level; the two guarantees are independent, and the second is missing
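To make the fsync-level guarantee in the last insight concrete, here is a hedged sketch of the single-write pattern. The framing (4-byte length, 4-byte CRC32, payload), the literal b"COMMIT" marker, and the names encode_record and append_batch_single_write are assumptions for illustration, not the actual append_batch implementation. The point it shows is that one write() plus one fsync() narrows the crash window but says nothing about how replay treats a batch whose COMMIT never reached disk.

```python
import os
import struct
import zlib


def encode_record(payload: bytes) -> bytes:
    """Frame one record as length + CRC32 + payload (hypothetical layout)."""
    return struct.pack(">II", len(payload), zlib.crc32(payload)) + payload


def append_batch_single_write(fd: int, payloads: list[bytes]) -> int:
    """Write all records and a trailing COMMIT marker with one write() and one fsync()."""
    buf = b"".join(encode_record(p) for p in payloads) + encode_record(b"COMMIT")
    written = os.write(fd, buf)
    if written != len(buf):
        # os.write may write fewer bytes than requested; a single call is
        # therefore not a hard atomicity guarantee, especially for large batches.
        raise OSError(f"short write: {written} of {len(buf)} bytes")
    os.fsync(fd)  # force the data to stable storage before reporting success
    return written
```

Because a short write or a crash partway through can still leave a prefix of the batch on disk, the reader side would need its own commit check to turn this into end-to-end batch atomicity.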