File: b-tree-storage-engine/fix-plan.md

Date: 2026-05-28

Time: 19:18

b-tree-storage-engine/fix-plan.md

Purpose

This is a bug triage and fix plan for the B-tree storage engine implementation in btree.py. It's not code — it's a design document that catalogs six correctness and durability bugs, proposes fixes with specific line references, defines an execution order based on dependency analysis, and lists test gaps. Its role is to serve as a work queue: a developer picks up the next bug in sequence, applies the described fix, and adds the corresponding test.

The document owns the engineering reasoning behind the fixes — not just *what* to change, but *why* the current behavior is wrong and *in what order* fixes must land.

Key Components

The Six Bugs

Each bug entry follows a consistent structure: Problem (observable failure mode), Fix (concrete implementation change), Files (line-level pointers into btree.py).

| Bug | Severity | Category | Core Issue |

|-----|----------|----------|------------|

| Bug 1 | High | Durability | PageManager.write_page calls flush() but not os.fsync() — pages may not survive a crash |

| Bug 2 | Critical | Durability | WAL truncates before data file fsync — crash window where both WAL and data are lost |

| Bug 3 | Critical | Consistency | Metadata writes (root pointer, free list head) bypass the WAL entirely |

| Bug 4 | Low | Integrity | Checksum is a naive byte sum mod 2^32 — fails to detect byte reordering or multi-byte corruption |

| Bug 5 | Medium | Resource leak | _delete never frees empty leaf pages — free list exists but delete doesn't use it |

| Bug 6 | Low | Fragility | put() reads metadata multiple times across operations that mutate it |

Execution Order

The plan defines a dependency-driven execution order, not priority order:


Bug 4 (checksum) ──────────────────────┐
Bug 6 (metadata reads) ───────────────┐│
Bug 1 (data file fsync) ──► Bug 2 ────┤│  (WAL commit sequence)
Bug 3 (WAL metadata)   ──► Bug 2 ────┘│
Bug 5 (delete freeing) ───────────────┘

Bugs 1 and 3 are prerequisites for Bug 2 because the new commit sequence (fsync data file → truncate WAL → fsync WAL) requires both the data file fsync mechanism (Bug 1) and metadata-through-WAL logging (Bug 3) to exist first.

Tests to Add

Four test scenarios targeting the gaps exposed by these bugs. Notably, all four test *crash recovery* or *structural integrity* — not normal-path behavior, which the existing test_btree.py presumably already covers.

Patterns

Write-Ahead Logging (WAL) protocol. The entire document revolves around the WAL correctness invariants from DDIA Chapter 3. The fix for Bug 2 describes the canonical WAL commit sequence: ensure protected data is durable *before* discarding the log. The current code violates this by truncating the WAL before fsyncing the data file.

Idempotent recovery. Bug 2's fix relies on the insight that WAL replay is always safe because page writes are idempotent — writing the same page data twice produces the same result. This eliminates the need for a commit marker entirely.

Incremental correctness. The plan doesn't attempt a full rewrite. It fixes each bug in isolation while respecting dependencies. Bug 5 explicitly scopes out merge/redistribute (standard B-tree rebalancing) and only handles the empty-leaf case.

Dependencies

What it depends on:

What depends on it:

Flow

The document is designed to be executed top-to-bottom through the Execution Order section:

1. Bug 4 (swap _checksum to zlib.crc32) — a pure function replacement with no side effects on other components.

2. Bug 6 (consolidate metadata reads in put()) — a refactor that simplifies the metadata flow but doesn't change durability semantics.

3. Bug 1 (add os.fsync to PageManager.write_page, add sync() method) — introduces the sync() primitive.

4. Bug 3 (route writemeta through WAL) — ensures metadata changes are logged.

5. Bug 2 (rewrite WAL.commit() to: fsync data → truncate WAL → fsync WAL) — consumes the primitives from steps 3 and 4.

6. Bug 5 (free empty leaves on delete) — the most complex change, independent of the durability fixes.

Invariants

The plan implicitly defines several invariants that the fixed code must maintain:

1. WAL-before-data: All page writes (including metadata page 0) must be logged in the WAL before being written to the data file.

2. Fsync-before-truncate: The data file must be fsynced before the WAL is truncated — this is the core durability guarantee.

3. Idempotent replay: WAL recovery must be safe to run at any time, even if some entries were already applied to the data file.

4. Checksum integrity: CRC32 must detect single-bit, byte-reorder, and multi-byte corruption in WAL entries.

5. Page lifecycle: Allocated pages must eventually be freed when their contents are removed (no unbounded page leaks from deletes).

Error Handling

The document doesn't describe error handling directly, but several bugs *are* error-handling failures:

Topics to Explore

Beliefs