Topic: PostgreSQL's src/backend/access/nbtree/nbtpage.c contains btpagedel and btunlinkhalfdeadpage, the canonical implementation of this two-phase protocol

Date: 2026-05-29

Time: 10:46

Two-Phase B-Tree Page Deletion: What PostgreSQL Does vs. What This Codebase Does

The claim references PostgreSQL's canonical approach to safely removing pages from a B-tree: first mark a page as "half-dead" (severing it from its parent but keeping sibling links intact), then unlink it from the sibling chain in a second step. This two-phase protocol exists because a single atomic operation can't safely update the parent, left sibling, and right sibling simultaneously — a crash mid-operation could leave the tree with dangling pointers.

This codebase does not implement any form of two-phase page deletion. The observations confirm this comprehensively:

What the codebase does have

The infrastructure for page freeing exists but is disconnected from delete:

Why two phases matter (and why this implementation can skip it — for now)

PostgreSQL's btpagedel marks a page half-dead in one WAL-logged operation, then btunlinkhalfdeadpage removes it from the sibling chain in a separate WAL-logged operation. If a crash occurs between phases, recovery can see the half-dead marker and complete the unlink. This is necessary because PostgreSQL has concurrent readers traversing sibling chains — a reader following a next pointer must never land on a freed page.

This implementation is single-threaded with no concurrent readers, so the two-phase protocol isn't strictly necessary. But the absence of *any* page freeing on delete means the tree only grows — a significant practical limitation documented in Bug 5.

Topics to Explore

Beliefs