src/backend/access/nbtree/README to see the latch coupling protocol, incomplete-split handling, and page deletion dance documented in detail
Date: 2026-05-29
Time: 10:55
I'll work from the observations and my knowledge of PostgreSQL's nbtree internals to explain how the production implementation compares to this codebase's B-tree.
lehman-yao-right-links: The original L&Y paper that introduced right-link pointers to eliminate parent-child latch coupling; the theoretical foundation for PostgreSQL's approach
lehman-yao-high-keys: How high keys let readers detect concurrent splits without holding parent latches, and why this implementation's leaf-only sibling links aren't sufficient
postgres-nbtree-half-dead-state-machine: The multi-phase page deletion state machine that prevents concurrent scans from following dangling pointers
b-tree-storage-engine/btree.py:_delete: The current deletion logic that frees pages without updating sibling links; understanding this gap clarifies why PostgreSQL's protocol exists
postgres-nbtree-vacuum: How PostgreSQL's VACUUM interacts with the B-tree to reclaim dead pages, and why deferred reclamation is necessary for MVCC visibility
btree-no-concurrent-access-control: The B-tree implementation has no latches, locks, or concurrency control; all operations assume single-threaded exclusive access
btree-leaf-sibling-not-updated-on-delete: When delete frees an empty leaf page via freepage, it does not update the predecessor leaf's next_sibling pointer, leaving a dangling right-link
btree-split-uses-recursive-return: Splits propagate upward via return values ((midkey, newpagenum) from insert_into), not via the cooperative detect-and-finish protocol used in L&Y trees
btree-wal-uses-full-page-images: The WAL logs complete page images with CRC checksums rather than logical operations, making incomplete-split states impossible but increasing write amplification
btree-free-list-immediate-reclaim: Freed pages are immediately pushed onto the free list with no deferred-reclamation mechanism, which would be unsafe under concurrent access or MVCC
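
The high-key and right-link entries above can be illustrated with a minimal, single-threaded sketch of the L&Y "move right" check. Everything in it is an assumption made for illustration: the Leaf class, its high_key and right_link fields, and the move_right and split_leaf helpers are hypothetical names, not the layout or API of btree.py or of PostgreSQL's nbtree. It only shows why a reader holding a stale page number can still find its key after a split.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Leaf:
    """Hypothetical leaf page; not the page layout used by btree.py."""
    keys: List[int] = field(default_factory=list)
    high_key: Optional[int] = None    # upper bound on keys; None means rightmost page
    right_link: Optional[int] = None  # page number of the right sibling, if any


def split_leaf(pages: Dict[int, Leaf], pagenum: int, new_pagenum: int) -> None:
    """Split a leaf holding at least two keys; the upper half moves right."""
    left = pages[pagenum]
    mid = len(left.keys) // 2
    # The new right page inherits the old upper bound and old right sibling.
    pages[new_pagenum] = Leaf(keys=left.keys[mid:],
                              high_key=left.high_key,
                              right_link=left.right_link)
    left.keys = left.keys[:mid]
    left.high_key = left.keys[-1]   # largest key that stays on the left page
    left.right_link = new_pagenum   # publish the new sibling to readers


def move_right(pages: Dict[int, Leaf], pagenum: int, key: int) -> int:
    """Follow right-links until reaching the page whose range covers `key`."""
    page = pages[pagenum]
    # A key above the high key means the page was split after the reader
    # chose it, so the key now lives somewhere to the right.
    while page.high_key is not None and key > page.high_key:
        if page.right_link is None:
            raise RuntimeError("non-rightmost page has no right-link")
        pagenum = page.right_link
        page = pages[pagenum]
    return pagenum


pages = {1: Leaf(keys=[10, 20, 30, 40])}
stale_pagenum = 1                  # reader chose page 1 before the split
split_leaf(pages, 1, 2)            # a "concurrent" split moves 30 and 40 to page 2
found = move_right(pages, stale_pagenum, 30)
```

Here the reader arrives at page 1 with a page number chosen before the split, sees that 30 exceeds the page's high key of 20, and follows the right-link to page 2. A tree with sibling links but no high keys has nothing to compare against, so the reader cannot tell that it needs to move right.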