Date: 2026-05-29
Time: 10:54
Every leaf in this B+-tree carries a 4-byte nextsibling pointer (stored immediately after the 3-byte page header), forming a singly-linked list across all leaves in sorted key order. The rightmost leaf uses the sentinel NOSIBLING = 0xFFFFFFFF (btree.py:13). If this chain is ever broken — a leaf inserted without being linked, or a pointer left dangling — rangescan and iter_ silently lose data.
_insert, leaf case)When a leaf overflows during insertion (key count exceeds maxkeys), the split must thread the new page into the existing chain. The algorithm, documented in the insert entry and visible from the code's structure:
1. Split the keys at the midpoint: mid = len(keys) // 2. Left half stays in the original page; right half goes to a newly allocated page.
2. Right page inherits the old next_sib: The original leaf's existing sibling pointer passes to the new right page. This is critical — the right page takes the original leaf's position in the chain relative to everything to the right.
3. Left page points to the new right page: The original leaf's next_sib is rewritten to the new page number.
The two serializeleaf calls during a split look conceptually like:
# Right half gets the old forward pointer
right_data = _serialize_leaf(right_keys, right_values, next_sibling=old_next_sib)
# Left half now points to the new right page
left_data = _serialize_leaf(left_keys, left_values, next_sibling=new_page_num)
This is a linked-list insert-after operation: the new node splices in between the original node and its former successor.
Consider a chain [A] → [B] → [C] where leaf B overflows and splits into B (left) and B' (right):
Before: [A] → [B] → [C]
After: [A] → [B] → [B'] → [C]
next_sib = C)next_sib = B')If you got this backwards — if B' pointed to B instead of C, or B still pointed to C instead of B' — a range_scan would either loop or skip the new right half entirely.
rangescan (btree.py): Descends to the start key's leaf via findleaf, then walks nextsibling pointers horizontally. Cost is O(height + leaf pages in range). A broken link truncates results silently — there's no cycle detection (range_scan entry, line 63).
_iter (btree.py): Descends the left spine (always following children[0]) to reach the leftmost leaf, then walks the entire chain to NOSIBLING. A broken link in the middle means list(tree) silently drops everything after the break.
Both rely on the chain being complete (every leaf reachable) and acyclic (terminates at NO_SIBLING). Neither validates these properties at runtime.
delete handles the mirror operation: when an empty leaf is removed (btree.py, delete entry), it patches the previous sibling's nextsibling to point to the removed leaf's nextsibling, splicing the empty node out. This only happens at depth 2 for non-leftmost children — an empty leftmost leaf stays allocated, wasting a page but not breaking the chain.
The split writes two pages (left and right halves) via walwritepage, which logs to the WAL before writing to the data file. If the process crashes after writing the right page but before updating the left page's nextsibling, the chain would be broken: the new right page exists but nothing points to it. WAL replay should fix this (it replays all logged writes), but allocatepage modifies metadata through PageManager.writemeta outside the WAL boundary — so there's a window where the page is allocated but the chain update hasn't been logged.
From serializeleaf (btree.py:182-191):
[page_type: 1B] [num_keys: 2B] [next_sibling: 4B] [entries...]
The nextsibling field is at a fixed offset (bytes 3-6), so deserialize_leaf can always find it regardless of how many entries the page holds. This is what makes the chain walkable without parsing the entire page.
btree-split-inherits-sibling — During a leaf split, the new right page inherits the original leaf's nextsibling pointer, and the original leaf's nextsibling is rewritten to point to the new right page — a linked-list insert-after operationbtree-sibling-chain-fixed-offset — The next_sibling field sits at a fixed byte offset (3-6) in every leaf page, making the chain walkable without fully parsing page contentsbtree-chain-no-cycle-detection — Neither rangescan nor iter detects cycles in the sibling chain; a corrupted nextsibling pointer causes an infinite loopbtree-delete-patches-prev-sibling — When delete removes an empty leaf, it reads the previous sibling and rewrites its nextsibling to point past the removed leaf, maintaining chain continuitybtree-split-wal-gap — Page allocation during splits modifies metadata outside the WAL, creating a window where a crash could leave the new page allocated but not linked into the sibling chain