Date: 2026-05-29
Time: 07:43
findleaf — Leaf Page Locatorfindleaf traverses the B-tree from a given internal node down to the leaf level, returning the page number of the leaf page that would contain the given key. It does not check whether the key actually exists — it finds where the key *would* live.
This exists to support rangescan, which needs to start iterating from a specific leaf page and then follow sibling pointers. Unlike search (which reads the leaf and returns a value), findleaf stops one step earlier: it returns the leaf's page number so the caller can manage the leaf-level scan itself.
Preconditions:
page_num must be a valid allocated page number pointing to a node at the given depth in the tree.depth >= 1. At depth 1, the page is a leaf. At depth > 1, the page is an internal node.d have children at depth d-1.key must be comparable with the keys stored in internal nodes (string-to-string ordering via bisect_right).Postcondition: Returns the page number of the leaf page whose key range includes key — specifically, the leaf reachable by following the same child-pointer path that _search would take.
Invariant: Each recursive call decrements depth by exactly 1, guaranteeing termination in depth - 1 recursive steps.
| Parameter | Type | Description |
|-----------|------|-------------|
| page_num | int | Page number of the current node being examined. On the initial call, this is the root page. |
| key | str | The search key. Used to choose which child pointer to follow at each internal level. |
| depth | int | Remaining depth from pagenum to the leaf level. 1 means pagenum is itself a leaf. |
Edge cases:
depth == 1 on a single-level tree (root is a leaf): returns page_num immediately with no I/O.key less than all internal keys: bisect_right returns 0, following children[0] (leftmost child).key greater than all internal keys: bisect_right returns len(ikeys), following the rightmost child.Returns an int — the page number of the leaf that would contain key. The caller (range_scan) then reads this page directly and begins scanning.
There is no "not found" return. The method always succeeds if the tree is well-formed.
1. Base case (depth == 1): The current page is a leaf. Return its page number immediately.
2. Recursive case (depth > 1):
pagenum from disk via self.pm.readpage.ikeys) and child page pointers (children). An internal node with *n* keys has *n+1* children.bisectright(ikeys, key) to find the insertion point. bisectright returns the index of the first key strictly greater than key, which is exactly the index of the correct child pointer. This mirrors the B-tree search invariant: child at index i covers keys in the range [ikeys[i-1], ikeys[i]).children[idx] with depth - 1.The choice of bisectright (not bisectleft) matters: for a key equal to a separator, bisectright directs to the right child, which is consistent with how insert and search route — the separator key belongs to the right subtree (visible in the leaf split logic where midkey = right_keys[0]).
depth - 1 reads total). These reads increment self.pm.pages_read.None. The method assumes:
page_num is valid (no bounds checking on the file).pagenum is actually an internal node when depth > 1 (no type check — deserialize_internal will silently misparse a leaf page).depth value could cause it to stop too early or read past leaves).A corrupted page or incorrect depth would produce silently wrong results rather than raising an exception.
Called exclusively by range_scan:
leaf_num = self._find_leaf(root, start_key, height)
The caller then reads that leaf, scans its keys, and follows next_sib pointers to walk through subsequent leaves. The caller is responsible for:
self.readmeta() to get the current root and height.startkey (since find_leaf returns the *containing* leaf, not a precise key match).self.pm.readpage(pagenum) — PageManager method for fixed-size page reads from the data file.deserializeinternal(data) — Module-level function that parses the binary page format into (keys, children).bisect_right from Python's bisect module — binary search for the insertion point in sorted keys.