Function: _iter_ in b-tree-storage-engine/btree.py

Date: 2026-05-29

Time: 07:43

BTree._iter_ — Full In-Order Traversal via Leaf Chain

1. Purpose

_iter_ makes BTree instances iterable, letting you write for key, val in tree: ... or list(tree). It yields every (key, value) pair stored in the tree in sorted key order. This is the full-table-scan equivalent for the B-tree — useful for serialization, debugging, or bulk reads where you need every record.

2. Contract

3. Parameters

None beyond self. This is a generator method invoked implicitly by Python's iteration protocol.

4. Return Value

A generator yielding (str, bytes) tuples — the key (decoded UTF-8 string) and value (raw bytes) for every entry. When the tree is empty, the generator yields nothing. The caller receives pairs in sorted order and can stop early (the generator is lazy).

5. Algorithm

Two phases:

Phase 1 — Descend to the leftmost leaf:


page_num = root
for each internal level (height - 1 times):
    read the internal page
    follow children[0]   ← always the leftmost child pointer

This walks the left spine of the tree. If height == 1, the root is already a leaf and this loop is skipped entirely.

Phase 2 — Scan the leaf chain:


while page_num != NO_SIBLING (0xFFFFFFFF):
    deserialize the leaf at page_num
    yield each (key, value) pair
    advance to next_sibling

Each leaf stores a nextsibling pointer set during splits (see insert). The chain terminates when nextsibling == NOSIBLING. This is a classic B+-tree leaf scan — no backtracking up the tree is needed.

6. Side Effects

7. Error Handling

None. The method assumes all pages are readable and well-formed. A corrupted page, a truncated file, or an invalid sibling pointer would raise a struct.unpack error or silently produce garbage. There's no validation of page types during traversal — it trusts that the left-spine pages are internal and the bottom level is leaves.

8. Usage Patterns


# Dump entire tree
for key, value in tree:
    print(key, value)

# Materialize to list
all_pairs = list(tree)

# Check if tree contains a value (linear scan — prefer tree.get() for single lookups)
found = any(v == target for _, v in tree)

Caller obligation: Do not mutate the tree during iteration. The generator holds a page_num cursor into the leaf chain; inserts that trigger splits could corrupt the chain mid-scan. There is no snapshot isolation or MVCC here.

9. Dependencies

| Dependency | Role |

|---|---|

| self.readmeta() | Gets root page number and tree height |

| self.pm.read_page(n) | Raw page I/O from PageManager |

| deserializeinternal(data) | Parses internal node to extract children[0] |

| deserializeleaf(data) | Parses leaf node to extract keys, values, and next sibling pointer |

| NO_SIBLING (0xFFFFFFFF) | Sentinel marking the end of the leaf chain |

Unenforceable Assumptions