Date: 2026-05-28
Time: 19:11
b-tree-storage-engine/testertestbtree.pyThis is the standalone test harness for the B-Tree storage engine implementation. It validates the BTree class against the behavioral contract described in DDIA Chapter 3 — a disk-backed, page-oriented sorted index supporting insert, lookup, update, delete, range scans, and crash recovery.
The naming convention testertest*.py appears throughout the repo (every project directory has one). These are the "tester" variants — self-contained test scripts that run via python testertestbtree.py without pytest, printing PASSED/FAILED to stdout. The sibling test_btree.py likely uses pytest conventions. This dual-harness pattern lets the project validate implementations both in CI (pytest) and interactively (direct execution).
Eight test functions, each targeting a distinct property of the B-Tree:
| Function | What it validates |
|---|---|
| testbasicput_get | CRUD lifecycle: insert, lookup, update, delete, membership (in), min/max, len() |
| testrangescan | Bounded range (k05..k10), unbounded range (k18..end), full sorted iteration |
| test_persistence | Close-and-reopen preserves data, updates, and deletes (crash recovery) |
| testlargedataset | 1000-key correctness — all lookups match, iteration is sorted, tree is balanced |
| testpageio_counts | Lookup cost is O(height) — reads at most height + 1 pages per get() |
| testvaluetoo_large | Rejects key-value pairs that exceed page size with ValueError |
| testemptytree | All operations are safe on an empty tree (no crashes, sane defaults) |
| testdeleteall_keys | Deleting every key returns the tree to an empty-but-usable state |
Temp-directory isolation. Every test uses tempfile.TemporaryDirectory() as a context manager, giving the B-Tree a fresh directory for its page files. This means tests are fully independent — no shared state, no cleanup logic, safe to run in any order.
Small branching factor. All tests use maxkeysper_page=4, which forces splits early. With a realistic branching factor (hundreds of keys per page), you'd need millions of entries to exercise multi-level trees. With 4, a 5th insert triggers the first split, making structural behavior (height growth, page splits, rebalancing) observable at small scale.
Print-based reporting. Each function prints its own PASSED message. The _main_ block runs all tests sequentially and prints a final summary. No test framework dependency — if any assertion fails, the script crashes with a traceback at that point.
Contract-style assertions. Tests assert on the public API contract (get, put, delete, rangescan, minkey, maxkey, len, iter, contains) and on observable internals (tree.stats.height, tree.pm.pagesread). The stats/page-manager exposure is deliberate — it lets tests verify performance properties, not just correctness.
Imports:
tempfile — stdlib, for isolated test directoriesbtree.BTree — the implementation under testImported by: Nothing. This is a leaf node — a test runner entry point.
Implicit dependency on BTree API surface:
BTree(path, maxkeysperpage=, pagesize=) constructor.put(key, value), .get(key), .delete(key) — CRUD.range_scan(start, end=None) — returns list of (key, value) tuples.minkey(), .maxkey() — boundary accessors.close() — flush and release resources.stats — object with .height, .total_pages.pm — page manager with .pagesread, .resetcounters()_len, iter, contains_ — dunder protocol supportWhen run as python testertestbtree.py:
1. Each test function is called sequentially in definition order.
2. Each creates a fresh BTree in a temp directory, exercises the API, asserts invariants, calls .close(), and prints PASSED.
3. If any assertion fails, Python raises AssertionError with a diagnostic message and the script halts — no further tests run.
4. If all eight pass, the final "All tests PASSED" line prints.
There's no setup/teardown framework — the with tempfile.TemporaryDirectory() block handles both. The test_persistence function is the only one that exercises the close-reopen cycle by creating a second BTree instance on the same directory.
testrangescan and testlargedataset both assert allkeys == sorted(allkeys) — the B-Tree must maintain key ordering across all leaves.testbasicput_get inserts "apple" twice and asserts len(tree) == 10 — put on an existing key is an update, not an insert.maxkeysper_page=4, height must be 2.testpageiocounts asserts pagesread <= height + 1 after a single get().max_keys=4, height must be >= 2 (in practice it will be ~5-6 for a balanced tree with branching factor 5).None, 0, False, or [] — never raise.Minimal, by design:
testvaluetoo_large is the only test that validates error production — it expects BTree.put() to raise ValueError when a key-value pair won't fit in a single page. It uses a bare try/except ValueError with a manual assert False fallthrough instead of pytest's raises().f"height={tree.stats.height}", f"got {keys}") include diagnostic context to aid debugging.b-tree-storage-engine/btree.py — The implementation under test; understanding the page split algorithm, page manager, and stats tracking will contextualize every assertion in this fileb-tree-storage-engine/btree.py:range_scan — How leaf-level sibling pointers enable efficient range queries across page boundariesb-tree-storage-engine/test_btree.py — The pytest-based sibling test suite; compare coverage and see if it tests additional edge cases (concurrent access, corruption)tester-test-pattern — The testertest*.py convention appears in ~25 project directories; understanding its role vs. test_*.py clarifies the project's testing strategyb-tree-storage-engine/fix-plan.md — Suggests there were known issues with this implementation; reading it reveals what broke and how it was fixedbtree-max-keys-forces-splits — Setting maxkeysper_page=4 causes the first page split after the 5th insert, raising tree height to 2btree-put-is-upsert — BTree.put() on an existing key updates the value in-place without increasing len(tree)btree-get-reads-at-most-height-plus-one-pages — A single get() reads at most height + 1 pages from disk, verified via pm.pages_readbtree-range-scan-excludes-end — range_scan("k05", "k10") returns keys k05 through k09 — the end bound is exclusivebtree-survives-close-reopen — Data written before close() is fully recoverable by constructing a new BTree on the same directory