File: b-tree-storage-engine/testertestbtree.py

Date: 2026-05-28

Time: 19:11

b-tree-storage-engine/testertestbtree.py

Purpose

This is the standalone test harness for the B-Tree storage engine implementation. It validates the BTree class against the behavioral contract described in DDIA Chapter 3 — a disk-backed, page-oriented sorted index supporting insert, lookup, update, delete, range scans, and crash recovery.

The naming convention testertest*.py appears throughout the repo (every project directory has one). These are the "tester" variants — self-contained test scripts that run via python testertestbtree.py without pytest, printing PASSED/FAILED to stdout. The sibling test_btree.py likely uses pytest conventions. This dual-harness pattern lets the project validate implementations both in CI (pytest) and interactively (direct execution).

Key Components

Eight test functions, each targeting a distinct property of the B-Tree:

| Function | What it validates |

|---|---|

| testbasicput_get | CRUD lifecycle: insert, lookup, update, delete, membership (in), min/max, len() |

| testrangescan | Bounded range (k05..k10), unbounded range (k18..end), full sorted iteration |

| test_persistence | Close-and-reopen preserves data, updates, and deletes (crash recovery) |

| testlargedataset | 1000-key correctness — all lookups match, iteration is sorted, tree is balanced |

| testpageio_counts | Lookup cost is O(height) — reads at most height + 1 pages per get() |

| testvaluetoo_large | Rejects key-value pairs that exceed page size with ValueError |

| testemptytree | All operations are safe on an empty tree (no crashes, sane defaults) |

| testdeleteall_keys | Deleting every key returns the tree to an empty-but-usable state |

Patterns

Temp-directory isolation. Every test uses tempfile.TemporaryDirectory() as a context manager, giving the B-Tree a fresh directory for its page files. This means tests are fully independent — no shared state, no cleanup logic, safe to run in any order.

Small branching factor. All tests use maxkeysper_page=4, which forces splits early. With a realistic branching factor (hundreds of keys per page), you'd need millions of entries to exercise multi-level trees. With 4, a 5th insert triggers the first split, making structural behavior (height growth, page splits, rebalancing) observable at small scale.

Print-based reporting. Each function prints its own PASSED message. The _main_ block runs all tests sequentially and prints a final summary. No test framework dependency — if any assertion fails, the script crashes with a traceback at that point.

Contract-style assertions. Tests assert on the public API contract (get, put, delete, rangescan, minkey, maxkey, len, iter, contains) and on observable internals (tree.stats.height, tree.pm.pagesread). The stats/page-manager exposure is deliberate — it lets tests verify performance properties, not just correctness.

Dependencies

Imports:

Imported by: Nothing. This is a leaf node — a test runner entry point.

Implicit dependency on BTree API surface:

Flow

When run as python testertestbtree.py:

1. Each test function is called sequentially in definition order.

2. Each creates a fresh BTree in a temp directory, exercises the API, asserts invariants, calls .close(), and prints PASSED.

3. If any assertion fails, Python raises AssertionError with a diagnostic message and the script halts — no further tests run.

4. If all eight pass, the final "All tests PASSED" line prints.

There's no setup/teardown framework — the with tempfile.TemporaryDirectory() block handles both. The test_persistence function is the only one that exercises the close-reopen cycle by creating a second BTree instance on the same directory.

Invariants

Error Handling

Minimal, by design:

Topics to Explore

Beliefs