File: write-ahead-log/tester_test_wal.py

Date: 2026-05-28

Time: 18:47

Purpose

This is a standalone integration test suite for the Write-Ahead Log (WAL) implementation. The tester_ prefix distinguishes it from test_wal.py: this file is designed to run directly via python tester_test_wal.py (note the __main__ block), not through pytest. It validates the WAL's durability guarantees: writes survive crashes, corruption is detected, and sequence numbers remain monotonic across restarts. It serves as the behavioral contract for wal.py, encoding the invariants a correct WAL must satisfy.

Key Components

Each test function exercises one WAL guarantee:

| Function | Guarantee tested |
|----------|------------------|
| test_basic_append_and_replay | Single writes, batch writes, checkpoint, and replay all round-trip correctly. Sequence numbers are assigned contiguously starting at 1. |
| test_crash_recovery | Opening a new WriteAheadLog on the same directory without closing the first recovers all fsynced records. Sequence numbering resumes correctly. |
| test_corruption_stops_replay | Binary corruption in the WAL file causes replay to stop at the corruption boundary; earlier valid records are still returned. |
| test_truncation | truncate(n) removes all records with seq_num <= n. Only records after the truncation point survive. |
| test_log_rotation | When max_file_size is exceeded, the WAL creates multiple .wal files. Replay spans all segments transparently. |
| test_iterate_includes_commit | iterate() yields raw records including internal COMMIT markers, unlike replay(), which filters them. |
| test_empty_wal | A fresh WAL has sequence number 0 and replays nothing. |
| test_large_values | 100 KB values survive serialization and deserialization intact. |
| test_checkpoint_replay_after | replay(after_seq=cp) returns only records written after the checkpoint sequence number. |
| test_sequence_monotonic_across_restart | After a clean close and reopen, the next sequence number continues from where the previous instance left off. |

Patterns

Temp-directory isolation. Every test uses tempfile.TemporaryDirectory() as a context manager, so WAL files are cleaned up automatically and tests can't interfere with each other.
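
A minimal sketch of the isolation pattern, using only the standard library. The helper names run_isolated and write_dummy_segment and the file name 000001.wal are hypothetical stand-ins for real WAL activity; they are not taken from the test file.

```python
import os
import tempfile


def run_isolated(test_body):
    """Run test_body(dir_path) inside a fresh temporary directory."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        test_body(tmp_dir)
        used_dir = tmp_dir
    # Once the with-block exits, the directory and its contents are deleted.
    return used_dir, os.path.exists(used_dir)


def write_dummy_segment(dir_path):
    # Hypothetical stand-in for WAL activity: create one segment-like file.
    with open(os.path.join(dir_path, "000001.wal"), "wb") as f:
        f.write(b"payload")


path_a, exists_a = run_isolated(write_dummy_segment)
path_b, exists_b = run_isolated(write_dummy_segment)
```

Each call gets its own directory, and nothing is left behind once the call returns, which is why one test cannot see another test's segments.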

Crash simulation by abandonment. test_crash_recovery simulates a crash by simply not calling wal.close() and opening a second WriteAheadLog on the same directory. This works because sync_mode="sync" forces fsync on each write, so the data is durable on disk even without a graceful shutdown.
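
The abandonment idea can be illustrated without the real WAL. The sketch below uses a hypothetical ToyLog class (one newline-terminated record per line, fsync on every append) as a stand-in for WriteAheadLog; the actual on-disk format and API of wal.py are not shown in this document and may differ.

```python
import os
import tempfile


class ToyLog:
    """Hypothetical stand-in for a WAL: one line per record, fsync per append."""

    def __init__(self, dir_path):
        self.path = os.path.join(dir_path, "toy.log")
        self._file = open(self.path, "ab")

    def append(self, payload: bytes):
        # Payloads must not contain newlines in this toy format.
        self._file.write(payload + b"\n")
        self._file.flush()             # push Python's buffer to the OS
        os.fsync(self._file.fileno())  # ask the OS to push it to disk

    def close(self):
        self._file.close()


def read_records(dir_path):
    with open(os.path.join(dir_path, "toy.log"), "rb") as f:
        return f.read().splitlines()


with tempfile.TemporaryDirectory() as tmp_dir:
    first = ToyLog(tmp_dir)
    first.append(b"a")
    first.append(b"b")
    # "Crash": first.close() is never called before the second instance opens.
    second = ToyLog(tmp_dir)
    recovered = read_records(tmp_dir)
    second.append(b"c")
    after_restart = read_records(tmp_dir)
    # Release the handles only so the temp directory can be removed everywhere.
    first.close()
    second.close()
```

Because every append is flushed and fsynced before it returns, the second instance sees both earlier records even though the first was never shut down cleanly.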

Direct binary corruption. test_corruption_stops_replay opens the .wal file in binary mode and overwrites the last 5 bytes with 0xFF. This is a targeted corruption that damages the final record while leaving the first record intact, testing the WAL's CRC/checksum validation.
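
To make the corruption technique concrete, here is a sketch under an assumed record layout: a 4-byte length, a 4-byte CRC32, then the payload. That layout, the file name toy.wal, and the functions write_records, corrupt_tail, and replay are illustrative assumptions; the real format used by wal.py is not described here.

```python
import os
import struct
import tempfile
import zlib

# Assumed header layout: payload length, then CRC32 of the payload.
HEADER = struct.Struct("<II")


def write_records(path, payloads):
    with open(path, "wb") as f:
        for payload in payloads:
            f.write(HEADER.pack(len(payload), zlib.crc32(payload)))
            f.write(payload)


def corrupt_tail(path, n_bytes=5):
    """Overwrite the last n_bytes of the file with 0xFF."""
    with open(path, "r+b") as f:
        f.seek(-n_bytes, os.SEEK_END)
        f.write(b"\xff" * n_bytes)


def replay(path):
    """Return payloads in order, stopping at the first record that fails its check."""
    with open(path, "rb") as f:
        data = f.read()
    records = []
    offset = 0
    while offset + HEADER.size <= len(data):
        length, crc = HEADER.unpack_from(data, offset)
        start = offset + HEADER.size
        payload = data[start:start + length]
        if len(payload) != length or zlib.crc32(payload) != crc:
            break  # corruption boundary: keep what was valid so far
        records.append(payload)
        offset = start + length
    return records


with tempfile.TemporaryDirectory() as tmp_dir:
    wal_path = os.path.join(tmp_dir, "toy.wal")
    write_records(wal_path, [b"first-record", b"second-record"])
    before = replay(wal_path)
    corrupt_tail(wal_path)
    after = replay(wal_path)
```

The overwrite lands entirely inside the second payload, so its stored checksum no longer matches and replay stops there while the first record is still returned.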

Print-on-pass reporting. Each test prints its own pass message. The __main__ block runs all tests sequentially and prints ALL TESTS PASSED at the end. It has no test framework dependencies; it is a self-contained harness.
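
A harness of this shape might look like the sketch below. The two placeholder tests, the run_all helper, and the "PASS: ..." message format are assumptions for illustration; only the final ALL TESTS PASSED line comes from the description above.

```python
def test_addition():
    assert 1 + 1 == 2
    print("PASS: test_addition")  # each test reports its own success


def test_string_upper():
    assert "wal".upper() == "WAL"
    print("PASS: test_string_upper")


def run_all(tests):
    """Run tests in order; the first failed assertion aborts the whole run."""
    for test in tests:
        test()
    print("ALL TESTS PASSED")
    return len(tests)


if __name__ == "__main__":
    run_all([test_addition, test_string_upper])
```

The trade-off is simplicity over reporting: there is no collection, no per-test isolation of failures, and no summary beyond the printed lines.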

Dependencies

Imports:

Imported by: Nothing directly. Run as a script or potentially invoked by a CI harness that discovers tester_*.py files.

Flow

When run as __main__:

1. Each test creates an isolated temp directory

2. Instantiates WriteAheadLog(dir, sync_mode="sync") — always with synchronous fsync

3. Performs operations (append, batch, checkpoint, truncate)

4. Asserts invariants on returned sequence numbers and replayed records

5. Closes the WAL and lets the temp directory clean up

6. Prints a pass message for the test; a failed assertion raises AssertionError instead

The corruption test has an extra step: it closes the WAL, locates the .wal file via glob, writes garbage bytes at the end, then reopens the WAL and checks that replay degrades gracefully.

Invariants

These tests collectively enforce:

1. Contiguous sequence numbering: append returns 1, 2, 3, ... with no gaps. Batch operations consume one sequence number per operation in the batch.

2. Checkpoint consumes a sequence number: after 6 data records (3 individual + 3 batch), current_seq_num() is 7, presumably because the batch's COMMIT marker takes a number of its own, and checkpoint() returns 8.

3. Crash durability: with sync_mode="sync", records survive process death without close().

4. Corruption isolation: corrupted data stops replay but does not discard valid preceding records.

5. Truncation boundary is inclusive: truncate(2) removes records with seq <= 2 and keeps seq 3 and above.

6. Monotonic sequences across restarts: sequence numbers never reset, even after close/reopen.

7. replay() filters COMMIT records: it returns only data operations, while iterate() includes everything.

8. replay(after_seq=n) is exclusive: it returns records with seq > n.
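
The numbering rules above can be checked against a small in-memory model. ModelWAL below is hypothetical: it is not wal.py, it writes no files, and the method name append_batch is invented. It also assumes that a batch's COMMIT marker consumes a sequence number, which is one way to reconcile six data records with a current sequence number of 7.

```python
class ModelWAL:
    """In-memory model of the numbering rules only; no files, no durability."""

    def __init__(self):
        self._records = []  # (seq, kind, payload) tuples in write order
        self._seq = 0

    def _add(self, kind, payload=None):
        self._seq += 1
        self._records.append((self._seq, kind, payload))
        return self._seq

    def append(self, payload):
        return self._add("DATA", payload)

    def append_batch(self, payloads):
        seqs = [self._add("DATA", p) for p in payloads]
        self._add("COMMIT")  # assumption: the marker takes its own number
        return seqs

    def checkpoint(self):
        return self._add("CHECKPOINT")

    def current_seq_num(self):
        return self._seq

    def truncate(self, n):
        # Drop every record with seq <= n.
        self._records = [r for r in self._records if r[0] > n]

    def iterate(self):
        return list(self._records)  # raw view, markers included

    def replay(self, after_seq=0):
        # Data records only, strictly after after_seq.
        return [(s, p) for s, k, p in self._records
                if k == "DATA" and s > after_seq]


wal = ModelWAL()
single = [wal.append(v) for v in ("a", "b", "c")]
batch = wal.append_batch(["d", "e", "f"])
seq_before_cp = wal.current_seq_num()
cp = wal.checkpoint()
late = wal.append("g")
after_cp = wal.replay(after_seq=cp)
kinds = [k for _, k, _ in wal.iterate()]
wal.truncate(2)
survivors = [s for s, _ in wal.replay()]
```

Under these assumptions the model reproduces the figures quoted in the list: three single appends get 1 to 3, the batch gets 4 to 6, the current sequence number is then 7, and the checkpoint returns 8.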

Error Handling

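The tests themselves don't handle errors: assertion failures propagate as AssertionError and halt execution, since there is no test runner to catch them. What they verify is the WAL's own error handling, such as stopping replay at a corrupted record while still returning the valid records before it.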

Topics to Explore

Beliefs