Date: 2026-05-29
Time: 06:39
WriteAheadLog.truncate_beforeThis is a log compaction method. It discards WAL entries older than a given LSN (Log Sequence Number), reclaiming memory once those entries are no longer needed — typically after all consumers (derived systems) have processed past that point. This is the WAL equivalent of a checkpoint: once you know every downstream system has consumed through LSN *n*, entries before *n* are dead weight.
lsn should be a valid LSN within the log's range (though no enforcement exists — see below).self._entries contains only entries with lsn >= lsn. The in-memory list is replaced, not mutated in place.nextlsn is unaffected — the LSN counter never rewinds.| Parameter | Type | Meaning |
|-----------|------|---------|
| lsn | int | The cutoff. Entries with lsn strictly less than this value are discarded. Entries equal to or greater are kept. |
Edge cases:
lsn = 0 or lsn <= earliest_lsn: No entries are removed; returns 0.lsn > latestlsn: All entries are removed. The WAL becomes empty, but next_lsn still points past the last assigned LSN, so future appends continue with correct numbering.Returns an int — the count of entries removed. The caller can use this for logging/metrics but has no obligation to act on it.
1. Snapshot the current entry count (before).
2. Rebuild _entries via list comprehension, keeping only entries where e.lsn >= lsn.
3. Return the difference: before - len(self._entries).
This is O(n) in the number of entries — it scans the entire list. Since entries are ordered by LSN (they're appended sequentially), a bisect + slice would be O(log n + k) where k is the number retained, but the simplicity here is appropriate for a reference implementation.
self._entries: The old list is replaced (not mutated), so any external reference to the old list object would still see the untruncated data. In practice, nothing holds such a reference.persist_path, the on-disk file still contains the truncated entries. On next load, those entries will reappear. This is a memory-only truncation — a significant assumption that callers should be aware of.None. No exceptions are raised. Passing a negative LSN or a non-integer would either silently do nothing (negative) or raise a TypeError from the comparison operator — but neither case is guarded.
Typical usage is checkpoint-driven truncation:
# Find the lowest position across all consumers
min_position = min(sys.position for sys in derived_systems)
# Safe to discard everything consumers have already processed
removed = wal.truncate_before(min_position)
The caller is responsible for ensuring no consumer still needs entries below the cutoff. Truncating too aggressively means a consumer that falls behind cannot catch up — it would need a full rebuild from the storage engine instead of incremental replay.
self._entries (a list[WALEntry]) and the lsn attribute on each entry.append() guarantees.