Function: truncate in log-structured-merge-tree/lsm.py

Date: 2026-05-29

Time: 08:03

WAL.truncate — Write-Ahead Log Reset

Purpose

Clears the WAL file so it contains zero bytes, then re-opens it in append mode for future writes. This is called after the memtable has been successfully flushed to an SSTable on disk — at that point the WAL's recovery data is no longer needed, because the data it protected is now durable in the SSTable.

Without truncation, the WAL would replay stale entries on the next startup, re-inserting data that's already persisted in SSTables.

Contract

Parameters

None (operates on instance state).

Return Value

None.

Algorithm

1. Close the current handle: self._fd.close(). Flushes any data still buffered in the file object and releases the file descriptor.

2. Open in write-binary mode ("wb") — This is the key step. Opening with "wb" truncates the file to zero length per POSIX/Python semantics. The file now exists but is empty.

3. Close the write handle — The truncation is committed to the filesystem.

4. Re-open in append-binary mode ("ab") — Restores the handle to append mode so subsequent append() calls work correctly. Append mode guarantees all writes go to the end of file, which is critical for WAL correctness.

The two-open dance ("wb" then "ab") exists because opening with "wb" truncates the file but leaves the handle in plain write mode, where each write lands at the handle's current position rather than always at end of file. That is not a safe mode for a WAL. The second open in "ab" restores the invariant that all future writes append.
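
The four steps above can be sketched as follows. This is a minimal illustration, not the code in lsm.py: the class name SketchWAL, the _path attribute, the body of append(), and the demo() helper are all assumptions made for the example. Only self._fd and the "wb"/"ab" sequence come from the description above.

```python
import os
import tempfile


class SketchWAL:
    """Illustrative stand-in for the WAL class; not the code in lsm.py."""

    def __init__(self, path):
        self._path = path            # assumed attribute name
        self._fd = open(path, "ab")  # append-binary handle

    def append(self, record: bytes) -> None:
        # Hypothetical append(): write the record and push it to the OS.
        self._fd.write(record)
        self._fd.flush()

    def truncate(self) -> None:
        self._fd.close()                   # step 1: close the current handle
        self._fd = open(self._path, "wb")  # step 2: "wb" truncates to zero length
        self._fd.close()                   # step 3: close the write handle
        self._fd = open(self._path, "ab")  # step 4: back to append mode


def demo():
    """Return the file size right after truncate and the final contents."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "wal.log")
        wal = SketchWAL(path)
        wal.append(b"put k1 v1\n")
        wal.truncate()
        size_after_truncate = os.path.getsize(path)
        wal.append(b"put k2 v2\n")  # lands in the now-empty file
        wal._fd.close()
        with open(path, "rb") as f:
            contents = f.read()
    return size_after_truncate, contents
```

In this sketch, the record written before truncate() is gone afterwards, and the record appended after it is the only content left in the file.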

Side Effects

Truncates the WAL file on disk to zero bytes. Closes the previous handle and replaces self._fd with a new handle opened in append mode.

Error Handling

None. If any open() or close() call raises an OSError (disk full, permission denied, file deleted), the exception propagates uncaught. This can leave the object in a partially broken state — e.g., if the second open fails, self._fd holds a closed handle from step 3, and subsequent append() calls will raise ValueError: I/O operation on closed file.
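
The partially broken state described above can be reproduced with a small experiment. The sketch below is an assumption-laden illustration: SketchWAL, _path, flaky_open, and demo_failed_reopen are hypothetical names, and the failure is simulated by patching builtins.open so that the "ab" re-open raises OSError.

```python
import os
import tempfile
from unittest import mock


class SketchWAL:
    """Illustrative stand-in for the WAL class; not the code in lsm.py."""

    def __init__(self, path):
        self._path = path            # assumed attribute name
        self._fd = open(path, "ab")

    def truncate(self) -> None:
        self._fd.close()
        self._fd = open(self._path, "wb")
        self._fd.close()
        self._fd = open(self._path, "ab")  # may raise; nothing catches it


def demo_failed_reopen():
    """Simulate a failing second open and inspect the object afterwards."""
    real_open = open

    def flaky_open(path, mode="r", *args, **kwargs):
        # Hypothetical fault injection: only the "ab" re-open fails.
        if mode == "ab":
            raise OSError("simulated failure on re-open")
        return real_open(path, mode, *args, **kwargs)

    with tempfile.TemporaryDirectory() as tmp:
        wal = SketchWAL(os.path.join(tmp, "wal.log"))
        with mock.patch("builtins.open", flaky_open):
            try:
                wal.truncate()
            except OSError:
                pass  # the exception propagated out of truncate()
        handle_closed = wal._fd.closed  # handle left over from step 3
        try:
            wal._fd.write(b"put k1 v1\n")
            write_raised = False
        except ValueError:
            write_raised = True
    return handle_closed, write_raised
```

Under these assumptions, self._fd is left pointing at the closed "wb" handle, and the next write raises ValueError.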

Usage Patterns

Called exactly once per flush cycle, from LSMTree._flush():


def _flush(self):
    # ... write memtable to SSTable ...
    self._wal.truncate()    # SSTable is durable; WAL data is now redundant

The caller must ensure the SSTable write completed successfully before calling truncate. If truncate runs before the SSTable is durable, a crash would lose data — the WAL is gone and the SSTable never landed. The current code does this correctly: SSTable.write() completes (including closing the file) before truncate() is called.
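
The ordering requirement can be illustrated with a standalone sketch. It does not reproduce LSMTree._flush() or SSTable.write(): flush_memtable, the tab-separated record layout, and the file names are hypothetical. The os.fsync call is an extra precaution assumed for the example, since the source only states that the SSTable file is closed before truncation.

```python
import os
import tempfile


def flush_memtable(memtable: dict, sstable_path: str, wal_path: str) -> None:
    """Hypothetical flush: persist the SSTable first, then clear the WAL."""
    # 1. Write and close the SSTable. Any OSError here leaves the WAL intact.
    with open(sstable_path, "wb") as sst:
        for key in sorted(memtable):
            sst.write(key + b"\t" + memtable[key] + b"\n")
        sst.flush()
        os.fsync(sst.fileno())  # assumption: not stated in the source
    # 2. Only after the SSTable is on disk, truncate the WAL to zero bytes.
    open(wal_path, "wb").close()


def demo():
    """Return WAL size after a failed flush, after a good flush, and SSTable size."""
    with tempfile.TemporaryDirectory() as tmp:
        wal_path = os.path.join(tmp, "wal.log")
        with open(wal_path, "wb") as wal:
            wal.write(b"put k1 v1\n")
        # A path inside a missing directory makes the SSTable write fail.
        bad_path = os.path.join(tmp, "missing_dir", "000001.sst")
        try:
            flush_memtable({b"k1": b"v1"}, bad_path, wal_path)
        except OSError:
            pass
        wal_after_failed_flush = os.path.getsize(wal_path)
        good_path = os.path.join(tmp, "000001.sst")
        flush_memtable({b"k1": b"v1"}, good_path, wal_path)
        wal_after_good_flush = os.path.getsize(wal_path)
        sstable_size = os.path.getsize(good_path)
    return wal_after_failed_flush, wal_after_good_flush, sstable_size
```

Because truncation is the last statement, a failed SSTable write in this sketch leaves the WAL contents available for recovery.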

Dependencies

Assumptions Not Enforced by Types

Beliefs