Date: 2026-05-28
Time: 18:52
WAL.commit: Finalize a WAL transaction by clearing the log

commit marks the end of a write-ahead log transaction. After all dirty pages have been flushed to the data file, this method clears the WAL, signaling that the writes are durable and no recovery is needed on restart. It implements the "checkpoint" step of WAL-based crash safety: once the data file is known-good, the log entries that got it there are no longer needed.

When commit is called, all page writes recorded via log_write have already been applied to the PageManager (the data file contains the correct page images). The caller is responsible for ensuring logical consistency of the written pages before calling commit.

When commit returns, the WAL file is empty and fsync'd, so a crash after commit returns will not trigger any replay. The sequence counter is reset to 0, ready for the next batch of writes.

Between the log_write calls and commit, the WAL contains a recoverable record of all in-flight page mutations. After commit, the WAL contains nothing.

| Parameter | Type | Description |
|-----------|------|-------------|
| page_manager | PageManager | The page I/O layer whose data file must be synced before the WAL is cleared. |

Returns None. This is a side-effect-only method.

1. page_manager.sync(): Flush the data file's buffered writes and fsync it. This guarantees every page written via PageManager.writepage is durable on disk, not just in kernel buffers. This step *must* happen before the WAL is cleared. Otherwise a crash could leave the data file incomplete and the WAL empty at the same time, causing data loss.
2. self.f.seek(0) + self.f.truncate(0): Erase the WAL file contents entirely. seek(0) positions the file cursor at the start; truncate(0) discards all bytes. After this, the WAL is logically empty.
3. self.f.flush() + os.fsync(self.f.fileno()): Force the truncated (empty) WAL state to durable storage. Without this, the old WAL content could still be what is on disk after a crash, and the "resurrected" stale log entries would be replayed against already-committed data, corrupting it via double-application.
4. self._seq = 0: Reset the monotonic sequence counter so the next transaction's log entries start from 1.
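
The four steps above can be sketched as follows. This is a minimal illustration, not the project's actual source: the PageManager class here is a hypothetical stand-in that only provides sync(), and the file names and the commit_demo helper are assumptions made for the example.

```python
import os
import tempfile


class PageManager:
    """Hypothetical stand-in for the page I/O layer; only sync() matters here."""

    def __init__(self, path):
        self.f = open(path, "w+b")

    def sync(self):
        # Push Python's buffer to the OS, then ask the OS to persist it.
        self.f.flush()
        os.fsync(self.f.fileno())


class WAL:
    def __init__(self, path):
        self.f = open(path, "w+b")
        self._seq = 0

    def commit(self, page_manager):
        # 1. The data file must be durable before the log is discarded.
        page_manager.sync()
        # 2. Discard every log entry.
        self.f.seek(0)
        self.f.truncate(0)
        # 3. Make the empty log durable as well.
        self.f.flush()
        os.fsync(self.f.fileno())
        # 4. The next transaction starts numbering afresh.
        self._seq = 0


def commit_demo():
    """Write a fake log entry, commit, and report (WAL size, sequence counter)."""
    with tempfile.TemporaryDirectory() as d:
        wal_path = os.path.join(d, "btree.wal")
        pm = PageManager(os.path.join(d, "btree.dat"))
        wal = WAL(wal_path)
        wal.f.write(b"pretend log entry")
        wal._seq = 3
        wal.commit(pm)
        size = os.path.getsize(wal_path)
        pm.f.close()
        wal.f.close()
        return size, wal._seq
```

Calling commit_demo() exercises the full sequence against temporary files; after it, the WAL file is zero bytes long and the counter is back at 0.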

page_manager.sync() triggers flush + fsync on btree.dat.
The WAL file (btree.wal) is truncated to zero bytes and that truncation is fsync'd.
self._seq is reset.

No exceptions are caught. If sync(), truncate(), or fsync() fails (e.g., disk full, I/O error), the exception propagates to the caller. This is the correct behavior: a failed commit should not silently succeed, since the WAL may still be needed for recovery.

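
To illustrate the propagation behavior, here is a small sketch in which the first step fails. FailingPageManager is a hypothetical test double, and the WAL class is a reduced copy of the steps described above rather than the real implementation. It only covers a failure in page_manager.sync(); a failure in a later step could leave the WAL in a different state.

```python
import errno
import os
import tempfile


class FailingPageManager:
    """Hypothetical page manager whose sync() always fails."""

    def sync(self):
        raise OSError(errno.EIO, "simulated I/O error during sync")


class WAL:
    def __init__(self, f):
        self.f = f
        self._seq = 0

    def commit(self, page_manager):
        # No try/except: any failure surfaces to the caller.
        page_manager.sync()
        self.f.seek(0)
        self.f.truncate(0)
        self.f.flush()
        os.fsync(self.f.fileno())
        self._seq = 0


def commit_with_failing_sync():
    """Return (errno, WAL contents, sequence counter) after a failed commit."""
    with tempfile.TemporaryFile() as f:
        wal = WAL(f)
        f.write(b"entry-1")
        f.flush()
        wal._seq = 1
        try:
            wal.commit(FailingPageManager())
        except OSError as exc:
            # sync() raised before truncation, so the log is untouched.
            f.seek(0)
            return exc.errno, f.read(), wal._seq
        return None
```

Because the exception is raised before the truncation, the log entry and the sequence counter are both still in place, which is what recovery would need.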
commit is called at the end of every mutating B-tree operation (put, delete, close):

    # In BTree.put, after all page writes are done:
    self.wal.commit(self.pm)

The pattern is always: (1) log page writes via walwritepage / walwritemeta, (2) call commit to finalize. The caller must not interleave unrelated writes between log_write and commit, since commit clears *all* log entries.
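
The reason interleaving is dangerous can be seen in a toy model. ToyWAL below is an in-memory stand-in; the log_write signature and the (seq, page_id, data) entry layout are assumptions made for illustration and are not taken from the real WAL.

```python
class ToyWAL:
    """In-memory stand-in for the WAL; entry layout is an assumption."""

    def __init__(self):
        self.entries = []
        self._seq = 0

    def log_write(self, page_id, data):
        # Sequence numbers start at 1 because _seq is 0 after a commit.
        self._seq += 1
        self.entries.append((self._seq, page_id, data))

    def commit(self):
        # Unconditional: every entry goes, whoever logged it.
        self.entries.clear()
        self._seq = 0


wal = ToyWAL()
wal.log_write(7, b"leaf image")      # belongs to the current operation
wal.log_write(99, b"unrelated")      # interleaved write from elsewhere
pending_before = [(seq, page_id) for seq, page_id, _ in wal.entries]
wal.commit()
pending_after = list(wal.entries)
```

Both entries are pending before the commit and both are gone after it, so the unrelated write loses its log record even if its own page was never made durable.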

PageManager.sync(): relied on to make the data file durable before the WAL is cleared.
os.fsync: POSIX system call for forcing kernel buffers to physical storage.
self.f: opened in __init__, must remain valid.

The ordering guarantee (data file fsync before WAL truncation fsync) is what makes crash recovery correct. The code assumes that os.fsync provides the barrier semantics described by POSIX: after fsync returns, the data is on stable storage. On some hardware (e.g., drives with volatile write caches that lie about flush completion), this assumption can be violated. The code does not use O_DIRECT or fdatasync; it trusts the OS and hardware to honor fsync.

wal-commit-sync-before-truncate: WAL.commit always fsyncs the data file before truncating the WAL; reversing this order would create a crash-safety hole where committed data could be lost.

wal-commit-clears-all-entries: commit clears the entire WAL unconditionally; there is no partial commit or transaction grouping within a single WAL file.

wal-seq-reset-on-commit: The WAL sequence counter resets to 0 on every commit, meaning sequence numbers are only meaningful within a single uncommitted transaction window.

wal-is-redo-only: This WAL uses redo-only recovery (replay logged pages forward); there is no undo log, so a crash mid-transaction before commit means the incomplete writes in the data file may be partially applied, but the WAL replay will overwrite them to a consistent state.
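
The redo-only point can be made concrete with a toy replay. The recovery routine itself is not described here, so everything in this sketch is an assumption: the replay function, the (seq, page_id, image) entry layout, and the use of a dict in place of the data file.

```python
def replay(wal_entries, pages):
    """Redo-only recovery sketch: apply each logged page image in sequence order."""
    for _seq, page_id, image in sorted(wal_entries, key=lambda e: e[0]):
        pages[page_id] = image
    return pages


# A transaction logged new images for pages 1 and 2 ...
wal_entries = [(1, 1, b"new-root"), (2, 2, b"new-leaf")]

# ... but the crash happened after only page 1 reached the data file.
pages_after_crash = {1: b"new-root", 2: b"old-leaf"}

recovered = replay(wal_entries, dict(pages_after_crash))
recovered_twice = replay(wal_entries, dict(recovered))
```

Replay overwrites the half-applied state with the logged images, and because whole page images are written, running it a second time leaves the result unchanged.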