Original source: topic-batch-atomicity-across-rotation

Topic: Explore what happens if `append_batch` writes a buffer that pushes past `max_file_size`: does the batch stay atomic or can it split?

Date: 2026-05-29

Time: 08:09

Batch Atomicity Across File Rotation in the WAL

The batch stays atomic. It can never split across files.

The key is the ordering of operations inside append_batch at write-ahead-log/wal.py:153:


def append_batch(self, operations: List[Tuple[str, str, str]]) -> int:
    with self._lock:
        buf = bytearray()
        for op_type, key, value in operations:
            self._seq_num += 1
            buf.extend(_encode_record(...))
        self._seq_num += 1
        commit_seq = self._seq_num
        buf.extend(_encode_record(commit_seq, OP_COMMIT, b"", b""))
        self._fd.write(bytes(buf))          # 1. write entire batch
        self._do_sync(force=True)           # 2. fsync
        self._maybe_rotate()               # 3. THEN check rotation
        return commit_seq

Three things guarantee atomicity:

1. Single buffer, single write

All operation records plus the COMMIT marker are assembled into one bytearray, then flushed with a single self._fd.write(bytes(buf)). There's no per-record rotation check inside the loop. The buffer can be arbitrarily large — it doesn't matter.

2. Rotation is post-write

_maybe_rotate() (line 136) runs after the write and sync complete:


def _maybe_rotate(self):
    if self._fd and self._fd.tell() >= self._max_file_size:
        self._rotate()

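This means a WAL file can grow beyond max_file_size if a batch pushes it past the limit. The size cap is soft, not hard. The rotation check runs at the end of the same call, once the batch has been written and synced, so the oversized file is closed with the whole batch inside it and the *next* operation lands in a fresh file.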
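The soft cap can be reproduced with a minimal sketch. TinyWal, the wal-NNNN.log file naming and demo_sizes are hypothetical stand-ins, not the code in wal.py; only the write, sync, then rotate ordering is taken from the excerpt above.

```python
import os
import tempfile


class TinyWal:
    """Hypothetical stand-in for the WAL: write first, check rotation afterwards."""

    def __init__(self, directory, max_file_size):
        self._dir = directory
        self._max_file_size = max_file_size
        self._index = 0
        self._fd = open(self._path(), "ab")

    def _path(self):
        # Assumed naming scheme, chosen only for this illustration.
        return os.path.join(self._dir, f"wal-{self._index:04d}.log")

    def _maybe_rotate(self):
        if self._fd.tell() >= self._max_file_size:
            self._fd.close()
            self._index += 1
            self._fd = open(self._path(), "ab")

    def append_batch(self, payload: bytes):
        self._fd.write(payload)        # 1. write the entire batch
        self._fd.flush()
        os.fsync(self._fd.fileno())    # 2. fsync
        self._maybe_rotate()           # 3. only then check rotation

    def close(self):
        self._fd.close()


def demo_sizes():
    """Return the size of each WAL file after three batches."""
    with tempfile.TemporaryDirectory() as d:
        wal = TinyWal(d, max_file_size=100)
        wal.append_batch(b"a" * 60)    # file 0 holds 60 bytes, below the cap
        wal.append_batch(b"b" * 250)   # file 0 grows to 310 bytes, then rotates
        wal.append_batch(b"c" * 10)    # lands in file 1
        wal.close()
        names = sorted(os.listdir(d))
        return [os.path.getsize(os.path.join(d, n)) for n in names]
```

With a 100-byte cap, the first file ends up at 310 bytes: the 250-byte batch is never split, and the overshoot is bounded by the size of that one batch.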

3. The lock prevents interleaving

The entire method runs under self._lock, so no other thread can sneak a rotation or a competing write between the batch records.
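A small in-memory sketch of the same idea, assuming nothing about the real class beyond the lock: InMemoryLog, run_writers and batches_are_contiguous are illustrative names. Several threads append batches concurrently, and a checker confirms that every batch's records sit next to each other, followed by its COMMIT.

```python
import threading


class InMemoryLog:
    """Hypothetical log that keeps records in a list instead of a file."""

    def __init__(self):
        self._lock = threading.Lock()
        self.records = []

    def append_batch(self, batch_id, count):
        # The whole batch, including its COMMIT marker, is appended under the lock.
        with self._lock:
            for i in range(count):
                self.records.append((batch_id, i))
            self.records.append((batch_id, "COMMIT"))


def run_writers(n_threads=8, count=50):
    log = InMemoryLog()
    threads = [
        threading.Thread(target=log.append_batch, args=(t, count))
        for t in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log.records


def batches_are_contiguous(records, count):
    """True if the log is a sequence of whole batches, each ending in COMMIT."""
    step = count + 1
    for start in range(0, len(records), step):
        chunk = records[start:start + step]
        batch_id = chunk[0][0]
        expected = [(batch_id, i) for i in range(count)] + [(batch_id, "COMMIT")]
        if chunk != expected:
            return False
    return True
```

The order in which batches appear depends on thread scheduling, but no batch is ever interleaved with another because each one is appended entirely inside the critical section.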

Contrast with single-record `append`

The single-record append at line 141 follows the same write-then-rotate pattern, but for individual records it's less consequential — each record is self-contained. The batch case is where this ordering decision actually matters for correctness.

Contrast with Bitcask

Bitcask (hash-index-storage/bitcask.py:169) calls _maybe_rotate() before each put, which means consecutive puts can land in different files. That's fine for Bitcask since it doesn't need batch semantics: each key-value pair is independently addressable via the in-memory keydir.
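The difference between the two policies can be simulated without touching the disk. place_writes is a hypothetical helper, and the `>=` threshold is assumed to match the WAL check shown earlier; the actual Bitcask condition is not shown in this note.

```python
def place_writes(write_sizes, max_file_size, check_before_write):
    """Return the file index each write lands in under one rotation policy."""
    file_index, file_size, placement = 0, 0, []
    for size in write_sizes:
        # Bitcask-style: decide whether to rotate before the write happens.
        if check_before_write and file_size >= max_file_size:
            file_index, file_size = file_index + 1, 0
        placement.append(file_index)
        file_size += size
        # WAL-style: decide whether to rotate after the write has completed.
        if not check_before_write and file_size >= max_file_size:
            file_index, file_size = file_index + 1, 0
    return placement


# Four 40-byte puts issued one at a time, rotation checked before each put.
puts = place_writes([40, 40, 40, 40], max_file_size=100, check_before_write=True)

# The same 160 bytes issued as a single batch write, rotation checked afterwards.
batch = place_writes([160], max_file_size=100, check_before_write=False)
```

Here `puts` spans two files because a rotation check falls between the third and fourth put, while `batch` occupies a single file because there is no check inside the batch.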

What could go wrong

If the process crashes *during* self._fd.write(bytes(buf)), after some bytes hit disk but before the COMMIT record is written, recovery code should treat the batch as incomplete. The COMMIT record (OP_COMMIT) at the end of the buffer serves as the durability marker: if it's absent, the batch is discarded during replay. This is the standard WAL pattern from DDIA Chapter 3.
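A replay loop along these lines might look as follows. This is a sketch under stated assumptions, not the recovery code in wal.py: the record layout (a fixed header of sequence number, opcode, key length and value length), the opcode values and the names encode and replay are all hypothetical, and a real implementation would typically also verify a per-record checksum.

```python
import struct

OP_PUT, OP_COMMIT = 1, 2            # hypothetical opcode values
_HEADER = struct.Struct("<QBII")    # seq, op, key_len, value_len (assumed layout)


def encode(seq, op, key=b"", value=b""):
    return _HEADER.pack(seq, op, len(key), len(value)) + key + value


def replay(data: bytes):
    """Return only the operations whose batch is closed by a COMMIT record."""
    committed, pending, pos = [], [], 0
    while pos + _HEADER.size <= len(data):
        seq, op, klen, vlen = _HEADER.unpack_from(data, pos)
        body = pos + _HEADER.size
        end = body + klen + vlen
        if end > len(data):
            break  # torn record: the write was cut off mid-record
        if op == OP_COMMIT:
            committed.extend(pending)   # the batch is complete, keep it
            pending = []
        else:
            pending.append((seq, data[body:body + klen], data[body + klen:end]))
        pos = end
    # Anything still in `pending` never saw its COMMIT and is dropped.
    return committed
```

Truncating the log anywhere inside the last batch, whether mid-record or just before the COMMIT, leaves earlier committed batches intact and drops the partial one.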

One subtle implication

Because _open_latest (line 100) checks os.path.getsize(last) < self._max_file_size on startup, a file that was pushed past the limit by a batch is not reopened for appending; the next open starts a new file instead. The overshoot is tolerated for exactly one batch's worth.
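The startup decision can be sketched as a standalone function. choose_active_file and the wal-NNNN.log naming are assumptions made for illustration; only the `os.path.getsize(last) < max_file_size` comparison comes from the description above.

```python
import os
import tempfile


def choose_active_file(directory: str, max_file_size: int) -> str:
    """Pick the file to append to on startup (hypothetical helper)."""
    names = sorted(n for n in os.listdir(directory) if n.endswith(".log"))
    if names:
        last = os.path.join(directory, names[-1])
        if os.path.getsize(last) < max_file_size:
            return last                        # still under the cap: keep appending
        next_index = int(names[-1][4:8]) + 1   # "wal-0003.log" -> 4
    else:
        next_index = 0
    return os.path.join(directory, f"wal-{next_index:04d}.log")


def demo_startup(existing_size: int, max_file_size: int = 100) -> str:
    """Create one existing WAL file of the given size and report the choice."""
    with tempfile.TemporaryDirectory() as d:
        with open(os.path.join(d, "wal-0000.log"), "wb") as f:
            f.write(b"x" * existing_size)
        return os.path.basename(choose_active_file(d, max_file_size))
```

A 60-byte file under a 100-byte cap is reused, while a 310-byte file left behind by an oversized batch causes the next file in the sequence to be chosen.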
