Function: append in write-ahead-log/wal.py

Date: 2026-05-29

Time: 08:21

WriteAheadLog.append

Purpose

append writes a single operation record (PUT, DELETE, etc.) to the write-ahead log on disk. It exists to ensure durability before a mutation is applied to the main data structure — if the process crashes after append returns, the operation can be recovered from the log on restart.
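The recovery side of this guarantee can be sketched as a replay loop. This is a minimal illustration rather than the module's own recovery code: it assumes the record layout described under Algorithm, plus details this document does not state (big-endian fields, a 4-byte CRC32 over everything after the checksum, a length prefix that counts the bytes following it). The tag values and the names encode_record, replay, and rebuild are hypothetical.

```python
import struct
import zlib

# Hypothetical tag values; the real OPBYTES table is not shown in this document.
PUT, DELETE = 1, 2


def encode_record(seq: int, tag: int, key: bytes, value: bytes = b"") -> bytes:
    """Assumed layout: [length:4][crc32:4][seq:8][op:1][klen:4][key][vlen:4][value]."""
    payload = struct.pack(">QBI", seq, tag, len(key)) + key
    payload += struct.pack(">I", len(value)) + value
    body = struct.pack(">I", zlib.crc32(payload)) + payload
    return struct.pack(">I", len(body)) + body


def replay(blob: bytes):
    """Yield (seq, tag, key, value) and stop at the first torn or corrupt record."""
    pos = 0
    while pos + 4 <= len(blob):
        (length,) = struct.unpack_from(">I", blob, pos)
        body = blob[pos + 4 : pos + 4 + length]
        # 21 = crc (4) + seq (8) + op (1) + two empty length prefixes (4 + 4)
        if length < 21 or len(body) < length:
            break  # incomplete tail, e.g. a crash in the middle of a write
        (crc,) = struct.unpack_from(">I", body, 0)
        payload = body[4:]
        if zlib.crc32(payload) != crc:
            break  # checksum mismatch: treat the rest of the log as unusable
        seq, tag, klen = struct.unpack_from(">QBI", payload, 0)
        key = payload[13 : 13 + klen]
        (vlen,) = struct.unpack_from(">I", payload, 13 + klen)
        value = payload[17 + klen : 17 + klen + vlen]
        yield seq, tag, key.decode("utf-8"), value.decode("utf-8")
        pos += 4 + length


def rebuild(blob: bytes) -> dict:
    """Rebuild an in-memory dict by re-applying every intact record in order."""
    store = {}
    for _seq, tag, key, value in replay(blob):
        if tag == PUT:
            store[key] = value
        elif tag == DELETE:
            store.pop(key, None)
    return store
```

Stopping at the first bad record is one possible policy; it treats a damaged tail as a write that never completed, which matches the case where the process died before append returned.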

Contract

Preconditions:

Postconditions:

Invariants:

Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| optype | str | Operation name: "PUT", "DELETE", "COMMIT", or "CHECKPOINT". Looked up in OPBYTES to get the integer tag for binary encoding. In practice callers use "PUT" or "DELETE" here; "COMMIT" and "CHECKPOINT" have dedicated methods. |
| key | str | The key being operated on. Encoded to UTF-8 bytes for the binary record. Cannot be None. |
| value | str | The value for PUT operations. Defaults to "" for DELETE, where no value is meaningful. Encoded to UTF-8. |
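As a rough illustration of how these three parameters might be turned into bytes, the sketch below looks up a tag and UTF-8 encodes the key and value. The tag numbers in OP_TAGS, the 4-byte big-endian length prefixes, and the helper name encode_fields are assumptions made for the example; the real OPBYTES table is not shown in this document.

```python
import struct

# Hypothetical tag values standing in for the real OPBYTES mapping.
OP_TAGS = {"PUT": 1, "DELETE": 2, "COMMIT": 3, "CHECKPOINT": 4}


def encode_fields(optype: str, key: str, value: str = "") -> bytes:
    """Encode the op tag, then key and value as length-prefixed UTF-8."""
    if key is None:
        raise ValueError("key cannot be None")
    tag = OP_TAGS[optype]  # raises KeyError for an unknown operation name
    key_bytes = key.encode("utf-8")
    value_bytes = value.encode("utf-8")
    # Length prefixes count bytes, not characters, so non-ASCII text is handled.
    return (
        struct.pack(">B", tag)
        + struct.pack(">I", len(key_bytes)) + key_bytes
        + struct.pack(">I", len(value_bytes)) + value_bytes
    )
```

Because the prefixes hold byte counts, a key such as "café" occupies five bytes in the record even though it has four characters.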

Return Value

Returns int — the sequence number assigned to this record. Callers use this for:

Algorithm

1. Acquire lock: serializes all concurrent appends so sequence numbers are gap-free.

2. Increment and capture sequence number: self.seqnum += 1, then snapshot the result to a local seq. The local copy guards against the value changing if another method were called in between (it cannot be under the lock, but the pattern is defensive).

3. Encode the record: encoderecord produces a binary blob consisting of a 4-byte length prefix, a CRC32 checksum, an 8-byte sequence number, a 1-byte op type, a length-prefixed key, and a length-prefixed value.

4. Write to file: a single self._fd.write(data) call. The file is opened in "ab" (append-binary) mode, which on POSIX systems sets O_APPEND, so each write lands at end-of-file even if another process appends to the same file (this implementation also serializes writers via the lock).

5. Sync to disk: dosync() either fsyncs immediately (sync mode) or batches fsyncs (batch mode). With any other mode string (nosync), it does nothing and the data stays in the kernel buffer.

6. Maybe rotate: if the current file has grown past maxfile_size (default 10 MB), close it and open a new numbered WAL file. This keeps individual files manageable for replay and truncation.

7. Return the sequence number: the return happens while the lock is still held. In sync mode, the caller can treat the returned sequence number as durable.
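The seven steps can be sketched end to end as follows. This is an illustrative stand-in, not the real WriteAheadLog: the class name SketchWAL, the attribute names, the file naming scheme, the tag values, and the exact byte layout (big-endian, CRC32 over the bytes after the checksum) are all assumptions, and batch mode is left out.

```python
import os
import struct
import tempfile
import threading
import zlib

OP_TAGS = {"PUT": 1, "DELETE": 2}  # hypothetical tag values


def encode_record(seq: int, optype: str, key: str, value: str = "") -> bytes:
    """Assumed layout: [length:4][crc32:4][seq:8][op:1][klen:4][key][vlen:4][value]."""
    k, v = key.encode("utf-8"), value.encode("utf-8")
    payload = struct.pack(">QBI", seq, OP_TAGS[optype], len(k)) + k
    payload += struct.pack(">I", len(v)) + v
    body = struct.pack(">I", zlib.crc32(payload)) + payload
    return struct.pack(">I", len(body)) + body


class SketchWAL:
    """Illustrative stand-in for WriteAheadLog; not the real class."""

    def __init__(self, directory: str, sync_mode: str = "sync",
                 max_file_size: int = 10 * 1024 * 1024) -> None:
        self.directory = directory
        self.sync_mode = sync_mode
        self.max_file_size = max_file_size
        self.seq_num = 0
        self.file_index = 0
        self._lock = threading.Lock()
        self._open_current()

    def _open_current(self) -> None:
        # Hypothetical naming scheme for numbered WAL files.
        path = os.path.join(self.directory, f"wal-{self.file_index:06d}.log")
        self._fd = open(path, "ab")
        self._size = os.path.getsize(path)

    def append(self, optype: str, key: str, value: str = "") -> int:
        with self._lock:                                   # 1. acquire lock
            self.seq_num += 1                              # 2. increment...
            seq = self.seq_num                             #    ...and capture
            data = encode_record(seq, optype, key, value)  # 3. encode
            self._fd.write(data)                           # 4. write
            self._size += len(data)
            if self.sync_mode == "sync":                   # 5. sync (batch mode omitted)
                self._fd.flush()                           # push Python's buffer to the OS
                os.fsync(self._fd.fileno())                # ask the OS to reach the disk
            if self._size > self.max_file_size:            # 6. maybe rotate
                self._fd.close()
                self.file_index += 1
                self._open_current()
            return seq                                     # 7. return under the lock

    def close(self) -> None:
        self._fd.close()
```

Note that the sketch calls flush() before os.fsync(): with Python's default buffered file objects, fsync alone does not cover bytes still held in the user-space buffer. Whether the real dosync() does the same is not stated here.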

Side Effects

Error Handling

The method does no error handling. Failures propagate directly:

Critically, if the write succeeds but dosync() raises, the sequence number has already been incremented, yet the caller receives an exception instead of that number. The record is then in doubt: it may or may not survive a later crash, depending on whether the OS had already flushed its buffer.
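The failure window can be reproduced with a toy stand-in that keeps records in a list and accepts an injected sync function. ToyLog and failing_sync are hypothetical names used only for this illustration; the point is the ordering of the increment, the write, and the exception.

```python
import threading


class ToyLog:
    """Toy stand-in: records live in a list and the sync step is injectable."""

    def __init__(self, sync):
        self.seq_num = 0
        self.records = []
        self._sync = sync
        self._lock = threading.Lock()

    def append(self, optype, key, value=""):
        with self._lock:
            self.seq_num += 1
            seq = self.seq_num
            self.records.append((seq, optype, key, value))  # the "write"
            self._sync()  # may raise after the write has already happened
            return seq


def failing_sync():
    raise OSError("simulated fsync failure")


log = ToyLog(failing_sync)
store = {}
try:
    seq = log.append("PUT", "user:42", "Alice")
    store["user:42"] = "Alice"  # only reached if append returned normally
except OSError:
    # The record is in doubt: it was written and numbered, but not confirmed
    # durable. The mutation is not applied, and the sequence number is consumed.
    pass
```

After this runs, the log holds one record with sequence number 1 while the store is still empty, so a caller that retries will be assigned sequence number 2 for the same logical operation.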

Usage Patterns


```python
# Assumes wal.py is on the import path.
from wal import WriteAheadLog

wal = WriteAheadLog("/tmp/wal", sync_mode="sync")

# Typical put
seq = wal.append("PUT", "user:42", '{"name": "Alice"}')

# Delete (no value needed)
seq = wal.append("DELETE", "user:42")

# After applying to main store, truncate old entries
wal.truncate(up_to_seq=seq)
```

Individual append calls are independent — they don't participate in transactions. For atomic multi-operation writes, use append_batch instead, which writes all records plus a COMMIT marker under a single fsync.
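A batch write of this kind might look like the sketch below. It is not the real append_batch: the class BatchLog, the tag values, the byte layout, the tuple-based argument, and the choice to return the COMMIT record's sequence number are all assumptions made for illustration. What it does show is the property described above, namely several records followed by a COMMIT marker covered by one fsync.

```python
import os
import struct
import tempfile
import threading
import zlib

OP_TAGS = {"PUT": 1, "DELETE": 2, "COMMIT": 3}  # hypothetical tag values


def encode(seq: int, optype: str, key: bytes = b"", value: bytes = b"") -> bytes:
    """Assumed layout: [length:4][crc32:4][seq:8][op:1][klen:4][key][vlen:4][value]."""
    payload = struct.pack(">QBI", seq, OP_TAGS[optype], len(key)) + key
    payload += struct.pack(">I", len(value)) + value
    body = struct.pack(">I", zlib.crc32(payload)) + payload
    return struct.pack(">I", len(body)) + body


class BatchLog:
    """Illustrative stand-in showing one fsync per batch."""

    def __init__(self, path: str) -> None:
        self.seq_num = 0
        self.fsync_calls = 0  # counted only so the single-fsync claim is observable
        self._lock = threading.Lock()
        self._fd = open(path, "ab")

    def append_batch(self, ops) -> int:
        """ops is a list of (optype, key, value) tuples; returns the COMMIT seq."""
        with self._lock:
            chunks = []
            for optype, key, value in ops:
                self.seq_num += 1
                chunks.append(encode(self.seq_num, optype,
                                     key.encode("utf-8"), value.encode("utf-8")))
            self.seq_num += 1
            chunks.append(encode(self.seq_num, "COMMIT"))  # marks the batch complete
            self._fd.write(b"".join(chunks))
            self._fd.flush()
            os.fsync(self._fd.fileno())  # one fsync covers every record in the batch
            self.fsync_calls += 1
            return self.seq_num

    def close(self) -> None:
        self._fd.close()
```

On replay, a reader following this scheme would apply a batch only once it has seen the COMMIT marker, so a crash partway through the write leaves no partially applied batch.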

Dependencies

Beliefs