Function: encode_record in write-ahead-log/wal.py

Date: 2026-05-28

Time: 18:16

encode_record: WAL Binary Record Encoder

Purpose

encode_record serializes a single WAL record into a self-describing binary format suitable for appending to a log file on disk. It produces a compact, integrity-checked byte sequence that read_record can later parse back, which enables crash recovery by replaying the log.

Every write operation in the WAL (put, delete, commit, checkpoint) passes through this function before hitting disk.

Contract

Preconditions:

Postconditions:

Invariant: record_length occupies the first 4 bytes of the header that this function returns, so the length field is part of the returned bytes rather than something the caller writes separately. Its value equals the total size of everything *after* that 4-byte length prefix. The reader (read_record) reads the 4-byte length first, then reads record_length more bytes.

Parameters

| Parameter | Type | Meaning |
|-----------|------|---------|
| seq_num | int | Monotonically increasing sequence number. Packed as unsigned 64-bit (Q). |
| op_type_byte | int | Operation type as a raw byte value: 1=PUT, 2=DELETE, 3=COMMIT, 4=CHECKPOINT. Packed as unsigned byte (B). |
| key | bytes | The record key. Can be empty (e.g., for COMMIT/CHECKPOINT records). |
| value | bytes | The record value. Can be empty. |

Edge cases: Empty key and value are valid — checkpoint() and commit records pass b"" for both.
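The operation codes in the table could be given symbolic names so that callers do not pass bare integers. The following is a minimal sketch; the `OpType` class is a hypothetical name that does not come from wal.py, and only the numeric values are taken from the table.

```python
import struct
from enum import IntEnum


class OpType(IntEnum):
    """Hypothetical names for the op-type byte values listed in the table."""
    PUT = 1
    DELETE = 2
    COMMIT = 3
    CHECKPOINT = 4


# An IntEnum member is an int, so it packs directly with the "B" format.
packed = struct.pack("<B", OpType.COMMIT)
```

Because `IntEnum` members behave as plain integers, such names could be passed wherever the raw byte value is expected.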

Return Value

A single bytes object containing the complete binary record. The caller writes this directly to the file descriptor. The caller is responsible for flushing/fsyncing afterward.
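Since the function only returns bytes, durability is the caller's job. A minimal sketch of what an appending caller might do is shown here; `append_record` and its `path` argument are hypothetical, and the real callers in wal.py may keep the file open or batch several records before syncing.

```python
import os


def append_record(path: str, record: bytes) -> None:
    """Append one already-encoded record and force it to stable storage.

    Hypothetical helper for illustration; not the actual caller in wal.py.
    """
    with open(path, "ab") as f:
        f.write(record)        # write the bytes returned by the encoder
        f.flush()              # push Python's userspace buffer to the OS
        os.fsync(f.fileno())   # ask the OS to flush its cache to the device
```

Calling `flush()` before `os.fsync()` matters: `fsync` only covers data the operating system has already received.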

Algorithm

Step by step:

1. Compute CRC32 over op_type_byte (1 byte) + key + value. The & 0xFFFFFFFF mask keeps the result in the unsigned 32-bit range. In Python 3, zlib.crc32 already returns an unsigned value, so the mask is a defensive holdover from Python 2, where the result could be signed.

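2. Calculate record_length, the total byte count of everything *after* the 4-byte length prefix that read_record reads first: 4 (CRC) + 8 (seq_num) + 1 (op_type) + 4 (key_len) + len(key) + 4 (val_len) + len(value) = 21 + len(key) + len(value).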

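3. Pack the fixed-size header using little-endian format <IIQBi: record_length (I, 4 bytes), crc (I, 4 bytes), seq_num (Q, 8 bytes), op_type_byte (B, 1 byte), and len(key) (i, 4 bytes), for 21 bytes in total.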

4. Concatenate header + key bytes + value-length (packed as <i) + value bytes.
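The steps above can be sketched as follows. This is an illustrative re-creation from the description, not the code in wal.py: the function name `encode_record_sketch` is hypothetical, and building the CRC input with `struct.pack("B", ...)` is an assumption about how the one-byte op type is materialized.

```python
import struct
import zlib


def encode_record_sketch(seq_num: int, op_type_byte: int,
                         key: bytes, value: bytes) -> bytes:
    """Illustrative re-creation of the four steps; not the code in wal.py."""
    # Step 1: CRC32 over the op-type byte, the key, and the value.
    crc = zlib.crc32(struct.pack("B", op_type_byte) + key + value) & 0xFFFFFFFF

    # Step 2: size of everything after the 4-byte length prefix:
    # crc (4) + seq_num (8) + op_type (1) + key_len (4) + key + val_len (4) + value
    record_length = 4 + 8 + 1 + 4 + len(key) + 4 + len(value)

    # Step 3: fixed 21-byte header, little-endian, no padding.
    header = struct.pack("<IIQBi", record_length, crc, seq_num,
                         op_type_byte, len(key))

    # Step 4: header + key + value length + value.
    return header + key + struct.pack("<i", len(value)) + value


record = encode_record_sketch(7, 1, b"user:1", b"alice")
```

With a 6-byte key and a 5-byte value, `record` is 36 bytes long and its length field holds 32, which matches the 21 + len(key) + len(value) accounting.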

The resulting wire format is:


[record_length:4][crc:4][seq_num:8][op_type:1][key_len:4] | [key:N][val_len:4][value:M]
└─────────────── header (21 bytes fixed) ───────────────┘   └─────── variable ────────┘

Note that record_length is part of the returned bytes, and read_record consumes it separately: the reader reads the first 4 bytes, then reads record_length more bytes. The total on-disk size per record is therefore 4 + record_length bytes, i.e. 25 + len(key) + len(value).
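To make the two-phase read concrete, here is a minimal sketch of a reader for the wire format shown above. The name `read_record_sketch`, its return tuple, and its use of `ValueError` are assumptions for illustration; the real reader in wal.py may handle truncation and corruption differently.

```python
import io
import struct
import zlib


def read_record_sketch(stream):
    """Read one record from a binary stream, or return None at a clean EOF.

    Illustrative only; the real reader in wal.py may differ.
    """
    prefix = stream.read(4)
    if not prefix:
        return None                              # no more records
    if len(prefix) < 4:
        raise ValueError("truncated length prefix")
    (record_length,) = struct.unpack("<I", prefix)

    body = stream.read(record_length)            # everything after the prefix
    if len(body) < record_length:
        raise ValueError("truncated record body")

    crc, seq_num, op_type, key_len = struct.unpack_from("<IQBi", body, 0)
    offset = struct.calcsize("<IQBi")            # 17 bytes of fixed fields
    key = body[offset:offset + key_len]
    offset += key_len
    (val_len,) = struct.unpack_from("<i", body, offset)
    offset += 4
    value = body[offset:offset + val_len]

    # The CRC covers op_type + key + value, mirroring the encoder.
    if zlib.crc32(bytes([op_type]) + key + value) & 0xFFFFFFFF != crc:
        raise ValueError("CRC mismatch")
    return seq_num, op_type, key, value


# Hand-packed COMMIT record (empty key and value) to exercise the reader.
crc = zlib.crc32(bytes([3])) & 0xFFFFFFFF
raw = struct.pack("<IIQBi", 21, crc, 42, 3, 0) + struct.pack("<i", 0)
parsed = read_record_sketch(io.BytesIO(raw))
```

The hand-packed record is 25 bytes on disk: the 4-byte prefix plus a record_length of 21, the minimum possible size when both key and value are empty.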

Side Effects

None. This is a pure function — no I/O, no mutation, no state changes. All disk writes happen in the callers (append, append_batch, checkpoint, truncate).

Error Handling

No explicit error handling. struct.pack will raise struct.error if values overflow their format (e.g., seq_num exceeds uint64 range, or key length exceeds int32 range). These would propagate to the caller.
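The overflow behaviour can be checked directly against the struct module. This short sketch uses only the format codes named in this document; the variable names are illustrative.

```python
import struct

# seq_num is packed as "Q" (unsigned 64-bit): 2**64 - 1 fits, 2**64 does not.
struct.pack("<Q", 2**64 - 1)
try:
    struct.pack("<Q", 2**64)
    seq_overflow = False
except struct.error:
    seq_overflow = True

# Lengths are packed as "i" (signed 32-bit): the ceiling is 2**31 - 1.
try:
    struct.pack("<i", 2**31)
    len_overflow = False
except struct.error:
    len_overflow = True
```

Both flags end up `True`, so a caller that wants a friendlier error would need to validate these ranges before calling the encoder.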

Usage Patterns

Called from four places: append, append_batch, checkpoint, and truncate.

Callers always pass key.encode("utf-8") and value.encode("utf-8"), except for COMMIT and CHECKPOINT which pass b"".

Dependencies

struct (struct.pack) and zlib (zlib.crc32), both from the Python standard library.

Notable Assumptions

1. Key and value lengths fit in a signed 32-bit integer (i format) — max ~2 GB per field. Using signed i rather than unsigned I is an odd choice; negative lengths would be nonsensical but aren't guarded against.

2. The CRC covers only op_type + key + value, not seq_num or record_length. A bit-flip in seq_num would go undetected. This may be a deliberate trade-off or an oversight; read_record verifies the CRC against the same three fields.

3. Little-endian byte order (< prefix) is hardcoded. Because the prefix also selects standard field sizes with no padding, the on-disk layout is identical on every host, so WAL files written on one architecture can be read on another regardless of its native endianness.

4. record_length does not include itself. The value 21 + len(key) + len(value) counts the 4 bytes for the CRC but not the 4 bytes of the record_length field. read_record reads the length prefix *first* (4 bytes), then reads record_length additional bytes, so the accounting is consistent.
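The CRC coverage gap and the length accounting can both be checked on a hand-packed record. The sketch below builds the bytes directly from the wire format described in this document rather than calling the real encoder; all variable names are illustrative.

```python
import struct
import zlib

key, value, op_type_byte = b"k", b"v", 1
crc = zlib.crc32(struct.pack("B", op_type_byte) + key + value) & 0xFFFFFFFF
record_length = 21 + len(key) + len(value)
record = (struct.pack("<IIQBi", record_length, crc, 5, op_type_byte, len(key))
          + key + struct.pack("<i", len(value)) + value)

# Flip the lowest bit of seq_num, which occupies bytes 8..15 of the record.
corrupted = bytearray(record)
corrupted[8] ^= 0x01

# Re-parse the corrupted record and redo the CRC check a reader would do.
_, stored_crc, seq_num, op, key_len = struct.unpack_from("<IIQBi", corrupted, 0)
k = bytes(corrupted[21:21 + key_len])
v = bytes(corrupted[21 + key_len + 4:])
crc_still_matches = (zlib.crc32(bytes([op]) + k + v) & 0xFFFFFFFF) == stored_crc
```

Here `seq_num` decodes as 4 instead of 5 while `crc_still_matches` remains `True`, so the corruption passes the integrity check. Extending the CRC input to include the packed seq_num would close this gap, at the cost of changing the on-disk format.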