Date: 2026-05-28
Time: 18:16
_encode_record: WAL Binary Record Encoder

_encode_record serializes a single WAL record into a self-describing binary format suitable for appending to a log file on disk. It produces a compact, integrity-checked byte sequence that _read_record can later parse back, which enables crash recovery by replaying the log.
Every write operation in the WAL (put, delete, commit, checkpoint) passes through this function before hitting disk.
Preconditions:
- key and value must already be encoded as bytes (callers handle UTF-8 encoding)
- op_type_byte must be a valid operation constant (1–4), though this is not enforced here
- seq_num must fit in an unsigned 64-bit integer (≤ 2^64 - 1)

Postconditions:
- The returned bytes form a complete record that _read_record can round-trip back to a WALRecord
- The stored CRC32 covers op_type_byte + key + value, so any corruption of those fields is detectable on read

Invariant: record_length occupies the first 4 bytes of the bytes this function returns; the caller does not write it separately. _read_record reads that 4-byte length first, then reads record_length more bytes, so the value counts everything *after* the initial 4-byte length prefix.
| Parameter | Type | Meaning |
|-----------|------|---------|
| seq_num | int | Monotonically increasing sequence number. Packed as unsigned 64-bit (Q). |
| op_type_byte | int | Operation type as a raw byte value: 1=PUT, 2=DELETE, 3=COMMIT, 4=CHECKPOINT. Packed as unsigned byte (B). |
| key | bytes | The record key. Can be empty (e.g., for COMMIT/CHECKPOINT records). |
| value | bytes | The record value. Can be empty. |
Edge cases: Empty key and value are valid — checkpoint() and commit records pass b"" for both.
Returns: A single bytes object containing the complete binary record. The caller writes this directly to the file descriptor and is responsible for flushing/fsyncing afterward.
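Since durability is left to the caller, the write path might look roughly like the sketch below. The helper name append_durably, the file name, and the placeholder record are all hypothetical; the real callers may buffer or sync differently.

```python
import os
import tempfile

def append_durably(fileobj, record: bytes) -> None:
    """Hypothetical caller-side helper: write one encoded record and force it to disk."""
    fileobj.write(record)        # hand the bytes to the file object
    fileobj.flush()              # push Python's userspace buffer to the OS
    os.fsync(fileobj.fileno())   # ask the OS to persist the data to the device

with tempfile.TemporaryDirectory() as tmp:
    wal_path = os.path.join(tmp, "wal.log")   # illustrative file name
    placeholder_record = b"\x00" * 25          # stands in for the encoder's output
    with open(wal_path, "ab") as wal_file:
        append_durably(wal_file, placeholder_record)
    size_on_disk = os.path.getsize(wal_path)
```

Calling flush() alone only moves data into the OS page cache; os.fsync is what requests that it reach stable storage.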
Step by step:
1. Compute CRC32 over op_type_byte (1 byte) + key + value. The & 0xFFFFFFFF mask ensures the result is an unsigned 32-bit value. In Python 3, zlib.crc32 already returns an unsigned value, so the mask is a harmless no-op there; it was needed in Python 2, where the result could be signed.
2. Calculate record_length, the total byte count of everything *after* the 4-byte length prefix that _read_record reads first:
   4 (CRC) + 8 (seq_num) + 1 (op_type) + 4 (key_len) + len(key) + 4 (val_len) + len(value) = 21 + len(key) + len(value)
3. Pack the fixed-size header using little-endian format <IIQBi:
   - I: record_length (uint32)
   - I: crc (uint32)
   - Q: seq_num (uint64)
   - B: op_type_byte (uint8)
   - i: len(key) (int32, signed)
4. Concatenate header + key bytes + value-length (packed as <i) + value bytes.
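The four steps can be sketched as follows. This is a minimal reconstruction from the description, not the actual source: the name encode_record_sketch is hypothetical, and turning the op type into a single byte with bytes([...]) is an assumption about how the CRC input is built.

```python
import struct
import zlib

def encode_record_sketch(seq_num: int, op_type_byte: int, key: bytes, value: bytes) -> bytes:
    """Illustrative encoder following the documented steps (not the real implementation)."""
    # Step 1: CRC32 over op type (1 byte) + key + value, masked to uint32.
    # Assumption: the op type is converted to a single byte via bytes([...]).
    crc = zlib.crc32(bytes([op_type_byte]) + key + value) & 0xFFFFFFFF
    # Step 2: count of bytes that follow the 4-byte length prefix.
    record_length = 4 + 8 + 1 + 4 + len(key) + 4 + len(value)
    # Step 3: fixed header, little-endian, no padding (21 bytes).
    header = struct.pack("<IIQBi", record_length, crc, seq_num, op_type_byte, len(key))
    # Step 4: header + key + value length + value.
    return header + key + struct.pack("<i", len(value)) + value

record = encode_record_sketch(7, 1, b"user:1", b"alice")
header_size = struct.calcsize("<IIQBi")
stored_length = struct.unpack_from("<I", record, 0)[0]
```

For a 6-byte key and 5-byte value, the stored length is 21 + 11 = 32 and the full record is 4 bytes longer.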
The resulting wire format is:
[record_length:4][crc:4][seq_num:8][op_type:1][key_len:4] | [key:N][val_len:4][value:M]
├──────────────── header (21 bytes fixed) ────────────────┘ ├──── variable ────────────┘
Note that record_length is written as part of the returned bytes and is also the first thing _read_record consumes: the reader reads the first 4 bytes, then reads record_length more bytes. So the total on-disk size per record is 4 + record_length bytes, i.e. 25 + len(key) + len(value).
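To make the two-phase read concrete, here is a hedged sketch of a reader for this layout. The function read_record_sketch and its error handling are assumptions for illustration; the real _read_record may return a WALRecord and treat truncation differently.

```python
import io
import struct
import zlib

def read_record_sketch(stream):
    """Illustrative reader: length prefix first, then record_length more bytes."""
    prefix = stream.read(4)
    if len(prefix) < 4:
        return None                                  # clean end of log
    (record_length,) = struct.unpack("<I", prefix)
    body = stream.read(record_length)
    if len(body) < record_length:
        raise ValueError("truncated record")
    # The body starts with crc, seq_num, op type, key length (17 bytes).
    crc, seq_num, op_type_byte, key_len = struct.unpack_from("<IQBi", body, 0)
    offset = struct.calcsize("<IQBi")
    key = body[offset:offset + key_len]
    offset += key_len
    (val_len,) = struct.unpack_from("<i", body, offset)
    offset += 4
    value = body[offset:offset + val_len]
    # Verify the CRC over the same three fields the encoder covers.
    if zlib.crc32(bytes([op_type_byte]) + key + value) & 0xFFFFFFFF != crc:
        raise ValueError("CRC mismatch")
    return seq_num, op_type_byte, key, value

# Hand-assemble one record in the documented layout and read it back.
key, value = b"k", b"v1"
crc = zlib.crc32(bytes([1]) + key + value) & 0xFFFFFFFF
raw = (struct.pack("<IIQBi", 21 + len(key) + len(value), crc, 42, 1, len(key))
       + key + struct.pack("<i", len(value)) + value)
decoded = read_record_sketch(io.BytesIO(raw))
```

The hand-assembled record is 25 + 3 = 28 bytes long, matching the size formula in the text.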
Side effects: None. This is a pure function: no I/O, no mutation, no state changes. All disk writes happen in the callers (append, append_batch, checkpoint, truncate).
No explicit error handling. struct.pack will raise struct.error if values overflow their format (e.g., seq_num exceeds uint64 range, or key length exceeds int32 range). These would propagate to the caller.
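The overflow behavior can be checked directly against struct, independent of the encoder itself. The helper packs_cleanly below is hypothetical and exists only for this demonstration.

```python
import struct

def packs_cleanly(fmt: str, value: int) -> bool:
    """Return True if struct.pack accepts the value, False if it raises struct.error."""
    try:
        struct.pack(fmt, value)
        return True
    except struct.error:
        return False

seq_num_ok = packs_cleanly("<Q", 2**64 - 1)     # largest uint64 fits
seq_num_overflow = packs_cleanly("<Q", 2**64)   # one past uint64 range
op_type_overflow = packs_cleanly("<B", 256)     # one past uint8 range
key_len_overflow = packs_cleanly("<i", 2**31)   # one past int32 range
```

A caller that wants clearer error messages could validate these ranges before encoding instead of relying on struct.error.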
Called from four places:
- append(): single record writes (PUT/DELETE)
- append_batch(): multiple records + a COMMIT record, buffered into a bytearray before a single write()
- checkpoint(): writes a checkpoint marker with empty key/value
- truncate(): re-encodes kept records when rewriting WAL files

Callers always pass key.encode("utf-8") and value.encode("utf-8"), except for COMMIT and CHECKPOINT which pass b"".
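The batch path described above, with several records plus a COMMIT collected in a bytearray, might be sketched like this. Both function names are hypothetical, the encoder is a stand-in built from the documented layout, and the assumption that the COMMIT record takes the next sequence number is not confirmed by the source.

```python
import struct
import zlib

PUT, COMMIT = 1, 3  # operation constants from the parameter table

def encode_record_sketch(seq_num, op_type_byte, key, value):
    """Stand-in encoder following the documented layout (not the real source)."""
    crc = zlib.crc32(bytes([op_type_byte]) + key + value) & 0xFFFFFFFF
    header = struct.pack("<IIQBi", 21 + len(key) + len(value), crc,
                         seq_num, op_type_byte, len(key))
    return header + key + struct.pack("<i", len(value)) + value

def encode_batch_sketch(first_seq, items):
    """Buffer every record of a batch so the caller can issue a single write()."""
    buf = bytearray()
    seq = first_seq
    for key, value in items:
        buf += encode_record_sketch(seq, PUT, key.encode("utf-8"), value.encode("utf-8"))
        seq += 1
    # Assumption: the COMMIT record takes the next sequence number.
    buf += encode_record_sketch(seq, COMMIT, b"", b"")
    return bytes(buf)

batch = encode_batch_sketch(10, [("a", "1"), ("bb", "22")])
```

Buffering first means the batch reaches the file in one write() call, which keeps the number of system calls per batch at one.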
- struct (stdlib): binary packing with format strings
- zlib (stdlib): CRC32 checksum for integrity verification

1. Key and value lengths fit in a signed 32-bit integer (i format), max ~2 GB per field. Using signed i rather than unsigned I is an odd choice; negative lengths would be nonsensical but aren't guarded against.
2. The CRC only covers op_type + key + value, not seq_num or record_length. A bit-flip in seq_num would go undetected. This is a deliberate trade-off (or oversight); _read_record verifies CRC against the same three fields.
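3. Little-endian byte order (< prefix) is hardcoded. Because the < prefix also selects standard sizes with no padding, the encoding does not depend on the host architecture: a WAL file written on one machine can be read on another, provided the reader uses the same format strings.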
4. record_length does not include itself. The value 21 + len(key) + len(value) counts the 4 bytes for the CRC but not the 4 bytes of the record_length field. This matches _read_record, which reads the 4-byte length prefix *first* and then reads record_length additional bytes, so the accounting is consistent.
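The CRC coverage gap from point 2 can be demonstrated with a short experiment. The helper crc_matches is hypothetical and re-implements only the check described in this document, using the byte offsets from the wire format (seq_num occupies bytes 8 to 15).

```python
import struct
import zlib

def crc_matches(raw: bytes) -> bool:
    """Recompute the CRC over op type + key + value and compare it with the stored one."""
    _length, crc, _seq, op, key_len = struct.unpack_from("<IIQBi", raw, 0)
    key = raw[21:21 + key_len]
    (val_len,) = struct.unpack_from("<i", raw, 21 + key_len)
    value = raw[25 + key_len:25 + key_len + val_len]
    return zlib.crc32(bytes([op]) + key + value) & 0xFFFFFFFF == crc

# Hand-assemble one record with seq_num = 5 in the documented layout.
key, value = b"k", b"v"
crc = zlib.crc32(bytes([1]) + key + value) & 0xFFFFFFFF
raw = (struct.pack("<IIQBi", 21 + len(key) + len(value), crc, 5, 1, len(key))
       + key + struct.pack("<i", len(value)) + value)

flipped_seq = bytearray(raw)
flipped_seq[8] ^= 0x01          # flip one bit in the low byte of seq_num
flipped_val = bytearray(raw)
flipped_val[-1] ^= 0x01         # flip one bit in the value

seq_flip_detected = not crc_matches(bytes(flipped_seq))
val_flip_detected = not crc_matches(bytes(flipped_val))
corrupted_seq = struct.unpack_from("<Q", flipped_seq, 8)[0]
```

The flipped value is caught, while the flipped sequence number passes the check and silently decodes as a different number. Extending the CRC input to include seq_num would close this gap, at the cost of changing the on-disk format.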