seq_num and record_length from integrity checking; worth understanding whether this is an intentional design choice or a gap
Date: 2026-05-28
Time: 18:49
seq_num Corruption
The CRC in this WAL covers only the payload (op_type_byte + key + value) and excludes two header fields: record_length and seq_num. These are two different situations, and the answer is split: one exclusion is defensible, the other is a real gap.
In encode_record (wal.py:30-31):
crc_data = struct.pack("B", op_type_byte) + key + value
crc = zlib.crc32(crc_data) & 0xFFFFFFFF
The verification in read_record (wal.py:53-55) mirrors this exactly: it recomputes the CRC over the same three fields and rejects on mismatch.
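The symmetry between writer and reader can be sketched as follows. This is a minimal illustration that assumes nothing about the on-disk layout beyond the CRC input shown above; the helper names `payload_crc` and `verify_payload` are hypothetical and do not come from wal.py.

```python
import struct
import zlib


def payload_crc(op_type_byte: int, key: bytes, value: bytes) -> int:
    """CRC-32 over op type + key + value, the same input as the snippet above."""
    crc_data = struct.pack("B", op_type_byte) + key + value
    return zlib.crc32(crc_data) & 0xFFFFFFFF


def verify_payload(stored_crc: int, op_type_byte: int, key: bytes, value: bytes) -> bool:
    """Reader side: recompute over the same three fields and compare."""
    return payload_crc(op_type_byte, key, value) == stored_crc


stored = payload_crc(1, b"user:1", b"alice")
print(verify_payload(stored, 1, b"user:1", b"alice"))  # True: payload untouched
print(verify_payload(stored, 1, b"user:1", b"alicd"))  # False: one bit flipped in value
```

Nothing outside those three arguments can influence the result, which is why the header fields discussed next need separate treatment.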
record_length exclusion: defensible
record_length is a framing field: it is read first (wal.py:39-42) to determine how many bytes to consume for the rest of the record. If record_length is corrupted:
None at line 44). Either way, the CRC over garbage bytes would fail.
This is a standard pattern. The framing field is *implicitly* validated because any corruption to it causes the payload CRC to fail. Including it in the CRC would be belt-and-suspenders: not wrong, but not necessary for correctness.
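A small experiment can make the implicit validation concrete. The sketch below assumes a hypothetical record layout (the real layout in wal.py is not shown in this note), and its `read_record` returns a string naming the failure instead of None so that the two outcomes are distinguishable. The helper `with_length` is also invented for illustration.

```python
import io
import struct
import zlib

# Assumed layout, NOT taken from wal.py:
#   record_length:u32 | crc:u32 | seq_num:u64 | op_type:u8 | key_len:u16 | key | value
# record_length counts the bytes that follow the length field.
LENGTH = struct.Struct("<I")
FIXED = struct.Struct("<IQBH")  # crc, seq_num, op_type, key_len


def encode_record(seq_num: int, op_type_byte: int, key: bytes, value: bytes) -> bytes:
    crc = zlib.crc32(struct.pack("B", op_type_byte) + key + value) & 0xFFFFFFFF
    rest = FIXED.pack(crc, seq_num, op_type_byte, len(key)) + key + value
    return LENGTH.pack(len(rest)) + rest


def read_record(stream):
    """Return (seq_num, op_type_byte, key, value), or a string naming the failure."""
    raw = stream.read(LENGTH.size)
    if len(raw) < LENGTH.size:
        return "short read"
    (record_length,) = LENGTH.unpack(raw)
    rest = stream.read(record_length)
    if len(rest) < record_length:
        return "short read"      # inflated length runs past the end of the file
    if record_length < FIXED.size:
        return "bad frame"       # too short to hold the fixed fields
    crc, seq_num, op_type_byte, key_len = FIXED.unpack_from(rest)
    key = rest[FIXED.size:FIXED.size + key_len]
    value = rest[FIXED.size + key_len:]
    if zlib.crc32(struct.pack("B", op_type_byte) + key + value) & 0xFFFFFFFF != crc:
        return "crc mismatch"    # deflated length dropped payload bytes
    return seq_num, op_type_byte, key, value


def with_length(record: bytes, new_length: int) -> bytes:
    """Overwrite the length field, leaving every other byte untouched."""
    return LENGTH.pack(new_length) + record[LENGTH.size:]


good = encode_record(7, 1, b"k", b"value")
true_length = len(good) - LENGTH.size
print(read_record(io.BytesIO(good)))                                 # (7, 1, b'k', b'value')
print(read_record(io.BytesIO(with_length(good, true_length + 64))))  # short read
print(read_record(io.BytesIO(with_length(good, true_length - 2))))   # crc mismatch
```

The sketch covers only a single record at the end of a file. In a longer log an inflated length can swallow bytes from the following records instead of hitting end of file, so the misframing question raised for read_record still needs to be traced against the real code.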
seq_num exclusion: a real gap
seq_num is not a framing field: it does not affect how the record is parsed. A bit flip in seq_num produces a record that parses correctly, passes CRC validation, and is silently accepted with a wrong sequence number.
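The silent acceptance is easy to reproduce in isolation. The following sketch assumes a hypothetical header layout with seq_num at byte offset 8; the layout, `HEADER`, `SEQ_NUM_OFFSET`, and `decode_record` are illustrative stand-ins, not the actual definitions in wal.py.

```python
import struct
import zlib

# Assumed layout, NOT taken from wal.py:
#   record_length:u32 | crc:u32 | seq_num:u64 | op_type:u8 | key_len:u16 | key | value
HEADER = struct.Struct("<IIQBH")
SEQ_NUM_OFFSET = 8  # record_length (4 bytes) + crc (4 bytes) precede seq_num


def encode_record(seq_num: int, op_type_byte: int, key: bytes, value: bytes) -> bytes:
    # Payload-only CRC, as in the original encode path.
    crc = zlib.crc32(struct.pack("B", op_type_byte) + key + value) & 0xFFFFFFFF
    record_length = HEADER.size - 4 + len(key) + len(value)
    return HEADER.pack(record_length, crc, seq_num, op_type_byte, len(key)) + key + value


def decode_record(buf: bytes):
    """Return (seq_num, op_type_byte, key, value), or None if the payload CRC fails."""
    record_length, crc, seq_num, op_type_byte, key_len = HEADER.unpack_from(buf)
    key = buf[HEADER.size:HEADER.size + key_len]
    value = buf[HEADER.size + key_len:4 + record_length]
    if zlib.crc32(struct.pack("B", op_type_byte) + key + value) & 0xFFFFFFFF != crc:
        return None
    return seq_num, op_type_byte, key, value


record = bytearray(encode_record(5, 1, b"user:1", b"alice"))
record[SEQ_NUM_OFFSET] ^= 0x08        # flip one bit in the low byte of seq_num
print(decode_record(bytes(record)))   # (13, 1, b'user:1', b'alice')
```

The record written with sequence number 5 comes back as 13 with no error, because none of the bytes the CRC covers were touched.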
This matters because seq_num drives three critical control paths:
1. Recovery ordering (recover_seq_num, wal.py:85-96): The WAL scans all records to find max(seq_num) and resumes numbering from there. A corrupted seq_num inflated to a huge value would create a gap in the sequence space.
2. Truncation (truncate, wal.py:196): Records are kept or discarded based on rec.seq_num > up_to_seq. A corrupted seq_num could cause a record to survive truncation (if inflated) or be incorrectly discarded (if deflated).
3. Replay filtering (replay, wal.py:226): rec.seq_num <= after_seq determines which records to skip. A corrupted sequence number could cause committed data to be skipped during crash recovery.
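The three failure modes above can be sketched with plain comparisons on sequence numbers. The `Record` class and the function signatures below are simplified stand-ins invented for this illustration; the real replay path also handles committed transactions and a pending buffer, which this sketch ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    seq_num: int
    key: bytes


def next_seq_num(records):
    # Recovery: resume numbering after the largest seq_num seen.
    return max((r.seq_num for r in records), default=0) + 1


def truncate(records, up_to_seq):
    # Truncation: keep only records with seq_num > up_to_seq.
    return [r for r in records if r.seq_num > up_to_seq]


def replay(records, after_seq):
    # Replay: skip records with seq_num <= after_seq, apply the rest.
    return [r for r in records if r.seq_num > after_seq]


clean = [Record(1, b"a"), Record(2, b"b"), Record(3, b"c"), Record(4, b"d")]

# Record 3 with bit 40 of seq_num flipped upward.
inflated = [clean[0], clean[1], Record(3 ^ (1 << 40), b"c"), clean[3]]
print(next_seq_num(inflated))                  # 1099511627780
print([r.key for r in truncate(inflated, 4)])  # [b'c']

# Record 4 with bit 2 of seq_num flipped, turning 4 into 0.
deflated = [clean[0], clean[1], clean[2], Record(4 ^ 4, b"d")]
print([r.key for r in replay(clean, 2)])       # [b'c', b'd']
print([r.key for r in replay(deflated, 2)])    # [b'c']
```

In the inflated case numbering resumes above 2^40 and a record that should have been discarded survives truncation; in the deflated case the write to key `d` is dropped from replay.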
The same payload-only CRC pattern appears in the Bitcask implementation (bitcask.py:142-143), where CRC covers key + value but not key_size or value_size. The B-tree WAL (btree.py:133,175-176) checksums page_data but not seq, page_num, or data_len. This is consistent, in that all three implementations treat metadata as outside the CRC boundary, but the WAL's seq_num has the most dangerous failure mode because it controls replay and truncation correctness.
The record_length exclusion is an intentional design choice (standard framing pattern). The seq_num exclusion is a gap: including seq_num in the CRC input would cost nothing and would protect against silent ordering corruption during recovery. The fix would be a one-line change to line 30:
crc_data = struct.pack("<QB", seq_num, op_type_byte) + key + value
with the corresponding change in read_record at line 53.
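A sketch of the proposed change applied to both sides, under the same caveats as before: the header layout, `HEADER`, `SEQ_NUM_OFFSET`, `record_crc`, and `decode_record` are hypothetical, and seq_num is assumed to fit an unsigned 64-bit field as the `"<QB"` format implies.

```python
import struct
import zlib

# Assumed layout, NOT taken from wal.py:
#   record_length:u32 | crc:u32 | seq_num:u64 | op_type:u8 | key_len:u16 | key | value
HEADER = struct.Struct("<IIQBH")
SEQ_NUM_OFFSET = 8  # record_length (4 bytes) + crc (4 bytes) precede seq_num


def record_crc(seq_num: int, op_type_byte: int, key: bytes, value: bytes) -> int:
    # seq_num now participates in the checksum input.
    crc_data = struct.pack("<QB", seq_num, op_type_byte) + key + value
    return zlib.crc32(crc_data) & 0xFFFFFFFF


def encode_record(seq_num: int, op_type_byte: int, key: bytes, value: bytes) -> bytes:
    crc = record_crc(seq_num, op_type_byte, key, value)
    record_length = HEADER.size - 4 + len(key) + len(value)
    return HEADER.pack(record_length, crc, seq_num, op_type_byte, len(key)) + key + value


def decode_record(buf: bytes):
    """Return (seq_num, op_type_byte, key, value), or None if the CRC fails."""
    record_length, crc, seq_num, op_type_byte, key_len = HEADER.unpack_from(buf)
    key = buf[HEADER.size:HEADER.size + key_len]
    value = buf[HEADER.size + key_len:4 + record_length]
    if record_crc(seq_num, op_type_byte, key, value) != crc:
        return None
    return seq_num, op_type_byte, key, value


record = bytearray(encode_record(5, 1, b"user:1", b"alice"))
print(decode_record(bytes(record)))   # (5, 1, b'user:1', b'alice')
record[SEQ_NUM_OFFSET] ^= 0x08        # same single-bit flip in seq_num
print(decode_record(bytes(record)))   # None
```

One consequence to keep in mind: records written with the payload-only CRC would fail verification under the new input, so existing log files would need a format version marker or a migration step.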
write-ahead-log/wal.py:read_record - Trace how a corrupted record_length propagates: does every corruption path actually land on a CRC failure, or are there edge cases where it silently misframes?
write-ahead-log/wal.py:replay - Understand the committed-transaction replay logic and how seq_num ordering interacts with the pending buffer to determine atomicity
crc-scope-comparison-across-implementations - Compare the CRC boundary choices in bitcask, b-tree WAL, and this WAL to see if the payload-only pattern is a deliberate project convention or independent coincidence
b-tree-storage-engine/btree.py - The B-tree WAL checksums page_data only; examine whether its seq and page_num fields are vulnerable to the same silent-corruption class
wal-record-format-evolution - Consider what a v2 record format would look like: CRC over the full header, a magic byte for format versioning, and forward-compatible length-prefixed fields
wal-crc-covers-payload-only - The WAL CRC at wal.py:30-31 checksums op_type_byte + key + value only; seq_num and record_length are excluded from integrity checking
seq-num-corruption-is-silent - A bit flip in a WAL record's seq_num field will pass CRC validation and be accepted with the wrong sequence number, potentially corrupting replay, truncation, and recovery ordering
record-length-implicitly-validated - Corruption of record_length causes read_record to consume wrong bytes, which fails the payload CRC check indirectly, making explicit inclusion in the CRC unnecessary
all-three-implementations-use-payload-only-crc - The WAL, Bitcask, and B-tree storage engine all compute CRC over data payloads only, excluding their respective header/framing metadata fields