Function: writerecord in hash-index-storage/bitcask.py

Date: 2026-05-29

Time: 08:15

writerecord — Append a key-value record to the active data file

Purpose

writerecord is the single write path for all mutations in this Bitcask store. Every put and delete flows through it. It serializes a key-value pair into a binary record format (header + key + value), appends it to the currently active data file, and returns the metadata the caller needs to update the in-memory index (keydir).

This method only handles the physical write. It does not update the keydir — that's the caller's responsibility, which is a deliberate separation of concerns. The method also doesn't handle file rotation; mayberotate is called by the caller before invoking this.

Contract

Preconditions:

Postconditions:

Invariants:

Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| key | str | The key to store. Encoded to UTF-8 for serialization. No length limit is enforced; a key larger than maxfilesize would silently succeed. |
| value | str | The value to store. An empty string "" is the tombstone convention used by delete(). |

Return Value

Returns tuple[int, int, float]:

| Index | Name | Meaning |
|-------|------|---------|
| 0 | offset | Byte position where this record starts in the active file. Used as the disk pointer in keydir. |
| 1 | size | Total byte length of the record (header + key + value). Used to read the record back in readrecord. |
| 2 | ts | The time.time() timestamp captured at the start of the write. Used for conflict resolution during compaction (latest timestamp wins). |

The caller must use these to construct a KeyEntry and update keydir, or (for deletes) to pop the key from keydir.
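A minimal sketch of how a caller might consume the returned tuple. The KeyEntry field names, the file_id argument, and the apply_put / apply_delete helpers are assumptions made for illustration; the source only states that a KeyEntry is built from the metadata and that deletes pop the key.

```python
from dataclasses import dataclass


@dataclass
class KeyEntry:
    # Hypothetical field names; the real KeyEntry layout is not shown here.
    file_id: int
    offset: int
    size: int
    ts: float


def apply_put(keydir, key, file_id, meta):
    """Index a freshly written record from the (offset, size, ts) tuple."""
    offset, size, ts = meta
    keydir[key] = KeyEntry(file_id, offset, size, ts)


def apply_delete(keydir, key):
    """A tombstone was written to disk, so drop the key from the index."""
    keydir.pop(key, None)


keydir = {}
apply_put(keydir, "user:1", file_id=0, meta=(0, 27, 1_700_000_000.0))
apply_put(keydir, "user:2", file_id=0, meta=(27, 30, 1_700_000_001.0))
apply_delete(keydir, "user:1")
```

After these calls only user:2 remains in keydir, pointing at offset 27 in file 0.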

Algorithm

1. Capture timestamp: time.time() gives wall-clock seconds as a float. This happens first, before any I/O, so the timestamp reflects intent time, not write-completion time.

2. Encode key and value: Both are UTF-8 encoded to raw bytes. The byte lengths are needed for the header.

3. Pack the header: Uses struct.pack with format <dII (little-endian: 8-byte double for timestamp, two 4-byte unsigned ints for key/value sizes). This produces exactly 16 bytes (HEADER_SIZE).

4. Assemble the record: Simple concatenation of header + keybytes + valbytes. No CRC or checksum; the implementation trusts the filesystem.

5. Capture current offset: self.active_file.tell() gives the byte position where the record will land. Because the file is opened in append mode, this is always the end of the file.

6. Write: The entire record is passed to a single write() call. That call is not guaranteed to be atomic: the PIPE_BUF atomicity rule applies to pipes and FIFOs, not regular files, and Python's buffered writer may split a large record across several OS-level writes.

7. Flush: flush() pushes Python's userspace buffer to the OS. This is always done.

8. Optional fsync: If self.sync_writes is True (the default), os.fsync() forces the OS to write through to the physical disk. This is the durability guarantee: without it, a crash could lose recently written records that were still in the OS page cache.

9. Return metadata: The offset, total record size, and timestamp are returned for the caller to index.
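The nine steps above can be sketched as a standalone function. This is an illustration under stated assumptions, not the method from bitcask.py: the name write_record, the explicit active_file and sync_writes parameters, and the temporary file used for the demo are all stand-ins.

```python
import os
import struct
import tempfile
import time

HEADER_FORMAT = "<dII"                        # timestamp, key size, value size
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 8 + 4 + 4 = 16 bytes


def write_record(active_file, key, value, sync_writes=True):
    """Append one record and return (offset, size, ts)."""
    ts = time.time()                          # 1. timestamp before any I/O
    key_bytes = key.encode("utf-8")           # 2. encode key and value
    val_bytes = value.encode("utf-8")
    header = struct.pack(HEADER_FORMAT, ts, len(key_bytes), len(val_bytes))  # 3
    record = header + key_bytes + val_bytes   # 4. no checksum
    offset = active_file.tell()               # 5. end of file in append mode
    active_file.write(record)                 # 6. single write() call
    active_file.flush()                       # 7. userspace buffer -> OS
    if sync_writes:
        os.fsync(active_file.fileno())        # 8. OS cache -> disk
    return offset, len(record), ts            # 9. metadata for the index


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "active.data")
    with open(path, "ab") as f:
        first = write_record(f, "k1", "hello")
        second = write_record(f, "k2", "")    # empty value = tombstone
```

The first record is 16 + 2 + 5 = 23 bytes and starts at offset 0, so the tombstone that follows starts at offset 23 and is 18 bytes long.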

Side Effects

Error Handling

There is no explicit error handling. The following can propagate to the caller:

None of these are caught — they bubble up through put() or delete() to the application. A partial write (crash mid-write) would leave a truncated record at the end of the file, which scandatafile would hit as a short read and silently stop scanning (the if len(headerdata) < HEADER_SIZE: break guard).
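To make the truncated-tail behavior concrete, here is a rough sketch of a scanner that stops at a short header read. The function names and the in-memory buffer are hypothetical, and the extra guard on a short body read is an assumption; the source only documents the header-length check.

```python
import io
import struct

HEADER_FORMAT = "<dII"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def scan_records(f):
    """Yield (offset, key, value, ts) until EOF or a truncated record."""
    while True:
        offset = f.tell()
        header = f.read(HEADER_SIZE)
        if len(header) < HEADER_SIZE:
            break  # clean EOF or a torn header: stop silently
        ts, key_size, val_size = struct.unpack(HEADER_FORMAT, header)
        body = f.read(key_size + val_size)
        if len(body) < key_size + val_size:
            break  # assumed extra guard for a torn key/value
        key = body[:key_size].decode("utf-8")
        value = body[key_size:].decode("utf-8")
        yield offset, key, value, ts


def make_record(ts, key, value):
    k, v = key.encode("utf-8"), value.encode("utf-8")
    return struct.pack(HEADER_FORMAT, ts, len(k), len(v)) + k + v


good = make_record(1.0, "a", "1")
torn = make_record(2.0, "b", "2")[:10]  # simulate a crash mid-write
records = list(scan_records(io.BytesIO(good + torn)))
```

Only the complete record is recovered; the 10 trailing bytes are ignored without raising, which is why a torn write goes unnoticed.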

Usage Patterns

Called in exactly two places:

The caller always calls mayberotate() first to ensure the active file hasn't exceeded maxfilesize. This ordering matters — if rotation happens after the write, the record lands in an oversized file but still works.
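The rotate-then-write ordering can be illustrated with a toy model. Everything here is hypothetical: TinyStore, its in-memory buffers, and the rotation rule (rotate once the active file has reached the limit) are assumptions, since the rotation logic itself is not shown in this section.

```python
import io


class TinyStore:
    """Toy model: data files are in-memory buffers, not real files."""

    def __init__(self, max_file_size):
        self.max_file_size = max_file_size
        self.files = [io.BytesIO()]

    @property
    def active(self):
        return self.files[-1]

    def maybe_rotate(self):
        # Assumed rule: start a new file once the active one hits the limit.
        if self.active.tell() >= self.max_file_size:
            self.files.append(io.BytesIO())

    def put(self, payload):
        self.maybe_rotate()            # check first ...
        offset = self.active.tell()
        self.active.write(payload)     # ... then append
        return len(self.files) - 1, offset


store = TinyStore(max_file_size=8)
locations = [store.put(b"12345") for _ in range(4)]
sizes = [len(f.getvalue()) for f in store.files]
```

Because the size check runs before the write, a file can overshoot the limit by up to one record (10 bytes against a limit of 8 here), but each record still lands whole in a single file.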

Dependencies

| Dependency | Usage |
|------------|-------|
| time.time() | Wall-clock source for timestamps. Not monotonic: clock adjustments can produce out-of-order timestamps, which would confuse compaction's "latest wins" logic. |
| struct | Binary packing with HEADER_FORMAT = "<dII" (16 bytes). |
| os.fsync | Durability guarantee when sync_writes is enabled. |
| self.activefile | Must be an open file handle in "ab" mode. Managed by openactivefile(). |
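The time.time() caveat in the table is easiest to see with a small example. The latest_wins function below is a hypothetical stand-in for compaction's merge rule, assuming only that the record with the highest timestamp is kept per key.

```python
def latest_wins(records):
    """Keep, per key, the value whose timestamp is highest."""
    winners = {}
    for key, value, ts in records:
        if key not in winners or ts >= winners[key][1]:
            winners[key] = (value, ts)
    return {key: value for key, (value, _) in winners.items()}


# Records in write order. The wall clock stepped back by half a second
# between the two writes, so the newer write carries the smaller timestamp.
records = [("k", "old", 100.0), ("k", "new", 99.5)]
result = latest_wins(records)
```

Here result maps k to "old": the later write loses because its timestamp is lower, which is the failure mode a backwards clock adjustment would cause.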

Notable Design Choices

Beliefs