Topic: What happens when a crash occurs mid-compaction? The delete-then-rename sequence has a window where data could be lost

Date: 2026-05-28

Time: 18:25

Crash Safety During Compaction

The Core Problem

These implementations have a fundamental crash-safety gap: they delete old data files before the new compacted data is fully committed to a discoverable location, with no manifest or journal to track which files constitute the current valid state. A crash in that window means the recovery logic can't find either the old data (deleted) or the new data (not yet renamed/registered).

The Three Implementations and Their Crash Windows

1. hash-index-storage/bitcask.py — Delete, then rename

The compact() method (starting at line 194) follows this sequence:

1. Write live entries into a new compacted data file

2. Delete old data files — line 288: os.remove(data_path)

3. Delete old hint files — line 290: os.remove(hint_path)

4. Rename — line 297: os.rename(self.datapath(oldactiveid), ...)

Crash window: If the process dies between step 2 and step 4, the old segments are gone. The new compacted file exists on disk, but it may not have the file ID that recover() (line 112) expects when it calls findfileids() (line 53) to scan for *.data files. The in-memory keydir index — rebuilt from disk on startup — will be missing every key that lived in the deleted segments.

Even within step 2 alone: if there are multiple immutable segments and the crash happens after deleting only some, rebuildindex() (line 117) will scan the surviving files but skip the deleted ones. Keys whose latest version was in a deleted segment silently vanish, or revert to an older value if a surviving segment still holds one.
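The window can be reproduced with a small fault-injection model. This is a simplified sketch, not the code in bitcask.py: the key=value record format, the compacted.pending file name, and the SimulatedCrash exception are all assumptions made for illustration. The only property it shares with the description above is that recovery discovers files by scanning for *.data, and that old segments are deleted before the compacted file gets a discoverable name.

```python
import os
import tempfile


class SimulatedCrash(Exception):
    """Stands in for the process dying at a chosen point (hypothetical)."""


def write_segment(path, entries):
    # One "key=value" record per line; a stand-in for the real record format.
    with open(path, "w") as f:
        for key, value in entries.items():
            f.write(f"{key}={value}\n")


def read_segment(path):
    with open(path) as f:
        return dict(line.rstrip("\n").split("=", 1) for line in f)


def recover(directory):
    # Rebuild the index by scanning for *.data files, lowest ID first,
    # so records in later files overwrite records in earlier ones.
    index = {}
    ids = sorted(int(n[:-5]) for n in os.listdir(directory) if n.endswith(".data"))
    for file_id in ids:
        index.update(read_segment(os.path.join(directory, f"{file_id}.data")))
    return index


def compact(directory, old_ids, new_id, crash_before_rename=False):
    # Step 1: write live entries under a name the *.data scan does not match.
    live = {}
    for file_id in sorted(old_ids):
        live.update(read_segment(os.path.join(directory, f"{file_id}.data")))
    pending = os.path.join(directory, "compacted.pending")
    write_segment(pending, live)
    # Step 2: delete the old segments.
    for file_id in old_ids:
        os.remove(os.path.join(directory, f"{file_id}.data"))
    if crash_before_rename:
        raise SimulatedCrash("died between delete and rename")
    # Final step: give the compacted file a discoverable name.
    os.rename(pending, os.path.join(directory, f"{new_id}.data"))


def run_demo():
    with tempfile.TemporaryDirectory() as d:
        write_segment(os.path.join(d, "1.data"), {"a": "1", "b": "2"})
        write_segment(os.path.join(d, "2.data"), {"b": "3", "c": "4"})
        before = recover(d)
        try:
            compact(d, [1, 2], new_id=3, crash_before_rename=True)
        except SimulatedCrash:
            pass
        after = recover(d)  # what a restart would see
        return before, after


before, after = run_demo()
print("index before compaction:", before)
print("index after crash:", after)
```

In this model the compacted data is physically on disk after the crash, yet the second recovery returns an empty index, because nothing the scan recognizes refers to it.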

2. log-structured-hash-table/bitcask.py — Same pattern, same gap

The compact() method (line 219) does:

1. Merge frozen segments into a new compacted segment

2. Delete old segments — line 292: os.remove(seg_path)

3. Delete old hint files — line 295: os.remove(hint)

4. Rename active segment — line 301: os.rename(oldactivepath, newactivepath)

Crash window: Between lines 292 and 301. The _recover() method rebuilds the index by scanning whatever segment files exist on disk. If old segments are deleted but the rename hasn't happened, recovery sees a gap: segment IDs that the compacted file was supposed to replace are gone, and the compacted file may have an unexpected name or sequence number.

3. log-structured-merge-tree/lsm.py — Delete without manifest

The compact() method (line 319) writes a new merged SSTable, then at line 353: os.remove(sst.path) for each old SSTable.

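This is slightly different: there's no rename step. The new SSTable is written first, then the old ones are deleted. But there's no manifest file tracking which SSTables are current, so on recovery the LSM tree must reconstruct state from whatever files exist. If a crash happens mid-deletion, recovery finds the new merged SSTable alongside whichever old SSTables had not yet been removed, and nothing on disk records that the merged file supersedes them.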
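One way a mid-deletion crash could turn into wrong results is sketched below. It rests on assumptions the analysis above does not confirm about lsm.py: that compaction drops tombstones, that recovery loads every .sst file it finds, and that lookups consult tables newest first. The JSON file format and all function names are hypothetical.

```python
import json
import os
import tempfile

TOMBSTONE = None  # JSON null marks a deleted key in this sketch


def write_sstable(directory, seq, entries):
    path = os.path.join(directory, f"{seq:06d}.sst")
    with open(path, "w") as f:
        json.dump(entries, f, sort_keys=True)
    return path


def load_sstables(directory):
    # Recovery by directory scan: every *.sst is treated as live, newest first.
    names = sorted((n for n in os.listdir(directory) if n.endswith(".sst")),
                   reverse=True)
    tables = []
    for name in names:
        with open(os.path.join(directory, name)) as f:
            tables.append(json.load(f))
    return tables


def get(tables, key):
    for table in tables:  # newest first, so the first hit wins
        if key in table:
            return table[key]  # may be TOMBSTONE, meaning "deleted"
    return None


def merge_dropping_tombstones(tables_newest_first):
    merged = {}
    for table in reversed(tables_newest_first):  # oldest first, newer overwrite
        merged.update(table)
    return {k: v for k, v in merged.items() if v is not TOMBSTONE}


def run_demo():
    with tempfile.TemporaryDirectory() as d:
        write_sstable(d, 1, {"k": "stale", "x": "1"})
        newer = write_sstable(d, 2, {"k": TOMBSTONE})  # k was deleted later
        before = get(load_sstables(d), "k")
        # Compaction: write the merged table, then start deleting inputs.
        write_sstable(d, 3, merge_dropping_tombstones(load_sstables(d)))
        os.remove(newer)
        # Simulated crash here: table 1 is never removed.
        after = get(load_sstables(d), "k")
        return before, after


before, after = run_demo()
print("k before compaction:", before)
print("k after crash and recovery:", after)
```

Under these assumptions the deleted key comes back with its old value, because the tombstone that masked it is gone while the table holding the old value survived. The outcome depends on deletion order: had the older table been removed first, the surviving tombstone would still hide the key.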

Why This Matters: The Missing Safeguard

A safe compaction protocol, similar in outline to what production systems like LevelDB and RocksDB do, is:

1. Write new compacted file(s) to temporary names

2. fsync the new files

3. Atomically update a manifest (e.g., a CURRENT or MANIFEST file) that lists the active file set

4. fsync the manifest and its parent directory, so the manifest update itself survives a crash

5. Delete old files — now safe, since they're no longer referenced
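The write, fsync, and atomic-publish steps above can be sketched with standard-library calls. The helper names durable_replace and fsync_directory are hypothetical, and the directory fsync assumes a POSIX system (opening a directory with os.open is not supported on Windows).

```python
import os
import tempfile


def fsync_directory(directory):
    # Persist directory-entry changes such as creates and renames.
    # POSIX only: assumes a directory can be opened read-only.
    fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def durable_replace(path, data):
    """Publish `data` at `path` so a crash leaves the old or the new content."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename stays on one
    # filesystem, which os.replace needs in order to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())    # file contents reach stable storage
        os.replace(tmp_path, path)  # atomic swap of the directory entry
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    fsync_directory(directory)      # the rename itself reaches stable storage
```

The same helper would serve for both the compacted data file and the manifest. What makes the protocol safe is the ordering: the data file is published and synced before the manifest that names it.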

None of these implementations have a manifest. The recover() / rebuild_index() methods discover files by listing the directory (os.listdir), so the directory listing *is* the implicit manifest. But directory state isn't transactional — there's no way to atomically say "these 5 files are out, this 1 file is in."

The hint files (hintpath, loadhint_file) speed up recovery but don't solve the consistency problem — they're an optimization for the index rebuild, not a transactional record of which data files are valid.

What a Fix Would Look Like

The fix is a manifest file that lists every active SSTable/segment ID (LevelDB keeps this list in a MANIFEST file and uses a small CURRENT file to name the live manifest). Compaction becomes:

1. Write new compacted files and fsync them

2. Write new manifest listing the new file set (atomic via write-to-temp + os.replace)

3. Delete old files

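On crash recovery, read the manifest to know which files to load. Orphaned files (new files written but never committed to the manifest, or old files whose deletion was interrupted) get cleaned up. A file that the manifest lists but that is missing from disk indicates a bug or external damage, not a crash, because files are deleted only after the manifest stops referencing them, and it can be detected at startup.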
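A minimal sketch of that protocol follows, assuming a JSON manifest named MANIFEST and segment files ending in .data. Both choices, and every function name, are illustrative rather than taken from the three implementations. For brevity it omits the parent-directory fsync after the rename, which a durable version would need.

```python
import json
import os
import tempfile

MANIFEST = "MANIFEST"  # hypothetical file name for this sketch


def read_manifest(directory):
    try:
        with open(os.path.join(directory, MANIFEST)) as f:
            return json.load(f)["live"]
    except FileNotFoundError:
        return []  # fresh store: nothing committed yet


def commit_manifest(directory, live_files):
    # Write-to-temp then os.replace: a reader sees the old list or the new
    # one, never a partial file. (Directory fsync omitted for brevity.)
    tmp = os.path.join(directory, MANIFEST + ".tmp")
    with open(tmp, "w") as f:
        json.dump({"live": sorted(live_files)}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, os.path.join(directory, MANIFEST))


def compact(directory, new_file, old_files):
    # new_file is assumed to be fully written and fsynced already.
    live = (set(read_manifest(directory)) - set(old_files)) | {new_file}
    commit_manifest(directory, live)  # the commit point
    for name in old_files:            # safe now: no longer referenced
        try:
            os.remove(os.path.join(directory, name))
        except FileNotFoundError:
            pass


def recover(directory):
    live = read_manifest(directory)
    missing = [n for n in live
               if not os.path.exists(os.path.join(directory, n))]
    if missing:
        # Not explainable by a crash: deletes happen only after the commit.
        raise RuntimeError(f"manifest references missing files: {missing}")
    for name in os.listdir(directory):
        if name.endswith(".data") and name not in live:
            # Orphan left behind by an interrupted compaction.
            os.remove(os.path.join(directory, name))
    return live
```

A crash before commit_manifest leaves the old file set authoritative and the new file as an orphan. A crash after it leaves the new set authoritative and the undeleted old files as orphans. In both cases recover() returns a complete, consistent set.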