compact() methods maintain this invariant or have a window where reads could fail
Date: 2026-05-29
Time: 08:56
I don't have read access to the target repo (/Users/ben/git/ddia-implementations), so I'll work from the detailed entries already captured in this knowledge base. The entries contain line-by-line analysis of both compact() methods.
hash-index-storage/bitcask.py:rebuildindex — Trace what happens when recovery finds both old and merged files coexisting after a mid-compaction crash, and whether it produces a correct keydir
log-structured-hash-table/bitcask.py:_recover — Whether the recovery path can tolerate the "active segment at wrong ID" state left by a crash between Phases 4 and 5 of compaction
manifest-based-compaction — How LevelDB's MANIFEST makes the old-to-new segment swap atomic and crash-safe, and what a minimal manifest for these implementations would look like
log-structured-hash-table/bitcask.py:get — Trace the read path to see exactly which file handles are used and what happens when a handle is missing or a file doesn't exist
reference-counted-file-handles — The Erlang Bitcask implementation uses reference counting so old segment files stay open until the last reader finishes; compare with the Python implementations' approach of eagerly closing handles during compaction
compact-closes-handles-before-keydir-fully-updated — In hash-index-storage/bitcask.py, cached file readers for old immutable files are closed at the start of the merge-write phase, before all keydir entries have been updated to point to new locations, creating a window where reads would fail even without concurrent access
log-structured-compact-closes-active-file — In log-structured-hash-table/bitcask.py, compaction closes activefile during Phase 3, making all reads and writes to the active segment impossible until Phase 5 reopens it under a new name
neither-bitcask-has-atomic-switchover — Neither implementation maintains the paper's invariant that old files are deleted only after the keydir is fully updated; both interleave keydir mutation with file handle closure and file deletion
crash-between-delete-and-rename-loses-data — In both implementations, a crash after old segment files are deleted but before the active file is renamed leaves committed data permanently unrecoverable, because no manifest or log records the intended state transition
single-threaded-assumption-hides-ordering-bugs — Both implementations document "no concurrent access" as a precondition, which masks the fact that even the sequential operation ordering violates the paper's atomicity requirements and creates crash-safety gaps
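
The manifest-based-compaction and neither-bitcask-has-atomic-switchover entries can be made concrete with a small sketch. It is not taken from either bitcask.py and does not reproduce LevelDB's actual MANIFEST format. The names write_manifest, read_manifest, switch_over and recover_segments, the JSON file called MANIFEST, the ".data" suffix, and the keydir shape (key mapped to a (segment_name, offset) tuple) are all assumptions made for illustration. It also assumes every segment, including the active one, is added to the manifest when it is created.

```python
import json
import os

MANIFEST = "MANIFEST"  # hypothetical file name, not LevelDB's on-disk format


def write_manifest(data_dir, live_segments):
    """Atomically record which segment files are live."""
    final_path = os.path.join(data_dir, MANIFEST)
    tmp_path = final_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"segments": sorted(live_segments)}, f)
        f.flush()
        os.fsync(f.fileno())  # contents reach disk before the rename
    # os.replace is an atomic rename: readers see the old or the new manifest.
    # A production version would also fsync the directory on POSIX systems.
    os.replace(tmp_path, final_path)


def read_manifest(data_dir):
    """Return the list of live segment names, or [] if no manifest exists."""
    path = os.path.join(data_dir, MANIFEST)
    if not os.path.exists(path):
        return []
    with open(path) as f:
        return json.load(f)["segments"]


def switch_over(data_dir, old_segments, merged_segment, merged_locations, keydir):
    """Swap old segments for the merged one in a crash-tolerant order.

    Precondition (assumed): merged_segment is fully written and fsynced.
    """
    old = set(old_segments)
    live = (set(read_manifest(data_dir)) - old) | {merged_segment}
    write_manifest(data_dir, live)  # 1. commit point
    for key, location in merged_locations.items():  # 2. repoint the keydir
        current = keydir.get(key)
        # Skip keys that were deleted or rewritten to a newer segment.
        if current is not None and current[0] in old:
            keydir[key] = location
    for name in old:  # 3. only now delete the old files
        os.remove(os.path.join(data_dir, name))


def recover_segments(data_dir):
    """After a crash, keep only manifest-listed segments and drop leftovers."""
    live = read_manifest(data_dir)
    for name in os.listdir(data_dir):
        if name.endswith(".data") and name not in live:
            os.remove(os.path.join(data_dir, name))
    return live
```

With this ordering, a crash before step 1 leaves the old segments listed and the merged file as an unlisted leftover, while a crash after step 1 leaves the merged file listed and the old files as leftovers. In both cases recover_segments ends up with a consistent set to rebuild the keydir from, which is the property the delete-then-rename sequence described above lacks.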