Date: 2026-05-28
Time: 18:56
I don't have read access to the target repo, but the observations give enough to trace the crash windows. Let me analyze the two Bitcask variants and the LSM tree.
hash-index-storage/bitcask.py:compact - The full compaction method (lines 194–321) to see if there's any ordering or fsync between the delete and rename steps that the grep didn't capture
log-structured-hash-table/bitcask.py:_recover - How recovery (line ~69) rebuilds state from segments, and whether it can tolerate finding both old and merged segments simultaneously
manifest-based-compaction - How LevelDB's MANIFEST and RocksDB's MANIFEST track compaction state atomically, and what the minimal equivalent would look like for these implementations
log-structured-merge-tree/lsm.py:compact - The full LSM compaction (line 319+) to see whether it writes new SSTables before or after deleting old ones, and whether the SSTable metadata list is updated atomically
sstable-and-compaction/sstable.py - The stcscompact (line 357) and lcscompact (line 400) methods to compare compaction strategies and whether leveled compaction has a different crash-safety profile

bitcask-compaction-not-crash-safe - Both Bitcask implementations (hash-index-storage/bitcask.py, log-structured-hash-table/bitcask.py) delete old segments before renaming the active segment, creating a window where committed data can be permanently lost on crash
no-compaction-manifest - None of the three storage engines (hash-index, log-structured-hash-table, LSM) use a manifest or compaction log to make the segment swap atomic; segment discovery is purely filesystem-based via directory listing
lsm-compaction-duplicates-safe - The LSM tree's compaction crash window produces duplicate entries rather than data loss, because newer SSTables take read precedence over older ones
crc-does-not-prevent-atomicity-bugs - The CRC32 integrity checks in log-structured-hash-table/bitcask.py detect corrupt records within a segment but provide no protection against the cross-file atomicity problem in compaction
hint-file-orphan-risk - In hash-index-storage/bitcask.py, a crash between deleting a data file (line 288) and its hint file (line 290) leaves an orphaned hint that directs reads to a nonexistent data file
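
To make the missing-manifest finding concrete, here is a minimal sketch of what a manifest-based segment swap could look like. It is not taken from any of the three implementations, and it is much simpler than the version-edit log LevelDB and RocksDB actually keep. Everything named here is an assumption for illustration: the `MANIFEST` file name, its JSON layout, the `.seg` suffix, and the functions `write_manifest`, `read_manifest`, `commit_compaction`, and `recover`. It also assumes a POSIX filesystem, where a directory can be opened and fsynced. The idea is that the atomic rename of the manifest is the single commit point, and old segments are deleted only after it.

```python
import json
import os


def fsync_dir(path):
    """Flush directory metadata (renames, unlinks) to disk. POSIX only."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def write_manifest(data_dir, live_segments):
    """Atomically replace MANIFEST with the ordered list of live segments."""
    tmp_path = os.path.join(data_dir, "MANIFEST.tmp")
    with open(tmp_path, "w") as f:
        json.dump({"segments": list(live_segments)}, f)
        f.flush()
        os.fsync(f.fileno())  # contents are durable before the rename
    os.replace(tmp_path, os.path.join(data_dir, "MANIFEST"))  # atomic swap
    fsync_dir(data_dir)  # make the rename itself durable


def read_manifest(data_dir):
    """Return the ordered live segment list, or None if no MANIFEST exists."""
    path = os.path.join(data_dir, "MANIFEST")
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f)["segments"]


def commit_compaction(data_dir, old_segments, merged_segment):
    """Swap old segments for a merged one that is already written and fsynced."""
    old = set(old_segments)
    live, inserted = [], False
    for name in read_manifest(data_dir) or []:
        if name in old:
            if not inserted:  # merged segment takes the oldest input's slot
                live.append(merged_segment)
                inserted = True
        else:
            live.append(name)
    if not inserted:
        live.append(merged_segment)
    write_manifest(data_dir, live)  # commit point
    for name in old_segments:  # deletion happens only after the commit
        try:
            os.remove(os.path.join(data_dir, name))
        except FileNotFoundError:
            pass
    fsync_dir(data_dir)


def recover(data_dir):
    """Return live segments; delete segment files the manifest does not list."""
    live = read_manifest(data_dir)
    if live is None:
        # No manifest yet: fall back to directory listing and delete nothing.
        return sorted(n for n in os.listdir(data_dir) if n.endswith(".seg"))
    for name in os.listdir(data_dir):
        stale = name.endswith(".seg") and name not in live
        if stale or name == "MANIFEST.tmp":
            os.remove(os.path.join(data_dir, name))
    return live
```

With this ordering, a crash before the manifest rename leaves the old segments live and the merged file is discarded as an orphan on recovery. A crash after the rename leaves the merged segment live, and any old segments still on disk are cleaned up on recovery. In neither case is committed data lost. The same recovery pass could also drop hint files whose data file is not listed, which would address the orphaned-hint case.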