BottommostLevelCompaction enum and CompactionJob::isbottommostlevel flag show the production implementation of this concept
Date: 2026-05-29
Time: 09:02
The observations are insufficient to directly explain RocksDB's BottommostLevelCompaction enum or CompactionJob::isbottommostlevel flag — this codebase contains no RocksDB code, and the grep for bottom returned zero relevant hits. However, the reference implementations here demonstrate the *underlying problem* that those RocksDB constructs solve.
The key behavior lives in the compaction logic across two files:
log-structured-merge-tree/lsm.py:340 — During compaction, tombstones are removed unconditionally:
# Remove tombstones during compaction
This is the simple case: the LSM tree has a flat list of SSTables (self._sstables), and compact() merges *all* of them into one. Since there's nothing beneath the merged output, it's always safe to drop tombstones. Every compaction is implicitly a "bottommost level" compaction.
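A minimal sketch of why the flat case is safe, assuming SSTables can be modeled as plain dicts ordered oldest to newest. The names `TOMBSTONE` and `compact_all` are hypothetical and are not taken from lsm.py:

```python
# Hypothetical sentinel marking a deleted key; lsm.py may use a different marker.
TOMBSTONE = object()

def compact_all(sstables):
    """Merge every SSTable (ordered oldest to newest) into one sorted table.

    The output replaces all inputs, so no older copy of any key can survive
    elsewhere. That is what makes dropping tombstones unconditionally safe.
    """
    merged = {}
    for table in sstables:
        merged.update(table)  # newer tables overwrite older entries
    # Keys are unique, so sorting the (key, value) pairs orders by key only.
    return {k: v for k, v in sorted(merged.items()) if v is not TOMBSTONE}

old = {"a": 1, "b": 2}
new = {"b": TOMBSTONE, "c": 3}
result = compact_all([old, new])  # "b" is gone for good: nothing older remains
```

Because every input table takes part in the merge, the tombstone for "b" has nothing left to shadow once the merge completes.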
sstable-and-compaction/sstable.py: The CompactionManager introduces *leveled compaction* (line 1, lines 108–120 in tests), where SSTables are organized into levels. The test at testsstable.py:117 shows runcompaction() promoting L0 SSTables to level 1 (result[0].level == 1). The mergesstables function accepts a remove_tombstones parameter, which the test at testsstable.py:43 passes as True. This is an explicit choice about whether tombstones survive the merge.
In a leveled compaction scheme, you cannot safely remove tombstones during compaction *unless* you're compacting at the bottommost level. If you drop a tombstone at level 1, an older copy of that key might still exist at level 2 — and with the tombstone gone, the deleted key silently reappears.
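The resurrection scenario can be reproduced in a few lines. This is an illustrative sketch only: `merge`, `read`, and the `TOMBSTONE` marker are hypothetical stand-ins, not the functions in sstable.py, and each level is modeled as a single dict.

```python
TOMBSTONE = "<deleted>"  # hypothetical deletion marker

def merge(tables, remove_tombstones):
    """Merge tables ordered oldest to newest; the newest value per key wins."""
    merged = {}
    for table in tables:
        merged.update(table)
    if remove_tombstones:
        merged = {k: v for k, v in merged.items() if v != TOMBSTONE}
    return merged

def read(levels, key):
    """Search levels from shallowest (newest data) to deepest (oldest data)."""
    for table in levels:
        if key in table:
            value = table[key]
            return None if value == TOMBSTONE else value
    return None

level2 = {"user:1": "alice"}         # old copy, not part of this compaction
l0_tables = [{"user:1": TOMBSTONE}]  # newer delete waiting in L0

unsafe_l1 = merge(l0_tables, remove_tombstones=True)   # tombstone dropped
safe_l1 = merge(l0_tables, remove_tombstones=False)    # tombstone preserved

resurrected = read([unsafe_l1, level2], "user:1")    # falls through to level 2
still_deleted = read([safe_l1, level2], "user:1")    # tombstone shadows level 2
```

With the tombstone dropped, the read falls through to level 2 and returns the stale value. With it preserved, the read stops at level 1 and reports the key as deleted.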
This reference implementation sidesteps the problem in two ways:
1. lsm.py has no levels. compact() (triggered at lsm.py:316 when len(self._sstables) >= self.compaction_threshold) merges everything, so tombstone removal is always safe.
2. sstable.py exposes remove_tombstones as a caller-controlled flag rather than deriving it from level metadata. The test at testsstable.py:43 passes remove_tombstones=True explicitly during merge; there's no automatic reasoning about whether deeper levels might hold stale data.
In RocksDB, BottommostLevelCompaction is an enum, set through CompactRangeOptions::bottommost_level_compaction, that controls whether a manual compaction rewrites files already sitting in the deepest level (doing so is expensive and mainly useful for reclaiming space or applying a compaction filter), and CompactionJob::isbottommostlevel is a runtime flag that tells the compaction job "you're at the bottom, so it's safe to drop tombstones and perform other cleanup." These two constructs automate what this codebase handles manually or avoids entirely.
The missing piece in these implementations is level-aware tombstone safety. Neither implementation tracks whether a compaction output sits above other data that might contradict tombstone removal. A production system like RocksDB must answer: "Does any level below me contain keys that overlap with this compaction's key range?" If yes, tombstones must be preserved. The isbottommostlevel flag encodes exactly that answer.
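The overlap question can be sketched as a small predicate. This is not RocksDB's implementation and not code from this repository. `TableMeta`, `ranges_overlap`, and `can_drop_tombstones` are hypothetical names, and the sketch assumes each SSTable records its level and its inclusive key range.

```python
from dataclasses import dataclass

@dataclass
class TableMeta:
    """Hypothetical per-SSTable metadata: level plus inclusive key range."""
    level: int
    min_key: str
    max_key: str

def ranges_overlap(a_min, a_max, b_min, b_max):
    """Two inclusive ranges overlap when each starts before the other ends."""
    return a_min <= b_max and b_min <= a_max

def can_drop_tombstones(output_level, out_min, out_max, all_tables):
    """True when no table at a deeper level overlaps the output key range."""
    return not any(
        t.level > output_level
        and ranges_overlap(out_min, out_max, t.min_key, t.max_key)
        for t in all_tables
    )

tables = [TableMeta(level=2, min_key="m", max_key="z")]

disjoint = can_drop_tombstones(1, "a", "f", tables)     # no overlap below
overlapping = can_drop_tombstones(1, "k", "p", tables)  # level 2 may hold old keys
at_bottom = can_drop_tombstones(2, "a", "z", tables)    # nothing deeper exists
```

A caller could pass the result as the remove_tombstones argument instead of hard-coding True, which would move the safety decision from the caller into level metadata.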
lsm-flat-compaction-always-bottommost: LSMTree.compact() in lsm.py merges all SSTables into one without level hierarchy, making every compaction implicitly a bottommost-level operation where tombstone removal is always safe
sstable-merge-tombstone-flag-is-caller-controlled: mergesstables in sstable.py takes an explicit remove_tombstones boolean rather than deriving tombstone safety from level metadata, shifting correctness responsibility to the caller
leveled-compaction-promotes-to-level-1: The leveled compaction strategy in CompactionManager promotes L0 SSTables to level 1 after compaction (result[0].level == 1 in test), but the codebase does not implement deeper level hierarchies or cross-level overlap checks
no-bottommost-level-detection-exists: Neither compaction implementation contains logic to determine whether a compaction output is the bottommost level; the concept that RocksDB's isbottommostlevel flag encodes is absent from this codebase