Function: merge_sstables in sstable-and-compaction/sstable.py

Date: 2026-05-28

Time: 19:01

merge_sstables — K-Way Merge of Sorted String Tables

Purpose

merge_sstables implements the core operation behind SSTable compaction: merging multiple sorted, immutable SSTable files into a single new SSTable. This is the fundamental building block that both size-tiered (STCS) and leveled (LCS) compaction strategies delegate to.

It exists because LSM-tree storage engines accumulate SSTables over time. Reads degrade as the number of SSTables grows (each must be checked), so compaction periodically merges them to reduce the count and reclaim space from obsolete or deleted entries.

Contract

Preconditions:

Postconditions:

Invariants during execution:

Parameters

| Parameter | Type | Description |

|-----------|------|-------------|

| readers | List[SSTableReader] | The SSTables to merge. Can be empty (produces an empty output). Order doesn't matter — the heap handles sorting. |

| output_path | str | Filesystem path for the merged SSTable. Parent directory must exist. |

| remove_tombstones | bool | When True, tombstones (deletion markers with value=None) are excluded from the output. Only safe during compaction when no older SSTables reference the same keys — otherwise the tombstone would be lost and the deleted key would "reappear." Defaults to False. |

Return Value

Returns an SSTableReader opened on the newly written file. The caller can use it immediately for reads. The caller is responsible for:

Algorithm

The algorithm is a k-way merge using Python's heapq min-heap:

1. Initialization — Seed the heap with the first entry from each reader's scan() iterator. Each heap element is a 5-tuple:


(key, -timestamp, source_index, entry, iterator)

2. Main loop — Pop the smallest element from the heap:

After processing each entry, pushnext pulls the next entry from that source's iterator into the heap, maintaining the invariant of one heap entry per active source.

3. Finalization — Call writer.finish() to write the sparse index and footer, then return a reader on the new file.

Key insight: Because each SSTable is already sorted by key, and the heap always yields the globally smallest (key, -timestamp), we get a correct merge in O(N log K) time where N is total entries and K is the number of input SSTables.

Side Effects

Error Handling

This function does not catch or handle errors explicitly:

Usage Patterns

Called exclusively by CompactionManager:


# Size-tiered: merge a bucket of similarly-sized SSTables
merged = merge_sstables(bucket, self._next_path())

# Leveled: merge one SSTable with its overlapping neighbors in the next level
merged = merge_sstables(merge_set, self._next_path())
merged.level = lvl + 1

The caller always sets .level on the returned reader afterward (for leveled compaction). merge_sstables itself doesn't concern itself with levels.

Neither call site passes remove_tombstones=True, so tombstones are currently preserved through compaction. In a production system, tombstones would be removed during major compaction (when merging the oldest level) but retained otherwise to prevent ghost reads from older SSTables.

Dependencies