Function: rangescan in range-partitioning/rangepartitioning.py

Date: 2026-05-29

Time: 13:59

Partition.range_scan

Purpose

range_scan performs a sorted range query over a partition's in-memory key-value store. It returns all entries where start <= key < end (half-open interval), which is the standard convention in range-partitioned systems. This exists because range queries are the primary advantage of range partitioning over hash partitioning — co-locating lexicographically adjacent keys makes scans efficient.

Contract

Preconditions:

Postconditions:

Invariant: The result set is exactly {(k, v) | k in self._keys, start <= k < end}, ordered by key.

Parameters

| Parameter | Type | Meaning |

|-----------|------|---------|

| start | str | Inclusive lower bound of the scan range |

| end | Optional[str] | Exclusive upper bound. None means "scan to the end of the partition" |

Edge cases:

Return Value

list[tuple[str, Any]] — a list of (key, value) pairs in ascending key order. Always a fresh list; can be empty. The caller does not need to handle errors — this method cannot raise under normal conditions.

Algorithm

1. Find left boundary: bisect.bisectleft(self.keys, start) locates the index of the first key >= start. This is the inclusive start of the result window.

2. Find right boundary:

3. Slice and zip: self.keys[left:right] and self.values[left:right] extract the matching keys and values as parallel slices, then zip pairs them up and list() materializes the result.

The algorithm is O(log n) for the two binary searches plus O(k) for materializing k result pairs.

Side Effects

None. This is a pure read operation — no mutation of keys, values, or any other state.

Error Handling

No explicit error handling. The method relies on bisect_left and Python slice semantics being safe for all inputs (out-of-range slices return empty lists, not exceptions). There are no raised exceptions.

Usage Patterns

This method is called in two contexts:

1. Directly on a partition — when you know which partition to scan.

2. From RangePartitionedStore.rangescan (line ~128) — the store identifies all partitions whose key ranges overlap [startkey, endkey), then calls partition.rangescan(startkey, endkey) on each and concatenates results. Note that the store passes the *global* startkey/endkey to every partition, not the partition's own boundaries — this works because bisect_left naturally clamps: keys outside the partition simply aren't found, so the slice is empty or partial as appropriate.

Caller obligation: The caller must ensure start and end are comparable strings. Since keys is list[str], passing a non-string would cause a TypeError from the comparison operators inside bisectleft.

Dependencies

Assumptions Not Enforced by Types

1. keys is sorted. Nothing in the type system guarantees this. If someone appends directly to keys bypassing put, range_scan returns incorrect results silently.

2. keys and values are parallel. If their lengths diverge, zip silently truncates to the shorter list — you lose data without any error.

3. start < end when end is not None. No validation. Reversed ranges silently return empty results rather than raising.

4. String comparisons are sufficient for ordering. The code uses Python's default lexicographic string comparison. If keys represent numbers (e.g., "9" vs "10"), the ordering is lexicographic, not numeric — "9" > "10" is True.

Topics to Explore

Beliefs