File: secondary-index-partitioning/secondaryindexpartitioning.py

Date: 2026-05-29

Time: 09:05

secondaryindexpartitioning.py — Explanation

Purpose

This file implements the two strategies for partitioning secondary indexes described in DDIA Chapter 6: document-partitioned (local) indexes and term-partitioned (global) indexes. It provides a side-by-side comparison of the two approaches, tracking how many partitions each operation touches — the central tradeoff Kleppmann describes: scatter/gather reads vs. multi-partition writes.

The module owns the partitioning logic, index maintenance, and operation metrics. It's a teaching implementation — no persistence, no networking, no concurrency — focused on making the structural tradeoffs observable.

Key Components

Document (lines 5–9)

A thin wrapper around a primary key and a field dictionary. Notably, the DB classes don't actually use this class — they accept pk and fields separately. It exists as a domain concept but isn't wired into the API.

Partition (lines 12–23)

Represents one partition shard. Holds three data structures:

Each partition has both index structures available, but only one is populated depending on which DB class manages it.

OperationResult (lines 26–32)

Return type for all DB operations. Bundles data (query results or None for writes), partitions_touched (the key metric), and operation type. This is how the implementation makes the DDIA tradeoff measurable.

DocumentPartitionedDB (lines 35–100)

Implements the local index strategy. Each partition maintains its own secondary index covering only documents stored on that partition.

TermPartitionedDB (lines 103–230)

Implements the global index strategy. The index is itself partitioned — each index term lives on a specific partition determined by hashing or range-partitioning the term value.

compare_strategies (lines 233–261)

Utility function that runs an identical workload (document inserts + field queries) against both DB types and returns their stats side-by-side. This is the pedagogical payoff — you can see the tradeoff numerically.

Patterns

Inverted index structure: Both strategies use the same nested-dict shape {field: {value: set(pks)}} — a classic inverted index. The difference is placement (co-located with documents vs. partitioned by term).

Partition assignment via hash modulo: hash(pk) % numpartitions for documents, hash(value) % numpartitions for terms. Simple, deterministic, no coordination needed.

Stats accumulator: Every operation increments counters, enabling get_stats() to compute averages. The stats track the DDIA claim: document-partitioned queries touch all partitions; term-partitioned writes touch multiple.

Async index with explicit flush: TermPartitionedDB supports asyncindex=True, which queues index mutations in pending instead of applying them immediately. This models the real-world approach where global indexes are updated asynchronously to avoid distributed transactions on writes. flush_index() drains the queue.

Index cleanup on update/delete: Both DB classes carefully remove stale index entries before writing new ones, preventing phantom results from old field values.

Dependencies

Imports: None beyond the standard library (uses only built-in hash, dict, set, str, chr, ord).

Imported by:

Flow

Write path (document-partitioned)


put(pk, fields)
  → hash(pk) → partition_id
  → if pk exists: remove old values from local_index
  → store fields in partition.documents[pk]
  → for each indexed field: add to partition.local_index[field][value]
  → return OperationResult(partitions_touched=1)

Write path (term-partitioned)


put(pk, fields)
  → hash(pk) → data_partition_id
  → if pk exists: for each old indexed value:
      → hash(old_value) → index_partition_id (possibly different partition)
      → remove pk from that partition's global_index
  → store fields in data_partition.documents[pk]
  → for each indexed field value:
      → hash(value) → index_partition_id
      → add pk to that partition's global_index
  → return OperationResult(partitions_touched=len(all_touched))

Query path (document-partitioned) — scatter/gather


query_by_field(field, value)
  → for EACH partition:
      → check local_index[field][value]
      → collect matching PKs + documents
  → return all results, partitions_touched = num_partitions

Query path (term-partitioned) — targeted


query_by_field(field, value)
  → hash(value) → index_partition_id
  → get PKs from that partition's global_index
  → for each PK: hash(pk) → data_partition, fetch document
  → return results, partitions_touched = 1 + distinct_data_partitions

Invariants

1. PK uniqueness per DB: A PK maps to exactly one partition (via hash(pk) % n). Storing the same PK twice overwrites; it never creates duplicates across partitions.

2. Index consistency on mutation: Every put and delete removes stale index entries before adding new ones. The index is always consistent with document state (unless asyncindex=True and flushindex hasn't been called).

3. Document-partitioned queries always touch all partitions: querybyfield in DocumentPartitionedDB unconditionally iterates every partition — there's no short-circuit.

4. Term partition assignment is deterministic: Given the same value and numpartitions, term_partition always returns the same partition. No rebalancing or dynamic routing.

5. Empty index cleanup: When the last PK is removed from an index entry (set becomes empty), the entry is deleted from the dict rather than left as an empty set.

Error Handling

Essentially none. The code trusts its callers:

This is appropriate for a teaching implementation — errors surface as standard Python exceptions rather than being wrapped.

Topics to Explore

Beliefs