Original source: topic-event-sourcing-snapshots

Topic: How snapshot-accelerated reconstruction would work: load snapshot, then replay only events after the snapshot position

Date: 2026-05-29

Time: 11:08

Snapshot-Accelerated Reconstruction in the Event Store

The Core Problem

In event sourcing, the canonical way to get current state is to replay *every* event from the beginning. For a stream with thousands of events, this is expensive. Snapshot-accelerated reconstruction solves this by periodically saving the derived state at a known position, then only replaying the events that came *after* that checkpoint.

How It Works: Two-Phase Reconstruction

The mechanism lives in the Projection class (event-sourcing-store/event_store.py).

Phase 1: Save a Snapshot

save_snapshot() (line 193) captures two things as a pair:


self._store._snapshots[self.name] = {
    "state": copy.deepcopy(self._state),   # the materialized view
    "position": self._position,             # which event we're caught up to
}

The copy.deepcopy is critical: it creates an independent copy of the state dict, so later mutations during catch_up() don't corrupt the snapshot. The position records the event_id of the last event folded into this state.
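To make the aliasing risk concrete, here is a minimal sketch contrasting a snapshot that shares the live state with one that copies it. The nested state shape and the variable names are illustrative assumptions, not taken from event_store.py.

```python
import copy

# Hypothetical live projection state; nesting makes the aliasing easy to see.
live_state = {"account:1": {"balance": 130}}

# Snapshot taken by reference: shares its objects with the live state.
aliased_snapshot = {"state": live_state, "position": 5}

# Snapshot taken with deepcopy: fully independent of the live state.
safe_snapshot = {"state": copy.deepcopy(live_state), "position": 5}

# Folding a later event (withdraw 20) mutates the live state in place.
live_state["account:1"]["balance"] -= 20

print(aliased_snapshot["state"]["account:1"]["balance"])  # 110: snapshot corrupted
print(safe_snapshot["state"]["account:1"]["balance"])     # 130: snapshot intact
```

For a flat dict of numbers a shallow copy would be enough; deepcopy also protects nested values like the one assumed here.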

Phase 2: Load Snapshot + Replay Tail

When a new Projection instance needs to reconstruct state, it calls load_snapshot() (line 202), which restores both state and position from the snapshot. Then catch_up() (line 172) does the key trick:


events = self._store.read_all(from_position=self._position + 1)

Because _position was set to the snapshot's position (say, event 5), read_all starts at event 6, skipping every event the snapshot already incorporates. The projection then folds only the tail of the event log into the already-initialized state.
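The tail read can be sketched with a list-backed store. This is only an illustration: TinyStore, its append signature, and the event amounts are hypothetical, and it assumes event ids are 1-based and contiguous. The real read_all in event_store.py may behave differently.

```python
class TinyStore:
    """Hypothetical stand-in for the event store: an append-only list."""

    def __init__(self):
        self._events = []

    def append(self, event_type, amount):
        # Assumption: event ids are assigned sequentially, starting at 1.
        event = {"event_id": len(self._events) + 1, "type": event_type, "amount": amount}
        self._events.append(event)
        return event

    def read_all(self, from_position=1):
        # Return only the events at or after from_position.
        return [e for e in self._events if e["event_id"] >= from_position]


store = TinyStore()
for event_type, amount in [("deposit", 100), ("deposit", 50), ("withdraw", 30),
                           ("deposit", 20), ("withdraw", 10), ("withdraw", 20)]:
    store.append(event_type, amount)

snapshot_position = 5  # the snapshot already covers events 1..5
tail = store.read_all(from_position=snapshot_position + 1)
print([e["event_id"] for e in tail])  # [6]
```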

The Test That Demonstrates It

test_snapshot_save_load_and_resume (test file, line 124) walks through the full cycle:

1. Seed 5 events → balance reaches 130

2. proj.catch_up() → processes all 5, position = 5

3. proj.save_snapshot() → freezes {"state": {"account:1": 130}, "position": 5}

4. Create a fresh p2, call p2.load_snapshot() → state is 130, position is 5 *without replaying anything*

5. Append event 6 (withdraw 20)

6. p2.catch_up() → replays *only* event 6, arriving at 110

The reconstruction cost dropped from 6 events to 1.
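The six steps can be reproduced with a compact stand-in for the store and projection. This is a sketch under assumptions, not the code in event_store.py: the event shape, the append signature, the individual amounts (chosen so that five events sum to 130), and the replayed counter are all hypothetical.

```python
import copy


class EventStore:
    """Hypothetical in-memory store; not the real event_store.py."""

    def __init__(self):
        self._events = []
        self._snapshots = {}

    def append(self, stream, event_type, amount):
        self._events.append({
            "event_id": len(self._events) + 1,  # assumed 1-based, contiguous ids
            "stream": stream, "type": event_type, "amount": amount,
        })

    def read_all(self, from_position=1):
        return [e for e in self._events if e["event_id"] >= from_position]


class Projection:
    def __init__(self, name, store):
        self.name = name
        self._store = store
        self._state = {}
        self._position = 0
        self.replayed = 0  # instrumentation added for this sketch only

    def catch_up(self):
        # Fold only the events after the current position.
        for event in self._store.read_all(from_position=self._position + 1):
            sign = 1 if event["type"] == "deposit" else -1
            key = event["stream"]
            self._state[key] = self._state.get(key, 0) + sign * event["amount"]
            self._position = event["event_id"]
            self.replayed += 1

    def save_snapshot(self):
        self._store._snapshots[self.name] = {
            "state": copy.deepcopy(self._state),
            "position": self._position,
        }

    def load_snapshot(self):
        snap = self._store._snapshots[self.name]
        self._state = copy.deepcopy(snap["state"])
        self._position = snap["position"]


store = EventStore()
# Assumed amounts; the walkthrough only says five events bring the balance to 130.
for event_type, amount in [("deposit", 100), ("deposit", 50), ("withdraw", 30),
                           ("deposit", 20), ("withdraw", 10)]:
    store.append("account:1", event_type, amount)

proj = Projection("balances", store)
proj.catch_up()        # steps 1-2: fold all five events
proj.save_snapshot()   # step 3

p2 = Projection("balances", store)
p2.load_snapshot()     # step 4: state and position restored, nothing replayed
store.append("account:1", "withdraw", 20)  # step 5
p2.catch_up()          # step 6: folds only event 6

print(p2._state, p2._position, p2.replayed)  # {'account:1': 110} 6 1
```

The replayed counter makes the saving visible: the first projection folded five events, while the one restored from the snapshot folded one.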

Automatic Snapshotting

The Projection constructor accepts snapshot_interval (line 157). During catch_up(), a counter tracks events processed since the last snapshot (lines 179–182):


self._events_since_snapshot += 1
if self._snapshot_interval and self._events_since_snapshot >= self._snapshot_interval:
    self.save_snapshot()
    self._events_since_snapshot = 0

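This keeps the tail length bounded: of the events the projection has already processed, fewer than snapshot_interval sit after the latest snapshot, so at most that many need replaying on reconstruction. Events appended after the projection last caught up add to the tail until it runs again.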
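A small simulation of the counter shows where snapshots land and how long the tail is afterwards. The function name and its parameters are hypothetical, and it assumes the counter starts at zero and events are processed one at a time.

```python
def snapshot_positions(total_events, snapshot_interval):
    """Simulate the counter logic; return the positions where a snapshot is taken."""
    positions = []
    events_since_snapshot = 0
    for position in range(1, total_events + 1):
        events_since_snapshot += 1
        if snapshot_interval and events_since_snapshot >= snapshot_interval:
            positions.append(position)  # stands in for save_snapshot()
            events_since_snapshot = 0
    return positions


positions = snapshot_positions(total_events=10, snapshot_interval=3)
tail_length = 10 - positions[-1]  # events after the latest snapshot
print(positions, tail_length)  # [3, 6, 9] 1
```

With a falsy interval (None or 0) the condition never fires, so no automatic snapshots are taken.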

Contrast with Full Replay

The reconstruct_state function (used in test_temporal_query, test line 107) is the non-accelerated path: it replays all events from the start up to a given event_id. This is the right approach for *temporal queries* ("what was the balance after event 2?") but too expensive for "what is the balance *now*" on a long-lived stream.
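A full-replay temporal query might look like the sketch below. The signature, the event shape, and the amounts are assumptions made for illustration; the source names the function but does not show its parameters.

```python
def reconstruct_state(events, up_to_event_id):
    """Fold events from the start, stopping after up_to_event_id (hypothetical signature)."""
    state = {}
    for event in events:
        if event["event_id"] > up_to_event_id:
            break  # assumes events are ordered by event_id
        sign = 1 if event["type"] == "deposit" else -1
        key = event["stream"]
        state[key] = state.get(key, 0) + sign * event["amount"]
    return state


# Assumed amounts, chosen so that five events sum to 130.
amounts = [("deposit", 100), ("deposit", 50), ("withdraw", 30),
           ("deposit", 20), ("withdraw", 10)]
events = [
    {"event_id": i, "stream": "account:1", "type": t, "amount": a}
    for i, (t, a) in enumerate(amounts, start=1)
]

print(reconstruct_state(events, up_to_event_id=2))  # {'account:1': 150}
print(reconstruct_state(events, up_to_event_id=5))  # {'account:1': 130}
```

Every call starts from an empty state, so the cost grows with the target event_id. That is acceptable for occasional historical queries but not for repeated reads of current state.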

A Subtle Design Choice

Snapshots are stored in memory on the EventStore instance (self._store._snapshots), not persisted to disk. This means snapshots survive within a process lifetime but not across restarts: a restarted system replays from scratch once, takes a snapshot, and subsequent reconstructions are fast. A production system would persist snapshots alongside the event log.
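One way persistence could be added is to write each snapshot as a JSON file and swap it into place with a rename. This is a sketch under assumptions, not part of the described code: the function names and file layout are hypothetical, and it assumes the projection state is JSON-serializable.

```python
import json
import os
import tempfile


def persist_snapshot(path, name, state, position):
    """Write a snapshot to a temp file, then rename it over the target."""
    payload = {"name": name, "state": state, "position": position}
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(payload, f)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before the rename
    os.replace(tmp_path, path)  # readers see the old file or the new one, never a partial write


def restore_snapshot(path):
    """Return (state, position), or an empty state at position 0 if no snapshot exists."""
    if not os.path.exists(path):
        return {}, 0
    with open(path) as f:
        payload = json.load(f)
    return payload["state"], payload["position"]


with tempfile.TemporaryDirectory() as tmp_dir:
    snapshot_path = os.path.join(tmp_dir, "balances.snapshot.json")
    persist_snapshot(snapshot_path, "balances", {"account:1": 130}, 5)
    state, position = restore_snapshot(snapshot_path)
    missing = restore_snapshot(os.path.join(tmp_dir, "absent.json"))

print(state, position)  # {'account:1': 130} 5
```

Falling back to an empty state at position 0 when no file exists keeps the restart path identical to a first run: the projection simply replays from the beginning.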