Date: 2026-05-29
Time: 11:08
In event sourcing, the canonical way to get current state is to replay *every* event from the beginning. For a stream with thousands of events, this is expensive. Snapshot-accelerated reconstruction solves this by periodically saving the derived state at a known position, then only replaying the events that came *after* that checkpoint.
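To make the cost difference concrete, here is a minimal sketch that is not taken from event_store.py. The tuple-shaped events, the `apply_event` helper, and the 1,000-event stream are all assumptions made for illustration; the only idea borrowed from the text is "restore a saved state, then replay what came after it".

```python
# Hypothetical event shape: (event_id, kind, amount), with contiguous ids starting at 1.
EVENTS = [(i, "deposit", 10) for i in range(1, 1001)]

def apply_event(balance, event):
    """Fold one event into the running balance."""
    _event_id, kind, amount = event
    return balance + amount if kind == "deposit" else balance - amount

def full_replay(events):
    """Rebuild state from scratch; returns (balance, events_applied)."""
    balance = 0
    for event in events:
        balance = apply_event(balance, event)
    return balance, len(events)

def replay_from_snapshot(events, snapshot):
    """Start from the snapshot and fold only the events after its position."""
    balance = snapshot["state"]
    # Ids are contiguous and 1-based, so the tail starts at index == position.
    tail = events[snapshot["position"]:]
    for event in tail:
        balance = apply_event(balance, event)
    return balance, len(tail)

# Pretend a snapshot was taken after event 990.
snapshot = {"state": full_replay(EVENTS[:990])[0], "position": 990}

full = full_replay(EVENTS)                     # applies 1000 events
fast = replay_from_snapshot(EVENTS, snapshot)  # applies 10 events
```

Both paths arrive at the same balance; only the number of events folded differs.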
The mechanism lives in the Projection class (event-sourcing-store/event_store.py).
save_snapshot() (line 193) captures two things as a pair:
    self._store._snapshots[self.name] = {
        "state": copy.deepcopy(self._state),  # the materialized view
        "position": self._position,           # which event we're caught up to
    }
The copy.deepcopy is critical: it creates an independent copy of the state dict, so later mutations during catch_up() don't corrupt the snapshot. The position records the event_id of the last event folded into this state.
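The aliasing problem that deepcopy avoids can be shown in isolation. This is a standalone sketch, not code from the repository; the variable names `aliased` and `isolated` are made up, and the balance and position values simply echo the example numbers used in this explanation.

```python
import copy

state = {"account:1": 130}
position = 5

# Snapshot that stores a reference to the live dict (the bug deepcopy prevents).
aliased = {"state": state, "position": position}
# Snapshot that stores an independent copy, as save_snapshot() does.
isolated = {"state": copy.deepcopy(state), "position": position}

# Simulate a later catch-up folding a "withdraw 20" event into the live state.
state["account:1"] -= 20
position = 6

# The aliased snapshot now claims position 5 but holds the balance as of event 6.
aliased_balance = aliased["state"]["account:1"]
# The isolated snapshot still pairs position 5 with the balance as of event 5.
isolated_balance = isolated["state"]["account:1"]
```

For a flat dict a shallow copy would also work, but deepcopy stays correct if the state ever holds nested structures.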
When a new Projection instance needs to reconstruct state, it calls load_snapshot() (line 202), which restores both state and position from the snapshot. Then catch_up() (line 172) does the key trick:
    events = self._store.read_all(from_position=self._position + 1)
Because position was set to the snapshot's position (say, event 5), read_all starts at event 6, skipping every event the snapshot already incorporates. The projection then folds only the tail of the event log into the already-initialized state.
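The `from_position` behaviour described above can be sketched with a toy store. `TinyStore`, its dict-shaped events, and the seeded amounts are assumptions for illustration; the real read_all in event_store.py may differ in detail. The only property relied on is the one stated in the text: from_position is inclusive, so position + 1 yields the first unseen event.

```python
class TinyStore:
    """Toy append-only log with 1-based, contiguous event ids (an assumption)."""

    def __init__(self):
        self._events = []

    def append(self, kind, amount):
        event = {"event_id": len(self._events) + 1, "kind": kind, "amount": amount}
        self._events.append(event)
        return event

    def read_all(self, from_position=1):
        # Inclusive lower bound: from_position=6 returns event 6 onwards.
        return [e for e in self._events if e["event_id"] >= from_position]

store = TinyStore()
seed = [("deposit", 100), ("deposit", 50), ("withdraw", 30),
        ("deposit", 20), ("withdraw", 10), ("withdraw", 20)]
for kind, amount in seed:
    store.append(kind, amount)

snapshot_position = 5  # the snapshot already covers events 1..5
tail = store.read_all(from_position=snapshot_position + 1)
tail_ids = [e["event_id"] for e in tail]
```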
test_snapshot_save_load_and_resume (test file, line 124) walks through the full cycle:
1. Seed 5 events → balance reaches 130
2. proj.catch_up() → processes all 5, position = 5
3. proj.save_snapshot() → freezes {state: {"account:1": 130}, position: 5}
4. Create a fresh p2, call p2.load_snapshot() → state is 130, position is 5 *without replaying anything*
5. Append event 6 (withdraw 20)
6. p2.catch_up() → replays *only* event 6, arriving at 110
The reconstruction cost dropped from 6 events to 1.
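The six steps can be reproduced end to end with a stripped-down store and projection. This is a sketch under assumptions, not the repository's code: `MiniStore`, `MiniProjection`, the `replayed` counter, and the five seed amounts (chosen only so that they sum to 130) are all invented here. The method names and the snapshot layout follow the snippets quoted above.

```python
import copy

class MiniStore:
    def __init__(self):
        self._events = []
        self._snapshots = {}

    def append(self, account, delta):
        event_id = len(self._events) + 1  # 1-based ids (assumption)
        self._events.append({"event_id": event_id, "account": account, "delta": delta})

    def read_all(self, from_position=1):
        return [e for e in self._events if e["event_id"] >= from_position]

class MiniProjection:
    def __init__(self, name, store):
        self.name = name
        self._store = store
        self._state = {}
        self._position = 0
        self.replayed = 0  # instrumentation for this sketch only

    def catch_up(self):
        for event in self._store.read_all(from_position=self._position + 1):
            key = event["account"]
            self._state[key] = self._state.get(key, 0) + event["delta"]
            self._position = event["event_id"]
            self.replayed += 1

    def save_snapshot(self):
        self._store._snapshots[self.name] = {
            "state": copy.deepcopy(self._state),
            "position": self._position,
        }

    def load_snapshot(self):
        snap = self._store._snapshots[self.name]
        self._state = copy.deepcopy(snap["state"])
        self._position = snap["position"]

store = MiniStore()
for delta in (100, 50, -30, 20, -10):      # step 1: five events, balance 130
    store.append("account:1", delta)

proj = MiniProjection("balances", store)
proj.catch_up()                             # step 2
proj.save_snapshot()                        # step 3

p2 = MiniProjection("balances", store)
p2.load_snapshot()                          # step 4: no replay yet
balance_after_load = p2._state["account:1"]

store.append("account:1", -20)              # step 5
p2.catch_up()                               # step 6: replays only event 6
```

The `replayed` counter makes the saving visible: the first projection folds five events, while the one restored from the snapshot folds one.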
The Projection constructor accepts snapshot_interval (line 157). During catch_up(), a counter tracks events processed since the last snapshot (lines 179–182):
    self._events_since_snapshot += 1
    if self._snapshot_interval and self._events_since_snapshot >= self._snapshot_interval:
        self.save_snapshot()
        self._events_since_snapshot = 0
This keeps the tail length bounded: once the projection has caught up, at most snapshot_interval events (plus any appended since its last catch_up()) need replaying on reconstruction.
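The counter logic can be exercised on its own. The function below is a hypothetical, free-standing rewrite of the loop (the real code is a method on Projection); the `(event_id, delta)` tuples, the `snapshots` list, and the interval of 4 are assumptions chosen to make the bound easy to see.

```python
import copy

def catch_up(events, state, position, snapshots, snapshot_interval=None):
    """Fold unseen events into state, snapshotting every snapshot_interval events."""
    events_since_snapshot = 0
    for event_id, delta in events:
        if event_id <= position:
            continue  # already incorporated
        state["balance"] = state.get("balance", 0) + delta
        position = event_id
        events_since_snapshot += 1
        if snapshot_interval and events_since_snapshot >= snapshot_interval:
            snapshots.append({"state": copy.deepcopy(state), "position": position})
            events_since_snapshot = 0
    return position

events = [(i, 1) for i in range(1, 11)]  # ten events, each adding 1
snapshots = []
state = {}
final_position = catch_up(events, state, 0, snapshots, snapshot_interval=4)

snapshot_positions = [s["position"] for s in snapshots]
# Events a fresh projection would replay after loading the latest snapshot.
tail_length = final_position - snapshots[-1]["position"]
```

With ten events and an interval of 4, snapshots land at positions 4 and 8, leaving a tail of two events.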
The reconstruct_state function (used in test_temporal_query, test line 107) is the non-accelerated path: it replays all events from the start up to a given event_id. This is the right approach for *temporal queries* ("what was the balance after event 2?") but too expensive for "what is the balance *now*" on a long-lived stream.
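A temporal query of this kind might look like the sketch below. The signature `reconstruct_state(events, up_to_event_id)`, the dict-shaped events, and the seed amounts are assumptions; the actual function in the repository may take different arguments. The sketch also assumes the events arrive in ascending event_id order.

```python
def reconstruct_state(events, up_to_event_id):
    """Replay from the beginning, stopping after the event with the given id."""
    state = {}
    for event in events:
        if event["event_id"] > up_to_event_id:
            break  # events are ordered, so nothing later can qualify
        key = event["account"]
        state[key] = state.get(key, 0) + event["delta"]
    return state

# Hypothetical five-event history for one account.
EVENTS = [
    {"event_id": i + 1, "account": "account:1", "delta": delta}
    for i, delta in enumerate((100, 50, -30, 20, -10))
]

balance_after_event_2 = reconstruct_state(EVENTS, 2)["account:1"]
balance_now = reconstruct_state(EVENTS, len(EVENTS))["account:1"]
```

Asking for the latest state this way always costs a full replay, which is the case the snapshot path is meant to avoid.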
Snapshots are stored in memory on the EventStore instance (self._store._snapshots), not persisted to disk. They therefore survive within a process lifetime but not across restarts: a restarted system replays from scratch once, takes a snapshot, and subsequent reconstructions are fast. A production system would persist snapshots alongside the event log.
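One possible shape for persisted snapshots is a JSON file written atomically. Nothing below exists in event_store.py: `persist_snapshot`, `load_persisted_snapshot`, and the file name are hypothetical, and the sketch assumes the projection state is JSON-serializable. It is meant only to show the state and position being written and restored together.

```python
import json
import os
import tempfile

def persist_snapshot(path, name, state, position):
    """Write the snapshot to a temp file, then atomically swap it into place."""
    payload = {"name": name, "state": state, "position": position}
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)
    os.replace(tmp_path, path)  # readers never observe a half-written file

def load_persisted_snapshot(path):
    """Return (state, position); fall back to an empty state at position 0."""
    try:
        with open(path, encoding="utf-8") as handle:
            payload = json.load(handle)
    except FileNotFoundError:
        return {}, 0  # no snapshot yet: the caller replays from the start
    return payload["state"], payload["position"]

workdir = tempfile.mkdtemp()
snapshot_path = os.path.join(workdir, "balances.snapshot.json")

cold_start = load_persisted_snapshot(snapshot_path)
persist_snapshot(snapshot_path, "balances", {"account:1": 130}, 5)
restored = load_persisted_snapshot(snapshot_path)
```

Writing state and position in a single file keeps them consistent with each other, mirroring the paired dict that save_snapshot() stores in memory.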