Function: getproof in merkle-tree/merkletree.py

Date: 2026-05-29

Time: 11:46

MerkleTree.get_proof — Merkle Inclusion Proof Generator

Purpose

get_proof constructs a Merkle inclusion proof (also called an authentication path) for a specific leaf. The proof is the minimal set of sibling hashes needed to recompute the root hash starting from that leaf. A verifier who holds the root hash can use this proof to confirm that a particular data block exists in the tree without needing the entire tree — this is the core mechanism behind efficient data verification in systems like certificate transparency logs, BitTorrent, and Dynamo-style anti-entropy protocols.

Contract

Parameters

| Parameter | Type | Description |

|-----------|------|-------------|

| index | int | Zero-based index into the *real* leaves (not the padded array). Must satisfy 0 <= index < self.leafcount. |

Edge cases: negative indices are explicitly rejected (not treated as Python-style negative indexing). An index equal to leafcount raises even though the backing array has padded slots beyond it.

Return Value

Returns a MerkleProof dataclass with:

The caller must understand that the proof is a snapshot — if update_leaf is called after proof generation, the proof becomes invalid against the new root.

Algorithm

1. Bounds check — reject out-of-range indices immediately.

2. Map logical index to array index — the tree is stored as a flat array in level-order. Leaves start at position paddedsize - 1, so leaf i lives at array index paddedsize - 1 + i.

3. Collect siblings bottom-up — walk from the leaf toward the root:

4. Package the proof — wrap the leaf hash, siblings list, and current root hash into a MerkleProof.

The key insight in step 3: the direction label tells the verifier how to reconstruct each internal hash. If the sibling is on the "right", the verifier computes H(current || sibling). If on the "left", it computes H(sibling || current). This matches how verify_proof consumes the proof.

Side Effects

None. This is a pure read operation — no mutations to hashes, data, or any other state.

Error Handling

Usage Patterns

Typical usage pairs getproof with verifyproof:


tree = MerkleTree([b"alice", b"bob", b"carol"])
proof = tree.get_proof(1)  # proof for "bob"

# Later, possibly on a different machine:
assert MerkleTree.verify_proof(b"bob", proof)

Also used in KeyRangeMerkleTree.getkeyproof, which translates a key lookup into a leaf index and delegates here.

Caller obligations:

Dependencies

Assumptions Not Enforced by the Type System

1. index is treated as a non-negative integer — Python's int permits arbitrary values; only the runtime bounds check catches violations.

2. The internal hashes array is assumed to be fully populated — there is no check for empty strings or sentinel values. If the tree were partially constructed, getproof would silently return a proof with invalid hashes.

3. Padding leaves use EMPTYHASH — the proof may include EMPTYHASH siblings for trees whose leaf count isn't a power of 2. This is correct but means a verifier could construct a valid proof for "empty" data at a padding index if the bounds check were bypassed.

Beliefs