{"results":[{"id":"async-index-defers-mutations","text":"With `async_index=True`, `TermPartitionedDB` queues index operations in `_pending` instead of applying them immediately; `flush_index()` must be called to drain the queue, modeling asynchronous global index updates.","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"bitcask-compact-not-crash-safe","text":"Hash-index-storage compaction is non-atomic with no error handling: a crash between deleting old files and completing the rewrite can leave the store in an unrecoverable state with no rollback mechanism.","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"bitcask-compact-writes-hint-files","text":"Compaction produces `.hint` files alongside each merged data file (including mid-compaction when a merged file hits `max_file_size`), enabling O(keys) index rebuilds instead of O(records) full scans on next startup.","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"bitcask-crash-recovery-without-hints","text":"`BitcaskStore` can rebuild its in-memory index by scanning `.data` files alone when `.hint` files are missing, producing identical read results to a clean startup with hint files present.","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"bitcask-get-single-disk-seek","text":"`get()` performs exactly one disk seek per call; the key-to-offset mapping is resolved entirely in memory via `self._index`, making reads O(1) index lookup plus one disk read","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"bitcask-hint-files-exclude-tombstones","text":"`create_hint_files()` skips tombstone records and only emits entries whose index currently points to that segment+offset, so hint-based recovery cannot distinguish \"key was deleted\" from \"key never existed in this segment\"","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"bitcask-hint-files-skip-scan","text":"When a `.hint` file exists for a file ID, `_rebuild_index` loads it instead of scanning the data file; hint files have no checksum validation, so they must be written atomically with their data files during compaction.","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"bitcask-keydir-is-sole-index","text":"The in-memory `keydir` dict is the only index; every live key has exactly one `KeyEntry` pointing to its most recent non-tombstone record on disk.","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"bitcask-no-checksum-validation","text":"Records have no CRC or integrity checksum; the only corruption guard is the `assert read_key == key` in `get()`, which catches index/data mismatches but not bit-rot.","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"bitcask-rebuild-sorts-ascending","text":"`_rebuild_index` sorts file IDs ascending before scanning so that newer records overwrite older ones in the keydir, enforcing last-write-wins semantics.","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"bitcask-recovery-order-is-ascending","text":"Recovery scans segments in ascending ID order so that newer writes overwrite older index entries, and the highest-ID segment becomes the active segment for appending","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"bitcask-tombstone-invisible-to-get","text":"`get()` never encounters tombstone records because `delete()` removes the key from `self._index`; tombstone handling is solely a recovery and compaction concern","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"bitcask-write-record-no-index-update","text":"`_write_record` only appends to disk and returns a byte offset; it never modifies `self._index`, leaving index management entirely to callers (`put`, `delete`, `compact`)","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"both-storage-paradigms-hit-scalability-walls","text":"Both storage paradigms in the reference implementations exhibit fundamental scalability constraints: the hash index requires all keys in RAM (making dataset size directly bound by available memory with no spill-to-disk fallback), while the LSM tree scans every SSTable on negative lookups because the correctly-implemented Bloom filter module is never wired into the read path.","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"btree-delete-cleanup-depth2-only","text":"Empty leaf cleanup only triggers when the current internal node is at depth 2 (direct parent of leaves) and the empty child is not the leftmost; empty leaves at depth > 2 or at index 0 are silently left in place","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"btree-traversal-is-forward-only","text":"B-tree sequential access is strictly forward-only: iteration descends the left spine (following children at index zero at every internal level) to reach the leftmost leaf, then walks the forward-only sibling chain, with no backward pointer, reverse iterator, or random leaf access mechanism.","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"cdc-backbone-both-heuristic-and-insufficient","text":"The CDC backbone that all derived systems depend on has two independent reliability gaps: event type semantics are determined by reconstruction heuristics rather than explicit markers (insert vs update distinguished by old_value presence, tombstones reported as None, snapshots use sentinel sequence numbers), AND the consistency requirements of derived systems (explicit flush to make mutations visible, old values for index maintenance) aren't reliably met by the CDC infrastructure.","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"cdc-old-value-required-for-index-consistency","text":"`SecondaryIndex.process_event` depends on `CDCEvent.old_value` to remove stale index entries during updates; without before-images, incremental index maintenance produces phantom references","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"cdc-search-index-full-reindex-on-update","text":"`SearchIndex` removes all old tokens and re-adds all new tokens on every UPDATE event, even if only non-indexed columns changed — correct but not optimized for partial changes.","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"compact-closes-handles-before-keydir-fully-updated","text":"In `hash-index-storage/bitcask.py`, cached file readers for old immutable files are closed at the start of the merge-write phase, before all keydir entries have been updated to point to new locations, creating a window where reads would fail even without concurrent access","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null}],"count":120,"limit":20,"offset":0}