{"results":[{"id":"all-three-implementations-use-payload-only-crc","text":"The WAL, Bitcask, and B-tree storage engine all compute CRC over data payloads only, excluding their respective header/framing metadata fields — a consistent pattern across the repo","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"allocate-page-no-initialization","text":"`allocate_page` returns a page number containing whatever stale data was previously on disk; the caller must overwrite it with valid page content before the next WAL commit","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"avro-skip-is-recursive-not-bytejump","text":"The decoder's `_skip` method discards unwanted fields by recursively walking the writer schema rather than using byte-size jumps, which works correctly but requires understanding the element schema","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"batch-mode-defers-fsync","text":"In `batch` sync mode, individual WAL appends do not fsync until `_write_count` reaches `_batch_sync_count` (default 100), leaving up to 99 records vulnerable to crash loss","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"batch-write-count-not-reset-on-forced-sync","text":"The WAL `_write_count` counter only resets when the batch threshold triggers a sync (line 133), not when a forced sync occurs, which may cause counter drift between forced and threshold-triggered syncs","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"binary-formats-rigid-across-entire-storage-stack","text":"The entire storage stack uses rigid binary formats that preclude both forward evolution and post-corruption recovery: WAL records are contiguously packed with no block alignment or version negotiation preventing resync after mid-file corruption, and SSTables lack per-entry checksums and efficient skip structures — neither layer can be upgraded in place or self-repaired after partial damage.","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"both-storage-paradigms-hit-scalability-walls","text":"Both storage paradigms in the reference implementations exhibit fundamental scalability constraints: the hash index requires all keys in RAM (making dataset size directly bound by available memory with no spill-to-disk fallback), while the LSM tree scans every SSTable on negative lookups because the correctly-implemented Bloom filter module is never wired into the read path.","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"btree-allocate-page-outside-wal","text":"Page allocation during splits modifies metadata through `PageManager.write_meta` (not WAL-logged), creating a potential crash-safety gap if the process fails between allocation and the subsequent WAL commit","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"btree-crc32-skips-corrupt-entries","text":"Corrupted WAL entries (CRC mismatch) are silently skipped during B-tree recovery rather than raising an error or halting replay.","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"btree-delete-no-commit-on-miss","text":"When the key is not found, `delete` writes no WAL entries and performs no commit, making failed deletes purely read-only operations.","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"btree-deserialize-no-validation","text":"Both _deserialize_leaf and _deserialize_internal perform no validation of page type, sort order, or buffer length; they trust that WAL CRC checks and the serialize functions guarantee well-formed pages, so corruption produces silent wrong results rather than errors.","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"btree-double-fsync-per-mutation","text":"B-tree mutations pay for `os.fsync()` twice: once when writing the WAL entry (`btree.py:137`) and again when committing the page to the data file (`btree.py:105`), with the WAL truncated only after the data file sync confirms durability.","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"btree-durability-protects-data-not-structure","text":"The B-tree's durability model protects user data but not structural integrity: mutations pay double fsync for data pages (WAL entry + data write) while structural metadata is never fsynced, AND structural integrity silently erodes during normal operation (leaked pages, ever-growing height, dangling parent pointers after free_page) with no defensive checks in the I/O layer to detect or prevent degradation.","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"btree-fix-plan-all-crash-safety","text":"All six bugs documented in `fix-plan.md` concern single-writer WAL/fsync crash safety; none involve concurrent access races, illustrating that even without concurrency the WAL protocol has subtle correctness requirements","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"btree-free-page-bypasses-wal","text":"`pm.free_page()` called during deletion writes directly to the data file without WAL logging, creating a crash-consistency gap if the process fails between the free and the WAL commit","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"btree-has-page-io-counters","text":"The B-tree's `PageManager` tracks `pages_read` and `pages_written` with a `reset_counters()` method (multiply by `page_size` for bytes), but WAL writes are not counted by any existing counter","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"btree-impl-asymmetric-split-vs-merge","text":"The reference implementation fully propagates splits upward through `_insert`'s return value but swallows the `'empty'` signal in `_delete` after one level of cleanup, making insertion structurally complete but deletion structurally incomplete","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"btree-insert-no-wal-commit","text":"`_insert` never calls `wal.commit()`; the caller (`put`) is solely responsible for committing after all page writes and metadata updates complete, making the entire `put` atomic with respect to crash recovery","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"btree-metadata-bypasses-wal","text":"`BTree._write_meta` writes root pointer and free list head directly to the data file without logging through the WAL, making metadata updates not crash-safe.","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null},{"id":"btree-mutation-fsync-is-asymmetric","text":"B-tree mutations pay double fsync costs for user data (WAL entry + data page) but skip fsync entirely for structural metadata, creating an asymmetry where key-value pairs survive crashes but the free-page list and allocation state may not.","truth_value":"IN","justification_count":0,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null}],"count":278,"limit":20,"offset":0}