dict is not equivalent (GIL doesn't help across processes)

Date: 2026-05-29
Time: 12:31
- entries/2026/05/29/topic-bitcask-paper-comparison.md: Full side-by-side comparison of Erlang-specific production features (ETS keydir, lock files, background merge) vs. Python simplifications.
- hash-index-storage/bitcask.py:rebuildindex: The sequential startup loop that would be parallelized with ETS in Erlang; trace the for-loop to see exactly why shared state is required for parallelization.
- beam-scheduler-binding: How BEAM scheduler threads bind to CPU cores via +sbt flags, and how this affects cache locality during parallel ETS writes from hint file loaders.
- entries/2026/05/29/topic-concurrent-merge-safety.md: Analysis of what breaks when compaction and reads happen concurrently; demonstrates why Python's lack of shared-state primitives prevents background merge.
- python-multiprocessing-shared-memory: Study Python 3.8+'s multiprocessing.shared_memory module as the closest analog to ETS; compare the manual serialization burden against ETS's transparent term storage.
- ets-is-zero-copy-across-processes: An ETS table lives in memory shared by every process on a BEAM node; multiple Erlang processes operate on the same hash table with no per-process copy of the table and no manual serialization (a lookup still copies the matched terms onto the caller's heap), unlike Python's dict, which is always process-local.
- gil-does-not-cross-process-boundaries: Python's GIL serializes threads within a single OS process but provides no synchronization across multiprocessing processes, making it irrelevant for the multi-process access pattern that production Bitcask requires.
- python-bitcask-rebuilds-keydir-every-open: Both Python implementations reconstruct the entire in-memory index from disk on each instantiation because there is no shared-memory mechanism to preserve it across instances.
- ets-insert-is-per-key-atomic: A single ets:insert/2 call atomically replaces one row without external locking, providing the correctness guarantee that enables concurrent hint file loading into a shared keydir.
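
To make the "manual serialization burden" in the multiprocessing.shared_memory note concrete, here is a minimal sketch. The fixed-width entry layout (`ENTRY`) and the helper names `write_entry`, `read_entry`, and `demo` are assumptions made for illustration; none of them come from bitcask.py. The second handle is attached inside the same process only to keep the example short; in a real setup it would be opened by name from a different process.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical fixed-width keydir entry: file_id, value_size, value_pos, timestamp.
# "<IIQQ" is little-endian with standard sizes: 4 + 4 + 8 + 8 = 24 bytes.
ENTRY = struct.Struct("<IIQQ")


def write_entry(buf, slot, file_id, value_size, value_pos, timestamp):
    """Pack one entry into the shared buffer at the given slot (manual encoding)."""
    ENTRY.pack_into(buf, slot * ENTRY.size, file_id, value_size, value_pos, timestamp)


def read_entry(buf, slot):
    """Unpack one entry from the shared buffer (manual decoding)."""
    return ENTRY.unpack_from(buf, slot * ENTRY.size)


def demo():
    # The "writer" creates a named block with room for at least four entries.
    writer = shared_memory.SharedMemory(create=True, size=4 * ENTRY.size)
    try:
        write_entry(writer.buf, 2, file_id=7, value_size=128,
                    value_pos=4096, timestamp=1780000000)
        # The "reader" attaches to the same block by name.
        reader = shared_memory.SharedMemory(name=writer.name)
        try:
            return read_entry(reader.buf, 2)
        finally:
            reader.close()
    finally:
        writer.close()
        writer.unlink()  # free the block once every handle is closed


if __name__ == "__main__":
    print(demo())
```

Everything ETS does implicitly has to be written by hand here: the byte layout, the slot arithmetic, and cleanup of the block. The sketch also leaves out the harder parts, namely mapping variable-length keys to slots and coordinating concurrent writers, which is where the comparison with ets:insert/2 becomes most lopsided.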