write_count should reset on any fsync (including forced) or only on batch-threshold syncs: potential bug
Date: 2026-05-29
Time: 08:23

write_count Reset Behavior: Potential Bug in Batch Sync Mode

In write-ahead-log/wal.py:124-134, the _do_sync method has a subtle logic error when sync_mode == "batch" and force=True:

    def _do_sync(self, force: bool = False):
        if self._sync_mode == "sync" or force:  # line 126
            self._fd.flush()
            os.fsync(self._fd.fileno())
        elif self._sync_mode == "batch":  # line 129
            self._write_count += 1
            if self._write_count >= self._batch_sync_count:
                self._fd.flush()
                os.fsync(self._fd.fileno())
                self._write_count = 0  # line 134

The if/elif structure means the two branches are mutually exclusive. When sync_mode == "batch" and force=True, the first branch fires (because force is true) and performs the fsync, but never touches write_count. The elif branch, the only place where write_count is incremented or reset, is skipped entirely.
Two callers pass force=True:
- append_batch (wal.py:165): atomic batch writes with a COMMIT record must be durable immediately
- checkpoint (wal.py:175): checkpoint records must also be durable

Regular append calls (wal.py:149) use the default force=False, so in batch mode they go through the elif branch and correctly manage the counter.
With batch_sync_count=100:
1. 50 individual append calls → write_count reaches 50
2. append_batch fires → forced fsync flushes everything to disk, but write_count stays at 50
3. 50 more append calls → write_count reaches 100, triggers a batch sync
That batch sync at step 3 fires after only 50 unsynced writes, not 100. The 50 writes from step 1 were already durable after the forced fsync in step 2, but they're still counted toward the threshold.
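
The three steps above can be replayed with a minimal model of the counter logic. This is only a sketch: `BuggyCounter`, `unsynced`, and `sync_log` are hypothetical names introduced for illustration, and no file I/O is performed. The class mirrors the branch structure of the original _do_sync and additionally tracks the true number of writes since the last fsync.

```python
class BuggyCounter:
    """Hypothetical model of the original _do_sync counter logic (no file I/O)."""

    def __init__(self, sync_mode="batch", batch_sync_count=100):
        self.sync_mode = sync_mode
        self.batch_sync_count = batch_sync_count
        self.write_count = 0  # counter as the original code maintains it
        self.unsynced = 0     # ground truth: writes since the last fsync
        self.sync_log = []    # one (kind, writes_flushed) entry per fsync

    def write(self, force=False):
        self.unsynced += 1
        if self.sync_mode == "sync" or force:
            # Forced path: syncs, but leaves write_count untouched (the bug).
            self._fsync("forced" if force else "sync")
        elif self.sync_mode == "batch":
            self.write_count += 1
            if self.write_count >= self.batch_sync_count:
                self._fsync("threshold")
                self.write_count = 0

    def _fsync(self, kind):
        self.sync_log.append((kind, self.unsynced))
        self.unsynced = 0


wal = BuggyCounter(batch_sync_count=100)
for _ in range(50):
    wal.write()              # step 1: 50 regular appends
wal.write(force=True)        # step 2: one forced write (e.g. a COMMIT record)
count_after_forced = wal.write_count
for _ in range(50):
    wal.write()              # step 3: 50 more regular appends

print(count_after_forced)    # 50
print(wal.sync_log)          # [('forced', 51), ('threshold', 50)]
```

The forced sync flushes 51 writes (the 50 appends plus the forced record itself in this model), and the threshold sync then fires with only 50 writes actually pending.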
This is a performance bug, not a correctness bug. Data is never lost — the forced sync guarantees durability, and the premature batch sync just adds an unnecessary extra fsync. In workloads that mix append with append_batch or checkpoint, batch mode will sync more frequently than the configured threshold intends.
The severity depends on workload shape. If forced syncs are rare relative to batch_sync_count, the effect is negligible. If they're frequent (e.g., many small batches), the counter is never reset by those forced syncs: it stays "frozen" across each one and keeps climbing from a stale base, so most threshold syncs fire when few or no unsynced writes remain.
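
To get a rough feel for how workload shape changes the cost, the sketch below counts fsyncs in a simplified model where every `forced_every`-th regular append is followed by one forced sync. The function `count_fsyncs` and its parameters are hypothetical, and the model ignores rotation, truncation, and real I/O.

```python
def count_fsyncs(total_appends, forced_every, batch_sync_count=100,
                 reset_on_forced=False):
    """Return (forced_syncs, threshold_syncs) for a simplified workload.

    reset_on_forced=False models the current behavior; True models a
    counter that is cleared whenever a forced sync happens.
    """
    write_count = 0
    forced = 0
    threshold = 0
    for i in range(1, total_appends + 1):
        # Regular append: goes through the batch (elif) branch.
        write_count += 1
        if write_count >= batch_sync_count:
            threshold += 1
            write_count = 0
        # Periodic forced sync, e.g. from append_batch or checkpoint.
        if i % forced_every == 0:
            forced += 1
            if reset_on_forced:
                write_count = 0
    return forced, threshold


for forced_every in (10, 1000):
    current = count_fsyncs(10_000, forced_every)
    with_reset = count_fsyncs(10_000, forced_every, reset_on_forced=True)
    print(forced_every, current, with_reset)
# 10 (1000, 100) (1000, 0)
# 1000 (10, 100) (10, 100)
```

In this model, frequent forced syncs make every threshold sync redundant, while rare ones leave the count unchanged. The overhead is also bounded: the stale counter adds at most one extra fsync per batch_sync_count regular appends.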
Reset write_count in the forced-sync path when in batch mode:

    def _do_sync(self, force: bool = False):
        if self._sync_mode == "sync" or force:
            self._fd.flush()
            os.fsync(self._fd.fileno())
            # A forced fsync flushed every pending write, so restart the batch count.
            if force and self._sync_mode == "batch":
                self._write_count = 0
        elif self._sync_mode == "batch":
            self._write_count += 1
            if self._write_count >= self._batch_sync_count:
                self._fd.flush()
                os.fsync(self._fd.fileno())
                self._write_count = 0

This ensures the counter reflects reality: after any fsync, there are zero unsynced writes pending.
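
A regression check for this behavior might look like the sketch below. `MiniWAL` is a hypothetical stand-in for the real WAL class, not the code in wal.py: it models only the sync path, its `append` takes a `force` flag purely to exercise that path, and `fsync_calls` is instrumentation added for the check.

```python
import os
import tempfile


class MiniWAL:
    """Hypothetical stand-in for the WAL; only the sync path is modelled."""

    def __init__(self, path, sync_mode="batch", batch_sync_count=100):
        self._fd = open(path, "ab")
        self._sync_mode = sync_mode
        self._batch_sync_count = batch_sync_count
        self._write_count = 0
        self.fsync_calls = 0  # instrumentation only, not part of the original

    def _fsync(self):
        self._fd.flush()
        os.fsync(self._fd.fileno())
        self.fsync_calls += 1

    def _do_sync(self, force=False):
        if self._sync_mode == "sync" or force:
            self._fsync()
            if force and self._sync_mode == "batch":
                self._write_count = 0  # the proposed reset
        elif self._sync_mode == "batch":
            self._write_count += 1
            if self._write_count >= self._batch_sync_count:
                self._fsync()
                self._write_count = 0

    def append(self, payload, force=False):
        self._fd.write(payload + b"\n")
        self._do_sync(force=force)

    def close(self):
        self._fd.close()


with tempfile.TemporaryDirectory() as tmp:
    wal = MiniWAL(os.path.join(tmp, "wal.log"), batch_sync_count=100)
    for _ in range(50):
        wal.append(b"record")
    wal.append(b"commit", force=True)  # forced sync, as append_batch would request
    count_after_forced = wal._write_count
    for _ in range(99):
        wal.append(b"record")
    calls_before_threshold = wal.fsync_calls
    wal.append(b"record")              # 100th regular append since the forced sync
    calls_at_threshold = wal.fsync_calls
    wal.close()

print(count_after_forced, calls_before_threshold, calls_at_threshold)  # 0 1 2
```

With the reset in place, the threshold sync fires only after a full 100 regular appends following the forced sync, rather than after 50.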
For comparison, hash-index-storage/bitcask.py:88 takes a simpler approach — it syncs on every write when sync_writes=True, with no batching mode at all. The LSM tree's WAL (log-structured-merge-tree/lsm.py:27) also does a flush() per write with no fsync batching. The batch-sync logic is unique to the WAL implementation, which is why this counter-management bug only surfaces here.
- write-ahead-log/wal.py:_do_sync - Trace all callers to verify no other paths bypass the counter
- write-ahead-log/wal.py:rotate - Rotation does its own fsync at line 115 without going through _do_sync, so write_count is also not reset on rotation
- write-ahead-log/test_wal.py - Check whether tests cover batch-mode behavior after forced syncs (line 106 tests all three sync modes but may not exercise the interleaving)
- write-ahead-log/wal.py:truncate - Another fsync at line 184 that bypasses _do_sync and doesn't reset the counter

- sync-mode-none-safety - When sync_mode="none", _do_sync with force=False does nothing; verify that callers that need durability always pass force=True

- wal-write-count-not-reset-on-forced-sync - In batch sync mode, write_count is not reset when a forced fsync occurs via append_batch or checkpoint, causing the next batch-threshold sync to fire prematurely
- wal-do-sync-branches-mutually-exclusive - The _do_sync method's if/elif structure means forced syncs in batch mode skip all counter logic: the counter is neither incremented nor reset
- wal-rotate-bypasses-do-sync - File rotation at wal.py:115 calls os.fsync directly without going through _do_sync, so write_count is not reset on rotation either
- wal-batch-bug-is-performance-not-correctness - The stale write_count causes extra fsyncs (premature batch threshold) but never causes data loss, since forced syncs always flush to disk regardless of the counter