Date: 2026-05-29
Time: 08:30
_do_sync: Write-Ahead Log Fsync Policy

_do_sync controls when buffered WAL writes are durably persisted to disk. It exists because of a fundamental tension in WAL design: calling fsync after every write guarantees durability but severely limits throughput, while skipping it risks data loss on a crash. This method encapsulates the durability policy so that callers (append, append_batch, checkpoint) do not need to know which sync strategy is active.
Preconditions:
- self._fd is an open file object in append-binary mode ("ab"). No None check is performed: calling this with self._fd = None raises AttributeError.
- self._sync_mode is "sync" or "batch". Any other string is accepted implicitly and results in no sync at all (a silent "none" mode).

Postconditions:
- "sync" mode, or any mode with force=True: all buffered data is flushed to the OS and fsync'd to stable storage. The data survives a process crash, OS crash, or power loss (assuming the storage hardware honors fsync).
- "batch" mode without force: the write counter increments. If it reaches the threshold, data is fsync'd and the counter resets. Otherwise the data remains in the userspace or kernel buffer and is not durable.
- Any other mode without force (the implicit "none"): nothing happens. Data sits in Python's write buffer.

Invariant: self._write_count is always in [0, self._batch_sync_count - 1] after a non-forced call in batch mode.
| Parameter | Type | Default | Meaning |
|-----------|------|---------|---------|
| force | bool | False | Bypasses the sync mode policy and forces an immediate fsync. Used by append_batch and checkpoint to guarantee atomicity/durability of critical records regardless of the configured mode. |
Edge case: if force=True and the mode is "batch", the force branch (_sync_mode == "sync" or force) takes precedence, and _write_count is neither incremented nor reset. The batch counter therefore carries its pre-force value across forced syncs. This is harmless but slightly imprecise: the next batch-triggered sync may come earlier than _batch_sync_count writes after the forced one.
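The counter behavior in this edge case can be replayed without any file I/O. The following is a minimal sketch, not the original implementation: simulate_batch_policy is a hypothetical helper that models only the counter logic described here, taking the force value of each call and reporting which calls would have triggered an fsync.

```python
def simulate_batch_policy(calls, batch_sync_count):
    """Replay a sequence of _do_sync calls in "batch" mode.

    `calls` is an iterable of booleans: the value of `force` for each call.
    Returns a list of booleans saying whether each call performed an fsync.
    """
    write_count = 0
    synced = []
    for force in calls:
        if force:
            # Forced path: sync immediately, counter is left untouched.
            synced.append(True)
            continue
        write_count += 1
        if write_count >= batch_sync_count:
            # Threshold reached: sync and start a new batch.
            write_count = 0
            synced.append(True)
        else:
            synced.append(False)
    return synced


# Two normal writes, one forced sync, then one more normal write.
print(simulate_batch_policy([False, False, True, False], batch_sync_count=3))
# -> [False, False, True, True]
```

With a threshold of 3, the write that follows the forced sync triggers a batch sync immediately, because the counter was already at 2 when the forced call happened.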
Returns: None. This is a side-effect-only method.
1. If _sync_mode is "sync" OR force is True:
   → flush Python's internal buffer to the OS
   → fsync the file descriptor (block until hardware confirms write)
   → DONE
2. Else if _sync_mode is "batch":
   → increment _write_count
   → if _write_count >= _batch_sync_count:
       → flush + fsync (same as above)
       → reset _write_count to 0
   → DONE
3. Otherwise (implicit "none" mode):
   → do nothing; writes remain buffered
The two-step flush() then fsync() is necessary because Python's io layer maintains its own buffer separate from the OS page cache. flush() pushes data from Python → kernel; fsync() pushes data from kernel → disk.
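The steps above can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions, not the original source: the MiniWal class, its constructor signature, the _flush_and_fsync helper, and the sync_calls counter are all hypothetical and exist only to make the example observable. Locking is omitted for brevity.

```python
import os
import tempfile


class MiniWal:
    """Hypothetical, stripped-down WAL used only to illustrate _do_sync."""

    def __init__(self, path, sync_mode="batch", batch_sync_count=3):
        self._fd = open(path, "ab")
        self._sync_mode = sync_mode
        self._batch_sync_count = batch_sync_count
        self._write_count = 0
        self.sync_calls = 0  # instrumentation for this example only

    def _flush_and_fsync(self):
        self._fd.flush()             # Python buffer -> kernel page cache
        os.fsync(self._fd.fileno())  # kernel page cache -> storage device
        self.sync_calls += 1

    def _do_sync(self, force=False):
        if self._sync_mode == "sync" or force:
            self._flush_and_fsync()
        elif self._sync_mode == "batch":
            self._write_count += 1
            if self._write_count >= self._batch_sync_count:
                self._flush_and_fsync()
                self._write_count = 0
        # Any other mode: no-op, writes stay buffered.

    def append(self, data):
        self._fd.write(data)
        self._do_sync()

    def close(self):
        self._fd.close()


with tempfile.TemporaryDirectory() as tmp:
    wal = MiniWal(os.path.join(tmp, "wal.log"), sync_mode="batch", batch_sync_count=3)
    for record in (b"a\n", b"b\n", b"c\n"):
        wal.append(record)
    print(wal.sync_calls, wal._write_count)  # 1 0
    wal.close()
```

Three appends with a threshold of 3 produce exactly one fsync, and the counter is back at 0 afterwards.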
Side effects:
- Calls self._fd.flush() and os.fsync(), which may block for milliseconds to seconds depending on the storage device and write queue depth.
- In batch mode, mutates self._write_count. This is the only mode with internal state changes.
- Does not acquire self._lock. It relies on its callers (append, append_batch, checkpoint) to hold the lock. This is a correctness requirement that the method signature does not enforce.

Error handling: no exceptions are caught. Both flush() and os.fsync() can raise OSError (disk full, I/O error, bad file descriptor), which propagates directly to the caller. This is the correct behavior: a failed sync in a WAL is a critical error that the application must handle, not swallow.
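One way a caller might surface a failed sync rather than swallow it is sketched below. Everything here is an assumption made for illustration: WalSyncError, FailingFile, and sync_or_fail are hypothetical names that do not appear in the original code, and FailingFile is only a test double that simulates an I/O error on flush.

```python
import errno
import os


class WalSyncError(RuntimeError):
    """Hypothetical application-level error raised when a WAL sync fails."""


class FailingFile:
    """Test double: a file-like object whose flush() always fails."""

    def write(self, data):
        return len(data)

    def flush(self):
        raise OSError(errno.EIO, "simulated I/O error")

    def fileno(self):
        raise AssertionError("not reached: flush() fails first")


def sync_or_fail(fd):
    """Flush and fsync `fd`; surface any failure instead of swallowing it."""
    try:
        fd.flush()
        os.fsync(fd.fileno())
    except OSError as exc:
        # Chain the original error so errno and message stay available.
        raise WalSyncError(f"WAL sync failed: {exc}") from exc


try:
    sync_or_fail(FailingFile())
except WalSyncError as err:
    print(err.__cause__.errno == errno.EIO)  # True
```

Wrapping is optional: letting the raw OSError propagate, as the method does, is equally valid. The point is only that the failure reaches code that can decide to stop accepting writes.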
Three call sites, each with a different durability need:

    # Single record: respects the configured policy
    def append(self, ...):
        self._fd.write(data)
        self._do_sync()  # might not actually sync in batch mode

    # Atomic batch: must be durable
    def append_batch(self, ...):
        self._fd.write(bytes(buf))
        self._do_sync(force=True)  # always syncs, regardless of mode

    # Checkpoint marker: must be durable
    def checkpoint(self):
        self._fd.write(...)
        self._do_sync(force=True)  # always syncs

The pattern is: individual writes tolerate deferred durability (you might lose the last few writes on crash), but batch commits and checkpoints must be durable immediately because downstream consumers rely on their presence to determine recovery boundaries.
Dependencies:
- os.fsync: wrapper around POSIX fsync(2). On Linux this is a true fsync. On macOS, fsync(2) does not guarantee that data reaches the physical medium unless the drive firmware cooperates; the stronger guarantee requires an explicit fcntl(F_FULLFSYNC), which os.fsync does not issue. This is a subtle portability assumption.
- self._fd: a Python file object opened in "ab" mode via the built-in open().

Assumptions and caveats:
1. _sync_mode is a free-form string, not an enum. Passing "SYNC" or "Sync" silently falls into the no-op branch. There is no validation at construction time either.
2. self._fd is never None when this method is called. The truncate method sets self._fd = None temporarily, and if _do_sync were called during that window, it would raise AttributeError.
3. Callers hold self._lock. Thread safety depends entirely on caller discipline.
4. force=True in batch mode doesn't reset _write_count, so the counter carries over across forced syncs. Not a bug, but a subtlety.
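The first item in this list could be addressed by validating the mode once, at construction time. The sketch below is one possible hardening, not part of the original code: SyncMode and parse_sync_mode are hypothetical names, and normalizing case (rather than rejecting "SYNC" outright) is a design choice assumed here.

```python
from enum import Enum


class SyncMode(Enum):
    """Hypothetical enum covering the three modes described above."""

    SYNC = "sync"
    BATCH = "batch"
    NONE = "none"


def parse_sync_mode(value):
    """Normalize a user-supplied mode string or raise ValueError."""
    try:
        return SyncMode(value.strip().lower())
    except ValueError:
        allowed = ", ".join(m.value for m in SyncMode)
        raise ValueError(
            f"unknown sync mode {value!r}; expected one of: {allowed}"
        ) from None


print(parse_sync_mode("SYNC").value)  # sync
```

With this in place, a typo such as "synch" fails loudly at startup instead of silently disabling durability.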