Module: Rubino::Util::AtomicFile
Defined in: lib/rubino/util/atomic_file.rb
Overview
Crash- and concurrency-safe writes to a shared state file.
Several files in rubino are read-modify-written by commands the user can legitimately run in parallel, for example the skills provenance ledger (`.sources.json`) and the YAML config (`config.yml`). A plain read → mutate → `File.write` has two defects under concurrency:
* lost update — two writers read the same base, each writes its own
mutation, the second clobbers the first (e.g. 4 parallel
`skills install` → only the last 2 ledger entries survive); and
* torn file — a writer interrupted mid-`write` (or interleaved with a
reader) leaves a half-written, unparseable file that bricks every
later command (e.g. corrupt `config.yml`).
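The lost-update defect does not need real parallelism to show up; replaying the interleaving by hand is enough. The sketch below is illustrative only: the helper name `replay_lost_update`, the flat ledger layout, and the skill names are hypothetical, not rubino's actual `.sources.json` schema.

```ruby
require "json"
require "tmpdir"

# Replays the lost-update interleaving sequentially: both "writers" read the
# same base before either one writes. Keys and values are made up.
def replay_lost_update(path)
  File.write(path, JSON.generate({}))

  base_a = JSON.parse(File.read(path)) # writer A reads the empty ledger
  base_b = JSON.parse(File.read(path)) # writer B reads the same empty ledger

  File.write(path, JSON.generate(base_a.merge("skill-a" => "source-a")))
  # Writer B never saw A's entry, so its write clobbers it.
  File.write(path, JSON.generate(base_b.merge("skill-b" => "source-b")))

  JSON.parse(File.read(path))
end

survivors = Dir.mktmpdir { |dir| replay_lost_update(File.join(dir, "ledger.json")) }
puts survivors.inspect
```

Only writer B's entry is left in the file, which is the failure that serializing the whole read-modify-write under one lock is meant to prevent.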
`update` fixes both with the standard POSIX recipe:
1. `flock(LOCK_EX)` on a dedicated `<target>.lock` sibling — a separate
file so the lock outlives any rename/replace of the data file and a
reader's `LOCK_SH` never races the writer's rename of the data file
itself. The whole read-modify-write runs under the lock, so writers
serialize and none reads a base another is about to overwrite.
2. write the new contents to a temp file IN THE SAME DIRECTORY (so the
final rename is same-filesystem, hence atomic), then `fsync` it.
3. `File.rename(tmp, target)` — atomic on POSIX: a concurrent reader sees
either the whole old file or the whole new one, never a torn mix.
4. `fsync` the directory so the rename survives a crash.
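The four steps can be sketched in one short method. This is a condensed illustration under stated assumptions, not the module's code: `locked_update` is a hypothetical name, and the sketch leaves out the temp-file cleanup, permission handling, and nil-means-no-write behaviour that the real methods document further down.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: lock, write to a same-directory temp file, rename, fsync.
def locked_update(target)
  dir = File.dirname(target)
  FileUtils.mkdir_p(dir)

  # Step 1: exclusive lock on a sibling lock file, held for the whole
  # read-modify-write. Closing the lock file at block exit releases it.
  File.open("#{target}.lock", File::RDWR | File::CREAT, 0o600) do |lock|
    lock.flock(File::LOCK_EX)
    current = File.file?(target) ? File.read(target) : nil
    updated = yield(current)

    # Step 2: temp file in the SAME directory, fsynced before the swap.
    tmp = File.join(dir, ".#{File.basename(target)}.#{Process.pid}.tmp")
    File.open(tmp, File::WRONLY | File::CREAT | File::TRUNC, 0o600) do |f|
      f.write(updated)
      f.fsync
    end

    # Step 3: atomic replace of the data file.
    File.rename(tmp, target)

    # Step 4: best-effort fsync of the directory entry.
    begin
      File.open(dir, &:fsync)
    rescue StandardError
      nil
    end

    updated
  end
end
```

A call such as `locked_update(path) { |current| (current || "") + "entry\n" }` appends under the lock; because every writer takes the same `.lock` file, each one reads the result of the previous writer rather than a stale base.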
Readers that want a consistent snapshot can take `LOCK_SH` over the same lock via `.read_shared`; a plain `File.read` is also safe against tearing because the rename is atomic (it just may observe a slightly stale file).
Class Method Summary
- .fsync_dir(dir) ⇒ Object
  fsync the directory so the rename is durable.
- .read_shared(path) ⇒ Object
  Reads `path` under a shared lock (so it can’t observe a concurrent writer’s intermediate state).
- .update(path) ⇒ Object
  Serialized read-modify-write of `path`.
- .with_lock(path, mode) ⇒ Object
- .write_atomic(path, contents) ⇒ Object
  Write `contents` to `path` via temp-file + atomic rename, fsyncing the temp file and the directory.
Class Method Details
.fsync_dir(dir) ⇒ Object
fsync the directory so the rename is durable. Best-effort: some platforms/filesystems refuse to open a dir for fsync (e.g. Windows), in which case durability of the rename degrades but correctness of the atomic swap is unaffected.
# File 'lib/rubino/util/atomic_file.rb', line 122

    def fsync_dir(dir)
      File.open(dir, &:fsync)
    rescue StandardError
      nil
    end
.read_shared(path) ⇒ Object
Reads `path` under a shared lock (so it can’t observe a concurrent writer’s intermediate state). Returns the contents, or nil when absent.

# File 'lib/rubino/util/atomic_file.rb', line 57

    def read_shared(path)
      return nil unless File.file?(path)

      with_lock(path, File::LOCK_SH) do
        File.file?(path) ? File.read(path) : nil
      end
    end
.update(path) ⇒ Object
Serialized read-modify-write of `path`. Yields the current file contents (a String, or nil when the file doesn’t exist yet) while holding an exclusive lock, and atomically writes back whatever the block returns. If the block returns nil the file is left untouched (no-op write). Returns the block’s value.

# File 'lib/rubino/util/atomic_file.rb', line 45

    def update(path)
      FileUtils.mkdir_p(File.dirname(path))
      with_lock(path, File::LOCK_EX) do
        current = File.file?(path) ? File.read(path) : nil
        new_contents = yield(current)
        write_atomic(path, new_contents) unless new_contents.nil?
        new_contents
      end
    end
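A caller of `.update` mostly has to get the block contract right: accept a String or nil, return the new contents or nil. The sketch below shows one way such a block body might look for a JSON ledger. The helper `add_ledger_entry`, the flat name-to-source layout, the demo path, and the example URL are all assumptions made for illustration; the guarded call at the end runs only where the real module is loaded.

```ruby
require "json"
require "tmpdir"

# Hypothetical block body for AtomicFile.update. Takes the current contents
# (String, or nil when the file is absent) and returns the new contents, or
# nil to leave the file untouched. The ledger layout is assumed, not rubino's.
def add_ledger_entry(current, name, source)
  ledger = current ? JSON.parse(current) : {}
  return nil if ledger[name] == source # already recorded: skip the write

  ledger[name] = source
  JSON.pretty_generate(ledger)
end

# Runs only where the real module is available; the path is a throwaway demo.
if defined?(Rubino::Util::AtomicFile)
  demo_path = File.join(Dir.tmpdir, "rubino-demo", ".sources.json")
  Rubino::Util::AtomicFile.update(demo_path) do |current|
    add_ledger_entry(current, "example-skill", "https://example.invalid/skill.git")
  end
end
```

Returning nil for an unchanged ledger relies on the documented no-op behaviour, so an idempotent re-run does not rewrite the file.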
.with_lock(path, mode) ⇒ Object
# File 'lib/rubino/util/atomic_file.rb', line 107

    def with_lock(path, mode)
      lock_path = "#{path}.lock"
      FileUtils.mkdir_p(File.dirname(lock_path))
      File.open(lock_path, File::RDWR | File::CREAT, 0o600) do |lock|
        lock.flock(mode)
        yield
      ensure
        lock.flock(File::LOCK_UN)
      end
    end
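The two modes passed to `with_lock` differ in who they admit: shared holders coexist, while an exclusive request has to wait for all of them. The probe below is a sketch of that behaviour, assuming a platform where `flock` is enforced per open file (as on Linux and macOS local filesystems). `probe_lock_modes` is a hypothetical name, and `LOCK_NB` is used only so a conflicting request reports failure instead of blocking.

```ruby
require "tmpdir"

# Probes which flock modes can coexist on one lock file. With LOCK_NB a
# conflicting request returns false rather than waiting.
def probe_lock_modes(lock_path)
  flags = File::RDWR | File::CREAT
  File.open(lock_path, flags, 0o600) do |reader|
    File.open(lock_path, flags, 0o600) do |other|
      reader.flock(File::LOCK_SH)

      # A second shared lock is granted while the first is still held.
      second_reader = other.flock(File::LOCK_SH | File::LOCK_NB)
      other.flock(File::LOCK_UN)

      # An exclusive lock is refused while any shared holder remains.
      writer = other.flock(File::LOCK_EX | File::LOCK_NB)

      { second_reader: second_reader, writer: writer }
    end
  end
end

result = Dir.mktmpdir { |dir| probe_lock_modes(File.join(dir, "state.lock")) }
puts result.inspect
```

`File#flock` returns 0 when the lock is granted and false when `LOCK_NB` is set and the call would have blocked, so the second reader gets 0 and the writer gets false.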
.write_atomic(path, contents) ⇒ Object
Write `contents` to `path` via temp-file + atomic rename, fsyncing the temp file and the directory. Standalone (no lock) for callers that already hold one, or that only need crash-safety, not serialization.

# File 'lib/rubino/util/atomic_file.rb', line 68

    def write_atomic(path, contents)
      dir = File.dirname(path)
      FileUtils.mkdir_p(dir)
      # Preserve plain-write semantics on an EXISTING read-only file: a temp +
      # rename would otherwise sidestep the file's own 0444 (the writable dir
      # lets us swap it in), silently clobbering a file the user marked
      # read-only. Refuse with the same EACCES a File.write would have raised.
      existing_mode = File.exist?(path) ? File.stat(path).mode : nil
      raise Errno::EACCES, path if existing_mode && !File.writable?(path)

      tmp = File.join(dir, ".#{File.basename(path)}.#{Process.pid}.#{rand(1 << 32)}.tmp")
      begin
        File.open(tmp, File::WRONLY | File::CREAT | File::TRUNC, 0o600) do |f|
          # Write the bytes VERBATIM, regardless of the process's encoding
          # environment. The edit/multi_edit read-modify-write builds +contents+
          # as a BINARY (ASCII-8BIT) buffer so untouched non-UTF-8 bytes survive
          # (#326). Without binmode, a process whose Encoding.default_internal is
          # UTF-8 (set by some locales / a `ruby -Eutf-8:utf-8`) makes IO#write
          # TRANSCODE that binary buffer ASCII-8BIT→UTF-8 and raise
          # Encoding::UndefinedConversionError on the first high byte (e.g. the
          # `\xC3` of a `José` on an edited line) — an intermittent, env-driven
          # in-session crash that never reproduces where default_internal is nil.
          # binmode pins the stream to raw bytes (external ASCII-8BIT, no internal
          # transcode), so an accented edit writes its exact bytes everywhere.
          f.binmode
          f.write(contents)
          f.flush
          f.fsync
        end
        # Carry the original file's permission bits across the replace so an
        # in-place edit doesn't silently re-chmod the user's file to 0600.
        File.chmod(existing_mode & 0o777, tmp) if existing_mode
        File.rename(tmp, path)
        fsync_dir(dir)
      ensure
        FileUtils.rm_f(tmp)
      end
    end
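Two details of the temp-file step are easy to miss: the stream is put in binary mode so bytes are written untranslated, and the temp file is chmod-ed to the target's permission bits before the rename. The trimmed-down sketch below isolates just those two points. `replace_keeping_mode` is a hypothetical helper, and it deliberately omits the read-only check, the `ensure` cleanup, and the directory fsync of the real method.

```ruby
require "tmpdir"

# Hypothetical, trimmed-down replace of an EXISTING file: raw bytes in,
# original permission bits kept.
def replace_keeping_mode(path, bytes)
  mode = File.stat(path).mode & 0o777
  tmp = "#{path}.#{Process.pid}.tmp" # same directory as the target

  File.open(tmp, File::WRONLY | File::CREAT | File::TRUNC, 0o600) do |f|
    f.binmode # raw bytes, no encoding conversion on write
    f.write(bytes)
    f.fsync
  end

  File.chmod(mode, tmp) # without this the replaced file would end up 0600
  File.rename(tmp, path)
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "notes.txt")
  File.write(path, "old")
  File.chmod(0o644, path)

  payload = "Jos\xE9\n".b # a Latin-1 byte that is not valid UTF-8
  replace_keeping_mode(path, payload)

  puts File.binread(path) == payload
  puts format("%o", File.stat(path).mode & 0o777)
end
```

The chmod happens before the rename so the file never appears at its final path with the temporary 0600 mode.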