Module: Rubino::Util::SpillStore
- Defined in:
- lib/rubino/util/spill_store.rb
Overview
Lifecycle for the on-disk “spill” artifacts rubino writes outside the database (#374):
* tool-result spills — <home>/tool-results/<call_id>.txt, the full
pre-truncation output the model can `read` back (ToolExecutor).
* oversized pastes — <home>/sessions/<id>/paste_N.txt, a big paste
the model reads instead of inlining (UI::PasteStore).
Both were write-only: nothing ever deleted them. A long-running session, or a CI box that runs thousands of large-output tools, accumulated these files indefinitely, and destroying a session (CleanupSessionsJob / Repository#destroy!) deleted only the DB rows, leaving the files orphaned. This module:
1. deletes a single session's spill+paste files when it is destroyed
(#destroy_session_files), and
2. evicts spill/paste files past an age and/or total-size budget
(#evict!), called opportunistically and from CleanupSessionsJob.
All methods are best-effort: an IO error must never take down the agent.
Constant Summary
- DEFAULT_MAX_AGE_SECONDS =
Default eviction policy. Tunable via the cleanup config, but these are the safe built-ins: drop anything older than the retention window, and keep the combined on-disk footprint of spills+pastes under the budget by evicting oldest-first.
7 * 86_400 (7 days)
- DEFAULT_MAX_TOTAL_BYTES =
512 * 1024 * 1024 (512 MiB)
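To make the two defaults concrete, the sketch below evaluates them and derives the age cutoff that an eviction pass would compare each file's mtime against. `default_spill_policy` and `age_cutoff` are hypothetical helper names used only for illustration; they are not part of the module.

```ruby
# Hypothetical helper: the built-in policy values from the constants above.
def default_spill_policy
  {
    max_age_seconds: 7 * 86_400,        # 7 days
    max_total_bytes: 512 * 1024 * 1024  # 512 MiB
  }
end

# Hypothetical helper: files with an mtime strictly before this instant
# are past the retention window.
def age_cutoff(now, max_age_seconds = default_spill_policy[:max_age_seconds])
  now - max_age_seconds
end

policy = default_spill_policy
puts policy[:max_age_seconds]  # 604800
puts policy[:max_total_bytes]  # 536870912
```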
Class Method Summary
- .collect_files ⇒ Object
All spill + paste files as { path:, size:, mtime: } records.
- .destroy_session_files(session_id, call_ids: []) ⇒ Object
Removes the on-disk spill + paste artifacts owned by session_id when the session is destroyed (#374).
- .evict!(max_age_seconds: DEFAULT_MAX_AGE_SECONDS, max_total_bytes: DEFAULT_MAX_TOTAL_BYTES, now: Time.now) ⇒ Object
Evicts spill + paste files past the age and/or total-size budget.
- .prune_empty_session_dirs ⇒ Object
Removes now-empty per-session paste dirs (a session whose only files were pastes that got evicted) so the sessions tree doesn’t fill with empty directories.
- .sanitize_call_id(call_id) ⇒ Object
Mirrors ToolExecutor#spill_full_output’s filename sanitization so the path we delete matches the path that was written.
- .sessions_dir ⇒ Object
The directory holding all per-session subtrees (each session’s pastes live in <sessions>/<id>/paste_N.txt).
- .stat_glob(pattern) ⇒ Object
- .tool_results_dir ⇒ Object
The directory holding per-call tool-result spills.
Class Method Details
.collect_files ⇒ Object
All spill + paste files as { path:, size:, mtime: } records.

    # File 'lib/rubino/util/spill_store.rb', line 113
    def collect_files
      out = []
      out.concat(stat_glob(File.join(tool_results_dir, "*.txt")))
      out.concat(stat_glob(File.join(sessions_dir, "*", "paste_*.txt")))
      out
    end
.destroy_session_files(session_id, call_ids: []) ⇒ Object
Removes the on-disk spill + paste artifacts owned by session_id when the session is destroyed (#374). Pastes are session-scoped so the whole <sessions>/<id> subtree goes; tool-result spills are keyed by call_id, so the caller passes the session’s call_ids (looked up before the DB rows are deleted) and we remove the matching <tool-results>/<call_id>.txt. Best-effort; returns nil.

    # File 'lib/rubino/util/spill_store.rb', line 53
    def destroy_session_files(session_id, call_ids: [])
      return if session_id.nil? || session_id.to_s.empty?

      FileUtils.rm_rf(File.join(sessions_dir, session_id.to_s))
      Array(call_ids).each do |cid|
        safe = sanitize_call_id(cid)
        next if safe.nil?

        FileUtils.rm_f(File.join(tool_results_dir, "#{safe}.txt"))
      end
      nil
    rescue StandardError => e
      Rubino.logger&.warn(event: "spill_store.destroy_failed", error: e.message)
      nil
    end
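
The behaviour described for destroy_session_files can be tried against a throwaway directory. The sketch below is a stand-in, not the module itself: `destroy_session_files_sketch` and `demo_destroy` are hypothetical names, and the home directory is passed in explicitly as an assumption so the example does not depend on Rubino.home_path.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical stand-in that takes the home directory as an argument.
def destroy_session_files_sketch(home, session_id, call_ids: [])
  return if session_id.nil? || session_id.to_s.empty?

  # Pastes are session-scoped: remove the whole <home>/sessions/<id> subtree.
  FileUtils.rm_rf(File.join(home, "sessions", session_id.to_s))

  # Spills are keyed by call_id: remove each <home>/tool-results/<call_id>.txt.
  Array(call_ids).each do |cid|
    safe = cid.to_s.gsub(/[^a-zA-Z0-9_.-]/, "_")
    next if safe.empty?

    FileUtils.rm_f(File.join(home, "tool-results", "#{safe}.txt"))
  end
  nil
end

def demo_destroy
  Dir.mktmpdir do |home|
    FileUtils.mkdir_p(File.join(home, "sessions", "s1"))
    FileUtils.mkdir_p(File.join(home, "tool-results"))
    File.write(File.join(home, "sessions", "s1", "paste_1.txt"), "big paste")
    File.write(File.join(home, "tool-results", "call_a.txt"), "spill a")
    File.write(File.join(home, "tool-results", "call_b.txt"), "spill b")

    # Only call_a belongs to session s1 in this example; call_b must survive.
    destroy_session_files_sketch(home, "s1", call_ids: ["call_a"])

    Dir.glob(File.join(home, "**", "*.txt")).map { |p| File.basename(p) }.sort
  end
end

p demo_destroy  # ["call_b.txt"]
```

The caller is responsible for gathering the call_ids while the DB rows still exist; once the rows are deleted there is no way to map a spill file back to its session.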
.evict!(max_age_seconds: DEFAULT_MAX_AGE_SECONDS, max_total_bytes: DEFAULT_MAX_TOTAL_BYTES, now: Time.now) ⇒ Object
Evicts spill + paste files past the age and/or total-size budget. Age first (drop everything older than max_age_seconds), then size (if the survivors still exceed max_total_bytes, delete oldest-first until under budget). Either pass is skipped when its limit is nil or not positive. Empty per-session paste dirs left behind are pruned. Best-effort; returns the number of files deleted.

    # File 'lib/rubino/util/spill_store.rb', line 74
    def evict!(max_age_seconds: DEFAULT_MAX_AGE_SECONDS,
               max_total_bytes: DEFAULT_MAX_TOTAL_BYTES, now: Time.now)
      files = collect_files
      deleted = 0

      if max_age_seconds&.positive?
        cutoff = now - max_age_seconds
        files.reject! do |f|
          next false unless f[:mtime] < cutoff

          FileUtils.rm_f(f[:path])
          deleted += 1
          true
        end
      end

      if max_total_bytes&.positive?
        total = files.sum { |f| f[:size] }
        if total > max_total_bytes
          # Oldest first until back under budget.
          files.sort_by! { |f| f[:mtime] }
          files.each do |f|
            break if total <= max_total_bytes

            FileUtils.rm_f(f[:path])
            total -= f[:size]
            deleted += 1
          end
        end
      end

      prune_empty_session_dirs
      deleted
    rescue StandardError => e
      Rubino.logger&.warn(event: "spill_store.evict_failed", error: e.message)
      deleted
    end
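
The two passes are easier to follow on in-memory records. The sketch below is a side-effect-free planner, assuming the same { path:, size:, mtime: } record shape; `plan_eviction` and `demo_plan` are hypothetical names, and instead of deleting anything the planner returns the paths that the age pass and then the size pass would remove.

```ruby
# Hypothetical pure planner: returns the paths the two passes would delete.
def plan_eviction(files, now:, max_age_seconds:, max_total_bytes:)
  doomed = []
  survivors = files

  # Pass 1: age. Anything with an mtime strictly before the cutoff goes.
  if max_age_seconds&.positive?
    cutoff = now - max_age_seconds
    expired, survivors = files.partition { |f| f[:mtime] < cutoff }
    doomed.concat(expired.map { |f| f[:path] })
  end

  # Pass 2: size. Drop the oldest survivors until the total fits the budget.
  if max_total_bytes&.positive?
    total = survivors.sum { |f| f[:size] }
    survivors.sort_by { |f| f[:mtime] }.each do |f|
      break if total <= max_total_bytes

      doomed << f[:path]
      total -= f[:size]
    end
  end

  doomed
end

def demo_plan
  now = Time.at(1_000_000)
  files = [
    { path: "old.txt",   size: 10, mtime: now - 900 },
    { path: "mid.txt",   size: 60, mtime: now - 300 },
    { path: "newer.txt", size: 60, mtime: now - 200 },
    { path: "new.txt",   size: 30, mtime: now - 100 }
  ]
  plan_eviction(files, now: now, max_age_seconds: 600, max_total_bytes: 100)
end

p demo_plan  # ["old.txt", "mid.txt"]
```

Here old.txt falls to the age pass; the three survivors total 150 bytes against a 100-byte budget, so the oldest of them (mid.txt) is dropped, which brings the total to 90 and ends the size pass.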
.prune_empty_session_dirs ⇒ Object
Removes now-empty per-session paste dirs (a session whose only files were pastes that got evicted) so the sessions tree doesn’t fill with empty directories. Never touches a dir that still has contents.

    # File 'lib/rubino/util/spill_store.rb', line 134
    def prune_empty_session_dirs
      Dir.glob(File.join(sessions_dir, "*")).each do |dir|
        next unless File.directory?(dir)
        next unless (Dir.entries(dir) - %w[. ..]).empty?

        Dir.rmdir(dir)
      rescue StandardError
        nil
      end
    end
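
A minimal sketch of the same pruning rule, run against a temporary directory. It is a variant rather than the module's code: `prune_empty_dirs_sketch` and `demo_prune` are hypothetical names, the sessions directory is passed in as an argument, and it uses Dir.children and Dir.empty? where the listing above uses Dir.glob and Dir.entries (one difference: Dir.children also lists dot-prefixed directories, which a "*" glob skips).

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical variant taking the sessions directory as an argument.
def prune_empty_dirs_sketch(sessions_dir)
  Dir.children(sessions_dir).each do |name|
    dir = File.join(sessions_dir, name)
    next unless File.directory?(dir) && Dir.empty?(dir)

    Dir.rmdir(dir)
  rescue SystemCallError
    nil # best-effort: e.g. the directory gained a file between check and rmdir
  end
end

def demo_prune
  Dir.mktmpdir do |sessions|
    FileUtils.mkdir_p(File.join(sessions, "empty_session"))
    FileUtils.mkdir_p(File.join(sessions, "busy_session"))
    File.write(File.join(sessions, "busy_session", "paste_1.txt"), "kept")

    prune_empty_dirs_sketch(sessions)
    Dir.children(sessions).sort
  end
end

p demo_prune  # ["busy_session"]
```

Dir.rmdir refuses to remove a non-empty directory, so even if the emptiness check races with a writer, a directory that still has contents is left alone.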
.sanitize_call_id(call_id) ⇒ Object
Mirrors ToolExecutor#spill_full_output’s filename sanitization so the path we delete matches the path that was written.

    # File 'lib/rubino/util/spill_store.rb', line 147
    def sanitize_call_id(call_id)
      id = call_id.to_s.gsub(/[^a-zA-Z0-9_.-]/, "_")
      id.empty? ? nil : id
    end
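
To see what the substitution does to a few inputs, the sketch below copies the method body under a hypothetical name, `sanitize_call_id_sketch`, so it can run on its own. The sample call ids are made up for illustration.

```ruby
# Standalone copy of the sanitization rule, under a hypothetical name.
def sanitize_call_id_sketch(call_id)
  # Anything outside letters, digits, "_", "." and "-" becomes "_".
  id = call_id.to_s.gsub(/[^a-zA-Z0-9_.-]/, "_")
  id.empty? ? nil : id
end

p sanitize_call_id_sketch("call_abc-123")      # "call_abc-123"
p sanitize_call_id_sketch("toolu 01/x")        # "toolu_01_x"
p sanitize_call_id_sketch("../../etc/passwd")  # ".._.._etc_passwd"
p sanitize_call_id_sketch(nil)                 # nil
```

Because "/" is always replaced, the resulting name cannot contain a path separator, so the derived `<call_id>.txt` path stays inside the tool-results directory.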
.sessions_dir ⇒ Object
The directory holding all per-session subtrees (each session’s pastes live in <sessions>/<id>/paste_N.txt).

    # File 'lib/rubino/util/spill_store.rb', line 43
    def sessions_dir
      File.join(Rubino.home_path, "sessions")
    end
.stat_glob(pattern) ⇒ Object
Stats every path matching pattern and returns a { path:, size:, mtime: } record for each regular file. Paths that are not regular files, or that cannot be stat'ed, are skipped.

    # File 'lib/rubino/util/spill_store.rb', line 120
    def stat_glob(pattern)
      Dir.glob(pattern).filter_map do |path|
        stat = File.stat(path)
        next unless stat.file?

        { path: path, size: stat.size, mtime: stat.mtime }
      rescue StandardError
        nil
      end
    end
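
A short demonstration of the filtering, using a standalone copy of the method under the hypothetical name `stat_glob_sketch`. The temporary directory and file names are made up for the example; the point is that a directory whose name matches the glob does not produce a record.

```ruby
require "fileutils"
require "tmpdir"

# Standalone copy of the glob-and-stat helper, under a hypothetical name.
def stat_glob_sketch(pattern)
  Dir.glob(pattern).filter_map do |path|
    stat = File.stat(path)
    next unless stat.file? # skip directories and other non-regular entries

    { path: path, size: stat.size, mtime: stat.mtime }
  rescue StandardError
    nil # a file that vanished or is unreadable is simply left out
  end
end

def demo_stat_glob
  Dir.mktmpdir do |dir|
    File.write(File.join(dir, "a.txt"), "12345")
    FileUtils.mkdir_p(File.join(dir, "subdir.txt")) # matches the glob, not a file

    stat_glob_sketch(File.join(dir, "*.txt")).map do |f|
      [File.basename(f[:path]), f[:size]]
    end
  end
end

p demo_stat_glob  # [["a.txt", 5]]
```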