Class: Pikuri::Workspace::Filesystem
- Inherits:
-
Object
- Object
- Pikuri::Workspace::Filesystem
- Defined in:
- lib/pikuri/workspace/filesystem.rb
Overview
Defines which paths the agent can see and write to. Constructed with explicit readable / writable prefix lists; every Read/Write/Edit/Grep/Glob/Bash path the agent supplies is checked against those lists before touching the filesystem. Returned Pathnames are absolute, post-symlink-resolution.
Project root, readable, writable
project_root is the writable containment ceiling: it is automatically folded into both readable and writable, so you can read and write anywhere under the project unconditionally. It is also the base for resolving relative paths supplied by the LLM, and the chdir target that tools like Bash / Grep / Glob pass to the subprocess. There is deliberately no separate “cwd” concept: hosts that want the agent to operate inside a specific subtree pass that subtree as project_root. bin/pikuri-code enforces this by Dir.chdir'ing the Ruby process to the discovered project root at startup if it differs from the launch Dir.pwd, so the workspace and the surrounding process agree on one anchor.
The extra readable list grants read-only access to additional roots (system toolchains, dependency caches, skill catalogs); the extra writable list grants read+write to additional roots (other project directories the agent should be able to touch).
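The relationship between the three inputs can be sketched as follows. This is a minimal illustration that mirrors the constructor shown later on this page; build_roots is a hypothetical helper and not part of Pikuri.

```ruby
require 'pathname'
require 'tmpdir'

# Hypothetical helper (not part of Pikuri): folds project_root into both
# lists, so every writable root is also readable.
def build_roots(project_root:, readable: [], writable: [])
  root = Pathname.new(project_root).realpath
  write_set = ([root] + writable.map { |p| Pathname.new(p).realpath }).uniq
  read_set = (write_set + readable.map { |p| Pathname.new(p).realpath }).uniq
  { writable: write_set, readable: read_set }
end

Dir.mktmpdir do |dir|
  project = File.join(dir, 'project')
  dep_cache = File.join(dir, 'dep-cache')
  Dir.mkdir(project)
  Dir.mkdir(dep_cache)

  roots = build_roots(project_root: project, readable: [dep_cache])
  puts "writable: #{roots[:writable].map(&:to_s)}"
  puts "readable: #{roots[:readable].map(&:to_s)}"
end
```

In this sketch the dependency cache ends up in the read set only, while the project root appears in both.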
Session umbrella (#internal_temp)
Every workspace owns a per-process umbrella dir at ~/.cache/pikuri/workspace-XXX/ (#internal_temp). It is minted lazily on first access — workspaces that never touch it (most specs, hosts that don’t want a playground and don’t use the bubblewrap overlay) pay nothing — and removed by a single at_exit handler when the process exits. Everything ephemeral this workspace produces lives inside the umbrella, so one remove_entry at process exit cleans the lot:
- #temp: the LLM-visible playground subdir, present only when temp: true (see below).
- The bubblewrap sandbox’s per-toolchain overlay state (overlay-<slug>/{upper,work} for ~/.gradle/caches, ~/.m2/repository, …), used to keep cross-project toolchain caches isolated to one pikuri-code session. See Code::Bash::Sandbox::Bubblewrap.
The umbrella deliberately lives in ~/.cache/pikuri rather than /tmp: the Code::Bash::Sandbox::Bubblewrap sandbox binds #temp at /tmp inside the sandbox (so the LLM’s reflexive /tmp writes persist across bash calls). With the umbrella already under /tmp, that bind would land on top of itself and per-call mountpoint creation would pollute the dir recursively. ~/.cache/pikuri avoids the collision.
At gem load, Filesystem.sweep_stale_internal_temps! prunes umbrella dirs older than seven days — a safety net for sessions that died before at_exit could run (SIGKILL, OOM). Recent umbrellas are left alone so a concurrent pikuri-code in another shell isn’t disturbed.
Optional temp playground
When constructed with temp: true, the workspace adds <internal_temp>/playground to writable and exposes it via #temp. The binary advertises this path to the LLM (e.g. in the system prompt) as scratch space. Default is false: specs and tests that build many workspaces don’t pay the mkdir cost, and hosts that don’t need a playground don’t get one. Either way, the umbrella is shared with everything else that wants ephemeral state — no second tempdir is minted.
Optional /tmp alias
When alias_tmp_to_temp: true AND temp: is set, file paths supplied to #resolve_for_read / #resolve_for_write that start with /tmp/ (or are exactly /tmp) are rewritten to the host #temp path before containment is checked. This is the host-side counterpart to the sandbox’s --bind <temp> /tmp: bash inside the sandbox writes to /tmp/foo, then the file tools (which run on the host, not in the sandbox) accept the same /tmp/foo path and resolve it to the workspace temp’s host path. Without this, the LLM would have to remember two paths for the same dir. Off by default; bin/pikuri-code flips it on when the bubblewrap sandbox is enabled.
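A minimal sketch of the rewrite step, assuming it is a plain prefix substitution applied before containment is checked. alias_tmp is a hypothetical name, and the playground path in the usage lines is made up for illustration.

```ruby
# Hypothetical helper (not Pikuri's API): map /tmp and /tmp/... onto the
# workspace temp dir. Paths such as /tmpfoo are left alone because only
# the exact /tmp directory and its children are aliased.
def alias_tmp(path, temp)
  str = path.to_s
  return str if temp.nil?
  return temp.to_s if str == '/tmp'
  return File.join(temp.to_s, str.delete_prefix('/tmp/')) if str.start_with?('/tmp/')

  str
end

playground = '/ws/playground' # illustrative host path, not a real default
puts alias_tmp('/tmp/foo', playground)   # => /ws/playground/foo
puts alias_tmp('/tmp', playground)       # => /ws/playground
puts alias_tmp('lib/foo.rb', playground) # => lib/foo.rb
```

The rewritten path would still go through the normal containment check afterwards, so an input like /tmp/../etc gains nothing from the alias.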
Read-set vs. write-set
#resolve_for_read checks against readable ∪ writable (you can read anything writable). #resolve_for_write checks against writable only. Tools that mutate state route through the second method; tools that only inspect route through the first.
Existence is not the workspace’s concern
resolve_for_read('foo.rb') succeeds (returns a Pathname) even if foo.rb doesn’t exist; the caller (Read) errors with file-not-found when it tries to open it. resolve_for_write tolerates entirely non-existent paths (Write can create lib/new/dir/foo.rb even when lib/new/ doesn’t exist); the caller is responsible for any mkdir_p before writing. This split keeps the workspace narrowly responsible for containment, not for filesystem-state checks.
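The caller-side half of this split might look like the sketch below. Both helpers are hypothetical, and `resolved` stands in for a Pathname that the workspace has already approved.

```ruby
require 'fileutils'
require 'pathname'
require 'tmpdir'

# Hypothetical caller-side helper: creating missing parents is the
# caller's job, not the workspace's.
def write_resolved(resolved, content)
  FileUtils.mkdir_p(resolved.dirname)
  resolved.write(content)
  resolved
end

# Hypothetical caller-side helper: existence is only discovered here,
# when the file is actually opened.
def read_resolved(resolved)
  resolved.read
rescue Errno::ENOENT
  raise ArgumentError, "file not found: #{resolved}"
end

Dir.mktmpdir do |dir|
  target = Pathname.new(dir) + 'lib/new/dir/foo.rb'
  write_resolved(target, "puts 'hi'\n")
  puts read_resolved(target)
end
```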
Subprocess environment (#env)
Workspaces own a #env Hash<String,String> that subprocess-spawning tools (currently Code::Bash) thread into Subprocess.spawn. The motivating case: the bubblewrap sandbox doesn’t bind-mount ~/.gitconfig, so git commit inside fails (modern git refuses rather than synthesizing a default from /etc/passwd, which isn’t bind-mounted either). The workspace resolves the host’s effective git identity at #project_root (so includeIf rules apply: a repo under ~/work/my/ that’s covered by a gitdir:~/work/my/ include gets that identity, not the global default) and exposes it as GIT_AUTHOR_* / GIT_COMMITTER_* env vars that override config entirely and need no file in the sandbox.
The lookup is lazy and memoized: the constructor doesn’t shell out, so building a workspace stays cheap (specs that never read #env pay nothing). First access runs git -C project_root config user.{name,email} once. Falls back to {} if git isn’t on PATH or no identity is configured — the subprocess then runs unmediated and git’s own “please tell me who you are” surfaces inside the sandbox.
Hosts that want a different env (extra vars, no git resolution, explicit identity) pass env: to the constructor; that value is used verbatim and the git lookup is skipped. Always frozen by the time it leaves the workspace.
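A rough sketch of how such a lookup could be written. This is not the library's compute_git_identity_env: the helper names are hypothetical, the sketch issues one git call per key, and it assumes that both name and email must be present for any variables to be set.

```ruby
require 'open3'

# Hypothetical: build the four variables git honors from a name/email
# pair. Returns an empty hash when either part is missing (assumption).
def identity_env(name, email)
  return {}.freeze if name.to_s.empty? || email.to_s.empty?

  {
    'GIT_AUTHOR_NAME' => name,
    'GIT_AUTHOR_EMAIL' => email,
    'GIT_COMMITTER_NAME' => name,
    'GIT_COMMITTER_EMAIL' => email
  }.freeze
end

# Hypothetical: ask git for one config key as seen from project_root, so
# includeIf rules that depend on the repo location apply.
def git_config_value(project_root, key)
  out, status = Open3.capture2('git', '-C', project_root.to_s, 'config', key,
                               err: File::NULL)
  status.success? ? out.strip : nil
rescue Errno::ENOENT
  nil # git is not on PATH
end

env = identity_env(git_config_value(Dir.pwd, 'user.name'),
                   git_config_value(Dir.pwd, 'user.email'))
puts env.keys.inspect
```

The resulting hash is frozen before it is handed out, matching the "always frozen" guarantee described above.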
Containment algorithm
#resolve walks up the input path to its deepest existing ancestor, realpaths that ancestor (resolving any symlinks in the existing portion), then verifies the resolved base matches one of the candidate roots. Four cases:
1. lib/foo.rb (exists) → existing = full path, base matches a root → returns the realpath’d file.
2. lib/new/dir/foo.rb (intermediates missing) → walks up to the deepest existing parent inside a root → returns the intended new path (caller mkdir_ps the parent before writing).
3. lib/../../etc/passwd (.. escape) → cleanpath collapses .. syntactically, the walk lands outside every root → Error.
4. link/foo.rb where link → /etc (symlink escape) → walks to link (which exists), realpath resolves through the symlink to /etc, outside every root → Error.
Pure lexical normalization (cleanpath + prefix check) catches cases 1–3 but misses case 4. The walk-up realpath pass closes that gap.
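The walk-up strategy can be sketched in a few lines. This is a hypothetical re-implementation for illustration, not the private #resolve itself: resolve_within and ContainmentError are made-up names, and the treatment of dangling symlinks is an assumption of this sketch.

```ruby
require 'fileutils'
require 'pathname'
require 'tmpdir'

class ContainmentError < StandardError; end

# Hypothetical sketch. `roots` must already be realpath'd; `base` is the
# directory that relative inputs are anchored to.
def resolve_within(path, roots, base:)
  target = Pathname.new(path)
  target = Pathname.new(base) + target unless target.absolute?
  target = target.cleanpath # collapses ".." lexically

  # Walk up to the deepest ancestor present on disk. A dangling symlink
  # counts as present so that it cannot be skipped over.
  existing = target
  existing = existing.parent until existing.exist? || existing.symlink? || existing.root?

  begin
    real = existing.realpath # resolves symlinks in the existing portion
  rescue Errno::ENOENT
    raise ContainmentError, "#{path} passes through a dangling symlink"
  end

  # Re-attach the not-yet-existing tail, if any.
  rest = target.relative_path_from(existing)
  resolved = rest.to_s == '.' ? real : real + rest

  inside = roots.any? { |root| resolved == root || resolved.to_s.start_with?("#{root}/") }
  raise ContainmentError, "#{path} is outside the workspace" unless inside

  resolved
end

Dir.mktmpdir do |dir|
  base = Pathname.new(dir).realpath
  project = base + 'project'
  FileUtils.mkdir_p(project + 'lib')
  FileUtils.mkdir_p(base + 'outside')
  File.symlink((base + 'outside').to_s, (project + 'link').to_s)

  puts resolve_within('lib/new/dir/foo.rb', [project], base: project)
  begin
    resolve_within('link/foo.rb', [project], base: project)
  rescue ContainmentError => e
    puts "rejected: #{e.message}"
  end
end
```

The prefix test appends a trailing slash to each root so that a sibling such as project-old is not mistaken for a path inside project.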
Project-root denylist
Setting project_root to a system root or a user’s home directory is almost always a misconfiguration: it makes the entire system or home tree writable. The constructor rejects DENIED_PROJECT_ROOTS (system tops: /, /etc, /var, …) and any directory whose parent is /home (catches /home/$USER and /home/$OTHER_USER). This is a sanity guard against fat-fingering, not a security perimeter; the real security is the readable/writable lists. Other-OS home roots (/Users/$USER on macOS) are not denied; Linux-first per CLAUDE.md.
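The two checks can be sketched as below. The list is abbreviated for illustration, and denied_project_root? is a hypothetical stand-in for the constructor's private validation.

```ruby
require 'pathname'

# Abbreviated, illustrative list; the real DENIED_PROJECT_ROOTS is longer.
SAMPLE_DENIED_ROOTS = %w[/ /etc /var /usr /tmp].map { |p| Pathname.new(p) }.freeze

# Hypothetical predicate: exact match against the list, plus the
# parent-is-/home rule for home directories.
def denied_project_root?(root)
  root = Pathname.new(root).cleanpath
  SAMPLE_DENIED_ROOTS.include?(root) || root.parent == Pathname.new('/home')
end

puts denied_project_root?('/etc')                # => true
puts denied_project_root?('/home/alice')         # => true
puts denied_project_root?('/home/alice/project') # => false
```

Because the match is exact rather than prefix-based, a directory such as /etc/myapp would pass this check.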
Direct Known Subclasses
Defined Under Namespace
Constant Summary
- CACHE_BASE =
Parent directory under which every workspace mints its umbrella (#internal_temp). Honors XDG_CACHE_HOME when set, else ~/.cache; the pikuri subdir is owned by us. mkdir_p'd lazily on first umbrella access.
File.join(ENV['XDG_CACHE_HOME'] || File.join(Dir.home, '.cache'), 'pikuri')
- INTERNAL_TEMP_STALE_SECONDS =
Umbrella dirs older than this are reaped by sweep_stale_internal_temps! at gem load. Generous enough that a long-lived pikuri session in another shell isn’t disturbed; tight enough that a process killed last week doesn’t leak forever.
7 * 24 * 60 * 60
- DENIED_PROJECT_ROOTS =
System-root project_roots the constructor refuses. Exact-match (not prefix): /home/user/project passes, /home/user is rejected by the parent-is-/home check below. Frozen list; downstream hosts with unusual layouts can subclass if they really need a different policy.
%w[ / /etc /var /proc /sys /dev /boot /root /usr /opt /lib /lib64 /bin /sbin /tmp ].map { |p| Pathname.new(p) }.freeze
Instance Attribute Summary
- #alias_tmp_to_temp ⇒ Boolean (readonly)
Whether #resolve_for_read / #resolve_for_write rewrite /tmp/* inputs to #temp.
- #project_root ⇒ Pathname (readonly)
Project root, post-realpath.
- #readable ⇒ Array<Pathname> (readonly)
Read-only roots (in addition to writable ones, which are also readable).
- #temp ⇒ Pathname? (readonly)
The LLM-visible scratch playground (writable, owned by this workspace) when constructed with temp: true, else nil.
- #writable ⇒ Array<Pathname> (readonly)
Writable roots (read+write).
Class Method Summary
- .mint_internal_temp ⇒ Object
Mints the umbrella dir; the at_exit FileUtils.remove_entry call guards against the dir being already gone (test cleanup, manual rm).
- .sweep_stale_internal_temps! ⇒ void
Reap workspace-* umbrella dirs that have outlived INTERNAL_TEMP_STALE_SECONDS.
Instance Method Summary
- #env ⇒ Hash{String=>String}
Environment variables for subprocesses spawned in this workspace.
- #initialize(project_root:, readable: [], writable: [], temp: false, alias_tmp_to_temp: false, env: nil) ⇒ Filesystem (constructor)
A new instance of Filesystem.
- #internal_temp ⇒ Pathname
Per-workspace ephemeral umbrella.
- #resolve_for_read(path) ⇒ Pathname
Resolve a user-supplied path against the read-set (readable ∪ writable).
- #resolve_for_write(path) ⇒ Pathname
Resolve a user-supplied path against the write-set.
Constructor Details
#initialize(project_root:, readable: [], writable: [], temp: false, alias_tmp_to_temp: false, env: nil) ⇒ Filesystem
Returns a new instance of Filesystem.
# File 'lib/pikuri/workspace/filesystem.rb', line 248

def initialize(project_root:, readable: [], writable: [], temp: false,
               alias_tmp_to_temp: false, env: nil)
  @project_root = Pathname.new(project_root).realpath
  validate_project_root!(@project_root)
  @internal_temp = nil
  @temp = temp ? mint_playground : nil
  @alias_tmp_to_temp = alias_tmp_to_temp && !@temp.nil?
  @env_override = env
  @writable = ([@project_root] +
               writable.map { |p| Pathname.new(p).realpath } +
               [@temp].compact).uniq
  @readable = (@writable + readable.map { |p| Pathname.new(p).realpath }).uniq
end
Instance Attribute Details
#alias_tmp_to_temp ⇒ Boolean (readonly)
Returns whether #resolve_for_read / #resolve_for_write rewrite /tmp/* inputs to #temp.
# File 'lib/pikuri/workspace/filesystem.rb', line 263

def alias_tmp_to_temp
  @alias_tmp_to_temp
end
#project_root ⇒ Pathname (readonly)
# File 'lib/pikuri/workspace/filesystem.rb', line 204

def project_root
  @project_root
end
#readable ⇒ Array<Pathname> (readonly)
Returns read-only roots (in addition to writable ones, which are also readable). Post-realpath, deduped.
# File 'lib/pikuri/workspace/filesystem.rb', line 208

def readable
  @readable
end
#temp ⇒ Pathname? (readonly)
Returns the LLM-visible scratch playground (writable, owned by this workspace) when constructed with temp: true, else nil. Lives at <internal_temp>/playground; removed transitively when the umbrella is wiped on process exit.
# File 'lib/pikuri/workspace/filesystem.rb', line 219

def temp
  @temp
end
#writable ⇒ Array<Pathname> (readonly)
Returns writable roots (read+write). Includes #project_root and, if temp: true, #temp. Post-realpath, deduped.
# File 'lib/pikuri/workspace/filesystem.rb', line 213

def writable
  @writable
end
Class Method Details
.mint_internal_temp ⇒ Object
Mints the umbrella dir under CACHE_BASE. The at_exit FileUtils.remove_entry call guards against the dir being already gone (test cleanup, manual rm).
# File 'lib/pikuri/workspace/filesystem.rb', line 297

def self.mint_internal_temp
  FileUtils.mkdir_p(CACHE_BASE)
  path = Pathname.new(Dir.mktmpdir('workspace-', CACHE_BASE)).realpath
  at_exit { FileUtils.remove_entry(path.to_s) if path.exist? }
  path
end
.sweep_stale_internal_temps! ⇒ void
This method returns an undefined value.
Reap workspace-* umbrella dirs that have outlived INTERNAL_TEMP_STALE_SECONDS. Called once at gem load via Pikuri::Workspace so each process boot inherits a tidy CACHE_BASE. Failures (permission denied, racing concurrent sweeper) are swallowed — best-effort cleanup, the real at_exit path is the load-bearing one.
# File 'lib/pikuri/workspace/filesystem.rb', line 312

def self.sweep_stale_internal_temps!
  return unless File.directory?(CACHE_BASE)

  cutoff = Time.now - INTERNAL_TEMP_STALE_SECONDS
  Dir.children(CACHE_BASE).each do |entry|
    next unless entry.start_with?('workspace-')
    path = File.join(CACHE_BASE, entry)
    next unless File.directory?(path)
    next if File.mtime(path) > cutoff

    FileUtils.remove_entry(path)
  rescue StandardError
    # best-effort sweep; never block the host on dead state
  end
end
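To see the age cutoff in action without touching a real cache directory, the same logic can be run against a throwaway directory. sweep_stale below is a hypothetical parameterized variant written for this illustration; it takes the base directory and the age limit as arguments instead of reading the class constants.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical variant of the sweep that reports what it removed.
def sweep_stale(base, max_age_seconds, now: Time.now)
  return [] unless File.directory?(base)

  cutoff = now - max_age_seconds
  removed = []
  Dir.children(base).each do |entry|
    next unless entry.start_with?('workspace-')
    path = File.join(base, entry)
    next unless File.directory?(path)
    next if File.mtime(path) > cutoff

    FileUtils.remove_entry(path)
    removed << entry
  rescue StandardError
    next # best-effort: skip entries that cannot be inspected or removed
  end
  removed
end

Dir.mktmpdir do |base|
  old_dir = File.join(base, 'workspace-old')
  new_dir = File.join(base, 'workspace-new')
  Dir.mkdir(old_dir)
  Dir.mkdir(new_dir)

  eight_days_ago = Time.now - (8 * 24 * 60 * 60)
  File.utime(eight_days_ago, eight_days_ago, old_dir) # backdate the mtime

  puts sweep_stale(base, 7 * 24 * 60 * 60).inspect
end
```

Only the backdated directory is reaped; the fresh one is left alone, as is anything that does not carry the workspace- prefix.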
Instance Method Details
#env ⇒ Hash{String=>String}
Environment variables for subprocesses spawned in this workspace. Lazy, memoized, frozen.
When the constructor received env: nil (the default), the first call here runs git -C project_root config user.name + user.email once and returns GIT_AUTHOR_* / GIT_COMMITTER_* accordingly. When the constructor received an explicit hash, this returns that hash (frozen) and never shells out. Returns {} if git resolution finds no identity for #project_root or git isn’t on PATH.
# File 'lib/pikuri/workspace/filesystem.rb', line 277

def env
  @env ||= (@env_override || compute_git_identity_env).freeze
end
#internal_temp ⇒ Pathname
Per-workspace ephemeral umbrella. Minted lazily on first call under CACHE_BASE. Registered for at_exit removal the moment it’s minted, so anything subsequently placed inside (the playground, the overlay state of Code::Bash::Sandbox::Bubblewrap) gets wiped together. Callers that want ephemeral state owned by the workspace should put it under this dir rather than minting their own siblings.
# File 'lib/pikuri/workspace/filesystem.rb', line 290

def internal_temp
  @internal_temp ||= Filesystem.mint_internal_temp
end
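The lazy-mint behaviour can be observed with a small stand-in class. LazyUmbrella is hypothetical and exists only for this illustration; it takes the cache base as an argument so the sketch never touches ~/.cache.

```ruby
require 'fileutils'
require 'pathname'
require 'tmpdir'

# Hypothetical miniature of the umbrella: nothing is created on disk
# until internal_temp is first called.
class LazyUmbrella
  def initialize(cache_base)
    @cache_base = cache_base
    @internal_temp = nil
  end

  def minted?
    !@internal_temp.nil?
  end

  def internal_temp
    @internal_temp ||= mint
  end

  private

  def mint
    FileUtils.mkdir_p(@cache_base)
    path = Pathname.new(Dir.mktmpdir('workspace-', @cache_base)).realpath
    # Registered immediately, so everything placed inside goes with it.
    at_exit { FileUtils.remove_entry(path.to_s) if path.exist? }
    path
  end
end

scratch = Dir.mktmpdir
umbrella = LazyUmbrella.new(File.join(scratch, 'pikuri'))
puts umbrella.minted?                    # => false
puts umbrella.internal_temp.directory?   # => true
puts umbrella.minted?                    # => true
FileUtils.remove_entry(scratch)
```

Repeated calls return the memoized Pathname, so only one directory and one at_exit handler exist per instance.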
#resolve_for_read(path) ⇒ Pathname
Resolve a user-supplied path against the read-set (readable ∪ writable). Returned Pathname is absolute and may not exist on disk; the caller validates existence separately.
# File 'lib/pikuri/workspace/filesystem.rb', line 335

def resolve_for_read(path)
  resolve(path, @readable)
end
#resolve_for_write(path) ⇒ Pathname
Resolve a user-supplied path against the write-set.
# File 'lib/pikuri/workspace/filesystem.rb', line 344

def resolve_for_write(path)
  resolve(path, @writable)
end