Class: Pikuri::Workspace::Filesystem

Inherits:
Object
Defined in:
lib/pikuri/workspace/filesystem.rb

Overview

Defines which paths the agent can see and write to. Constructed with explicit readable / writable prefix lists; every Read/Write/Edit/Grep/Glob/Bash path the agent supplies is checked against those lists before touching the filesystem. Returned Pathnames are absolute, post-symlink-resolution.

Project root, readable, writable

project_root is the writable containment ceiling: it is automatically folded into both readable and writable, so you can read and write anywhere under the project unconditionally. It is also the base for resolving relative paths supplied by the LLM, and the chdir target tools like Bash / Grep / Glob pass to the subprocess. There is deliberately no separate “cwd” concept: hosts that want the agent to operate inside a specific subtree pass that subtree as project_root. bin/pikuri-code enforces this by Dir.chdir'ing the Ruby process to the discovered project root at startup if it differs from the launch Dir.pwd, so the workspace and the surrounding process agree on one anchor.

The extra readable list grants read-only access to additional roots (system toolchains, dependency caches, skill catalogs); the extra writable list grants read+write to additional roots (other project directories the agent should be able to touch).

Session umbrella (#internal_temp)

Every workspace owns a per-process umbrella dir at ~/.cache/pikuri/workspace-XXX/ (#internal_temp). It is minted lazily on first access — workspaces that never touch it (most specs, hosts that don’t want a playground and don’t use the bubblewrap overlay) pay nothing — and removed by a single at_exit handler when the process exits. Everything ephemeral this workspace produces lives inside the umbrella, so one remove_entry at process exit cleans the lot:

  • #temp — the LLM-visible playground subdir, present only when temp: true (see below).

  • The bubblewrap sandbox’s per-toolchain overlay state (overlay-<slug>/{upper,work} for ~/.gradle/caches, ~/.m2/repository, …), used to keep cross-project toolchain caches isolated to one pikuri-code session. See Code::Bash::Sandbox::Bubblewrap.

The umbrella deliberately lives in ~/.cache/pikuri rather than /tmp: the Code::Bash::Sandbox::Bubblewrap sandbox binds #temp at /tmp inside the sandbox (so the LLM’s reflexive /tmp writes persist across bash calls). With the umbrella already under /tmp, that bind would land on top of itself and per-call mountpoint creation would pollute the dir recursively. ~/.cache/pikuri avoids the collision.

At gem load, Filesystem.sweep_stale_internal_temps! prunes umbrella dirs older than seven days — a safety net for sessions that died before at_exit could run (SIGKILL, OOM). Recent umbrellas are left alone so a concurrent pikuri-code in another shell isn’t disturbed.
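The age-based pruning described above can be illustrated with a small standalone sketch. It does not call the real Filesystem.sweep_stale_internal_temps!; sweep_stale, its parameters, and the demo directory names are hypothetical stand-ins, and the sketch runs against a throwaway directory rather than the real cache location.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical stand-in for the sweep: removes "workspace-*" directories under
# base whose mtime is older than max_age seconds. Returns the removed names.
def sweep_stale(base, max_age, now: Time.now)
  return [] unless File.directory?(base)

  cutoff = now - max_age
  Dir.children(base).sort.each_with_object([]) do |entry, removed|
    path = File.join(base, entry)
    next unless entry.start_with?('workspace-') && File.directory?(path)
    next if File.mtime(path) > cutoff # recent: may belong to a live session

    FileUtils.remove_entry(path)
    removed << entry
  rescue StandardError
    # best-effort: skip entries that cannot be inspected or removed
  end
end

# Demo in a throwaway directory (not the real cache base).
SWEEP_DEMO = Dir.mktmpdir('sweep-demo-')
at_exit { FileUtils.remove_entry(SWEEP_DEMO) if Dir.exist?(SWEEP_DEMO) }

SWEEP_WEEK = 7 * 24 * 60 * 60
%w[workspace-old workspace-fresh unrelated-old].each do |name|
  Dir.mkdir(File.join(SWEEP_DEMO, name))
end

# Backdate two of the directories past the one-week threshold.
stale_time = Time.now - SWEEP_WEEK - 60
%w[workspace-old unrelated-old].each do |name|
  File.utime(stale_time, stale_time, File.join(SWEEP_DEMO, name))
end

SWEPT = sweep_stale(SWEEP_DEMO, SWEEP_WEEK)
```

In this sketch only the backdated workspace-old directory is removed: workspace-fresh is too recent, and unrelated-old lacks the workspace- prefix.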

Optional temp playground

When constructed with temp: true, the workspace adds <internal_temp>/playground to writable and exposes it via #temp. The binary advertises this path to the LLM (e.g. in the system prompt) as scratch space. Default is false: specs and tests that build many workspaces don’t pay the mkdir cost, and hosts that don’t need a playground don’t get one. Either way, the umbrella is shared with everything else that wants ephemeral state — no second tempdir is minted.

Optional /tmp alias

When alias_tmp_to_temp: true AND temp: is set, file paths supplied to #resolve_for_read / #resolve_for_write that start with /tmp/ (or are exactly /tmp) are rewritten to the host #temp path before containment is checked. This is the host-side counterpart to the sandbox’s --bind <temp> /tmp: bash inside the sandbox writes to /tmp/foo, then the file tools (which run on the host, not in the sandbox) accept the same /tmp/foo path and resolve it to the workspace temp’s host path. Without this, the LLM would have to remember two paths for the same dir. Off by default; bin/pikuri-code flips it on when the bubblewrap sandbox is enabled.
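A minimal sketch of the rewrite rule, assuming it is a plain prefix substitution. alias_tmp and the example playground path are hypothetical; the real rewrite happens inside the workspace before the containment check, which this sketch does not perform.

```ruby
# Hypothetical helper: maps "/tmp" and "/tmp/..." onto the workspace temp dir.
# Inputs that merely begin with the characters "/tmp" (such as "/tmpfoo") and
# all other paths are returned unchanged. A nil temp disables the alias.
def alias_tmp(path, temp)
  return path if temp.nil?
  return temp.to_s if path == '/tmp'
  return File.join(temp.to_s, path.delete_prefix('/tmp/')) if path.start_with?('/tmp/')

  path
end

# Illustrative playground location; the real one is minted per workspace.
ALIAS_DEMO_TEMP = '/home/demo/.cache/pikuri/workspace-abc/playground'

ALIAS_RESULTS = {
  '/tmp/foo' => alias_tmp('/tmp/foo', ALIAS_DEMO_TEMP),
  '/tmp' => alias_tmp('/tmp', ALIAS_DEMO_TEMP),
  '/tmpfoo' => alias_tmp('/tmpfoo', ALIAS_DEMO_TEMP),
  'lib/a.rb' => alias_tmp('lib/a.rb', ALIAS_DEMO_TEMP)
}.freeze
```

Because the rewritten path still goes through containment afterwards, an input such as /tmp/../etc/passwd gains nothing from the alias: after rewriting it points outside the playground and is rejected like any other escape.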

Read-set vs. write-set

#resolve_for_read checks against readable ∪ writable (you can read anything writable). #resolve_for_write checks against writable only. Tools that mutate state route through the second method; tools that only inspect route through the first.

Existence is not the workspace’s concern

resolve_for_read('foo.rb') succeeds (returns a Pathname) even if foo.rb doesn’t exist; the caller (Read) errors with file-not-found when it tries to open it. resolve_for_write tolerates entirely non-existent paths (Write can create lib/new/dir/foo.rb even when lib/new/ doesn’t exist); the caller is responsible for any mkdir_p before writing. This split keeps the workspace narrowly responsible for containment, not for filesystem-state checks.
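The caller-side half of that split might look like the following sketch. write_resolved and read_resolved are hypothetical helpers, not part of Pikuri; the Pathname arguments stand in for values a workspace would have returned from resolve_for_write / resolve_for_read.

```ruby
require 'fileutils'
require 'pathname'
require 'tmpdir'

# Hypothetical Write-style caller: creates missing parent directories itself,
# because the workspace only vouches for containment.
def write_resolved(target, content)
  FileUtils.mkdir_p(target.dirname.to_s)
  target.write(content)
  target
end

# Hypothetical Read-style caller: the existence check happens here, at open time.
def read_resolved(target)
  [:ok, target.read]
rescue Errno::ENOENT
  [:not_found, target.to_s]
end

# Demo root standing in for a project root.
WRITE_DEMO_ROOT = Pathname.new(Dir.mktmpdir('write-demo-')).realpath
at_exit { FileUtils.remove_entry(WRITE_DEMO_ROOT.to_s) if WRITE_DEMO_ROOT.exist? }

WRITTEN = write_resolved(WRITE_DEMO_ROOT + 'lib/new/dir/foo.rb', "puts :hi\n")
MISSING = read_resolved(WRITE_DEMO_ROOT + 'nope.rb')
```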

Subprocess environment (#env)

Workspaces own a #env Hash<String,String> that subprocess-spawning tools (currently Code::Bash) thread into Subprocess.spawn. The motivating case: the bubblewrap sandbox doesn’t bind-mount ~/.gitconfig, so git commit inside fails (modern git refuses rather than synthesizing a default from /etc/passwd, which isn’t bind-mounted either). The workspace resolves the host’s effective git identity at #project_root (so includeIf rules apply: a repo under ~/work/my/ that’s covered by a gitdir:~/work/my/ include gets that identity, not the global default) and exposes it as GIT_AUTHOR_* / GIT_COMMITTER_* env vars that override config entirely and need no file in the sandbox.

The lookup is lazy and memoized: the constructor doesn’t shell out, so building a workspace stays cheap (specs that never read #env pay nothing). First access runs git -C project_root config user.{name,email} once. Falls back to {} if git isn’t on PATH or no identity is configured — the subprocess then runs unmediated and git’s own “please tell me who you are” surfaces inside the sandbox.

Hosts that want a different env (extra vars, no git resolution, explicit identity) pass env: to the constructor; that value is used verbatim and the git lookup is skipped. Always frozen by the time it leaves the workspace.
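A rough sketch of the lazy, memoized lookup described in this section. The helper names (git_config_value, identity_env, LazyEnv) are hypothetical, and the choice to return {} when either the name or the email is missing is an assumption; the real compute_git_identity_env is not shown on this page.

```ruby
require 'open3'

# Hypothetical helper: reads one git config key as seen from project_root.
# Returns nil when the key is unset or git is not on PATH.
def git_config_value(project_root, key)
  out, status = Open3.capture2('git', '-C', project_root.to_s, 'config', key, err: File::NULL)
  status.success? ? out.strip : nil
rescue Errno::ENOENT
  nil
end

# Builds the identity variables git honors in place of user.name / user.email.
# Assumption: an incomplete identity yields an empty hash.
def identity_env(name, email)
  return {} if name.nil? || name.empty? || email.nil? || email.empty?

  {
    'GIT_AUTHOR_NAME' => name, 'GIT_AUTHOR_EMAIL' => email,
    'GIT_COMMITTER_NAME' => name, 'GIT_COMMITTER_EMAIL' => email
  }
end

# Hypothetical holder showing the lazy + memoized + frozen shape.
class LazyEnv
  def initialize(project_root, override: nil)
    @project_root = project_root
    @override = override
  end

  # No shell-out happens until the first call; an explicit override skips git.
  def env
    @env ||= (@override || identity_env(
      git_config_value(@project_root, 'user.name'),
      git_config_value(@project_root, 'user.email')
    )).freeze
  end
end

OVERRIDDEN_ENV = LazyEnv.new(Dir.pwd, override: { 'FOO' => 'bar' }).env
```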

Containment algorithm

#resolve walks up the input path to its deepest existing ancestor, realpath's that ancestor (resolving any symlinks in the existing portion), then verifies the resolved base matches one of the candidate roots. Four cases:

  1. lib/foo.rb (exists) → existing = full path, base matches a root → returns the realpath’d file.

  2. lib/new/dir/foo.rb (intermediates missing) → walks up to the deepest existing parent inside a root → returns the intended new path (caller mkdir_ps the parent before writing).

  3. lib/../../etc/passwd (.. escape) → cleanpath collapses .. syntactically, the walk lands outside every root → Error.

  4. link/foo.rb where link → /etc (symlink escape) → walks to link (which exists), realpath resolves through the symlink to /etc, outside every root → Error.

Pure lexical normalization (cleanpath + prefix check) catches cases 1–3 but misses case 4. The walk-up realpath pass closes that gap.
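The four cases can be reproduced with a standalone sketch of the walk-up approach. This is not the library's #resolve: contained_resolve, ContainmentError, and the demo directories are hypothetical, the symlink points at a second temp dir instead of /etc, and dangling symlinks are not handled.

```ruby
require 'fileutils'
require 'pathname'
require 'tmpdir'

class ContainmentError < StandardError; end

# Sketch of walk-up containment. root and every entry of roots must already be
# realpath'd Pathnames; relative inputs are resolved against root.
def contained_resolve(input, root, roots = [root])
  given = Pathname.new(input)
  wanted = (given.absolute? ? given : root + given).cleanpath

  # Walk up to the deepest ancestor that exists on disk.
  existing = wanted
  existing = existing.parent until existing.exist? || existing.root?

  # Resolve symlinks in the existing portion only.
  base = existing.realpath
  inside = roots.any? { |r| base == r || base.to_s.start_with?("#{r}/") }
  raise ContainmentError, "#{input} is outside every root" unless inside

  # Re-attach the not-yet-existing tail, if any.
  wanted == existing ? base : base + wanted.relative_path_from(existing)
end

DEMO_ROOT = Pathname.new(Dir.mktmpdir('containment-')).realpath
DEMO_OUTSIDE = Pathname.new(Dir.mktmpdir('outside-')).realpath
at_exit do
  [DEMO_ROOT, DEMO_OUTSIDE].each { |d| FileUtils.remove_entry(d.to_s) if d.exist? }
end

FileUtils.mkdir_p((DEMO_ROOT + 'lib').to_s)
(DEMO_ROOT + 'lib/foo.rb').write("# demo\n")
File.symlink(DEMO_OUTSIDE.to_s, (DEMO_ROOT + 'link').to_s) # stands in for link → /etc

def containment_outcome(input)
  contained_resolve(input, DEMO_ROOT).to_s
rescue ContainmentError
  :error
end

CASES = ['lib/foo.rb', 'lib/new/dir/foo.rb', 'lib/../../etc/passwd', 'link/foo.rb']
        .to_h { |p| [p, containment_outcome(p)] }

# A purely lexical check (cleanpath + prefix) wrongly accepts the symlink case.
LEXICAL_ACCEPTS_LINK = (DEMO_ROOT + 'link/foo.rb').cleanpath.to_s.start_with?("#{DEMO_ROOT}/")
```

In this sketch the first two inputs resolve to paths under the demo root, the last two raise, and LEXICAL_ACCEPTS_LINK is true, which is the gap the realpath pass closes.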

Project-root denylist

Setting project_root to a system root or a user’s home directory is almost always a misconfiguration: it makes the entire system or home tree writable. The constructor rejects DENIED_PROJECT_ROOTS (system tops: /, /etc, /var, …) and any directory whose parent is /home (catches /home/$USER and /home/$OTHER_USER). This is a sanity guard against fat-fingering, not a security perimeter; the real security is the readable/writable lists. Other-OS home roots (/Users/$USER on macOS) are not denied; Linux-first per CLAUDE.md.
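The two rules (exact match against the denylist, plus the parent-is-/home check) can be sketched as a single predicate. denied_project_root? and DENIED_SKETCH are hypothetical names; the list mirrors DENIED_PROJECT_ROOTS as documented on this page, and the real validate_project_root! is not shown here.

```ruby
require 'pathname'

# Copy of the documented denylist, under a sketch-only name.
DENIED_SKETCH = %w[
  / /etc /var /proc /sys /dev /boot /root
  /usr /opt /lib /lib64 /bin /sbin /tmp
].map { |p| Pathname.new(p) }.freeze

# Hypothetical predicate: true when root is an exact denylist entry or a
# direct child of /home. Deeper paths are not affected (exact match, not prefix).
def denied_project_root?(root)
  root = Pathname.new(root)
  DENIED_SKETCH.include?(root) || root.parent == Pathname.new('/home')
end

DENY_RESULTS = %w[/etc /home/alice /home/alice/project /usr/local].to_h do |p|
  [p, denied_project_root?(p)]
end
```

Here /etc and /home/alice are refused, while /home/alice/project and /usr/local pass, which matches the exact-match behaviour the constant's description calls out.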

Direct Known Subclasses

AllowAll

Defined Under Namespace

Classes: AllowAll, Error

Constant Summary

CACHE_BASE =

Parent directory under which every workspace mints its umbrella (#internal_temp). Honors XDG_CACHE_HOME when set, else ~/.cache; the pikuri subdir is owned by us. mkdir_p'd lazily on first umbrella access.

File.join(ENV['XDG_CACHE_HOME'] || File.join(Dir.home, '.cache'), 'pikuri')
INTERNAL_TEMP_STALE_SECONDS =

Umbrella dirs older than this are reaped by sweep_stale_internal_temps! at gem load. Generous enough that a long-lived pikuri session in another shell isn’t disturbed; tight enough that a process killed last week doesn’t leak forever.

7 * 24 * 60 * 60
DENIED_PROJECT_ROOTS =

System-root project_roots the constructor refuses. Exact-match (not prefix) — /home/user/project passes, /home/user is rejected by the parent-is-/home check below. Frozen list; downstream hosts with unusual layouts can subclass if they really need a different policy.

%w[
  / /etc /var /proc /sys /dev /boot /root
  /usr /opt /lib /lib64 /bin /sbin /tmp
].map { |p| Pathname.new(p) }.freeze

Constructor Details

#initialize(project_root:, readable: [], writable: [], temp: false, alias_tmp_to_temp: false, env: nil) ⇒ Filesystem

Returns a new instance of Filesystem.

Parameters:

  • project_root (String, Pathname)

    absolute (or working-directory-relative) path to the project root. realpath'd once; must exist; must not match DENIED_PROJECT_ROOTS or be a direct child of /home.

  • readable (Array<String, Pathname>) (defaults to: [])

    additional read-only roots. realpath'd at construction; missing entries raise loudly via Pathname#realpath.

  • writable (Array<String, Pathname>) (defaults to: [])

    additional read+write roots. Same treatment as readable.

  • temp (Boolean) (defaults to: false)

    when true, adds <internal_temp>/playground to #writable and exposes it via #temp. Forces the umbrella to mint up-front (the playground is created eagerly so #writable reflects it).

  • alias_tmp_to_temp (Boolean) (defaults to: false)

    when true AND temp: is set, /tmp/* paths supplied to #resolve_for_read / #resolve_for_write are rewritten to point at #temp. Pairs with the bubblewrap sandbox’s --bind <temp> /tmp.

  • env (Hash{String=>String}, nil) (defaults to: nil)

    subprocess environment exposed via #env. nil (default) → lazy-derive the host git identity from #project_root on first access; explicit hash → use verbatim (and skip the git lookup entirely). See the class header §“Subprocess environment” for the rationale.

Raises:

  • (Errno::ENOENT)

    if project_root or any readable/writable entry does not exist.

  • (Error)

    if project_root is denied (system root or /home/*).



# File 'lib/pikuri/workspace/filesystem.rb', line 248

def initialize(project_root:, readable: [], writable: [], temp: false, alias_tmp_to_temp: false, env: nil)
  @project_root = Pathname.new(project_root).realpath
  validate_project_root!(@project_root)

  @internal_temp = nil
  @temp = temp ? mint_playground : nil
  @alias_tmp_to_temp = alias_tmp_to_temp && !@temp.nil?
  @env_override = env

  @writable = ([@project_root] + writable.map { |p| Pathname.new(p).realpath } + [@temp].compact).uniq
  @readable = (@writable + readable.map { |p| Pathname.new(p).realpath }).uniq
end

Instance Attribute Details

#alias_tmp_to_temp ⇒ Boolean (readonly)

Returns whether #resolve_for_read / #resolve_for_write rewrite /tmp/* inputs to #temp.

Returns:

  • (Boolean)



# File 'lib/pikuri/workspace/filesystem.rb', line 263

def alias_tmp_to_temp
  @alias_tmp_to_temp
end

#project_root ⇒ Pathname (readonly)

Returns project root, post-realpath. The writable containment ceiling, the base for relative-path resolution, and the chdir target for Bash/Grep/Glob — always in #readable and #writable.

Returns:

  • (Pathname)

    project root, post-realpath. The writable containment ceiling, the base for relative-path resolution, and the chdir target for Bash/Grep/Glob — always in #readable and #writable.



# File 'lib/pikuri/workspace/filesystem.rb', line 204

def project_root
  @project_root
end

#readable ⇒ Array<Pathname> (readonly)

Returns read-only roots (in addition to writable ones, which are also readable). Post-realpath, deduped.

Returns:

  • (Array<Pathname>)

    read-only roots (in addition to writable ones, which are also readable). Post-realpath, deduped.



# File 'lib/pikuri/workspace/filesystem.rb', line 208

def readable
  @readable
end

#temp ⇒ Pathname? (readonly)

Returns the LLM-visible scratch playground (writable, owned by this workspace) when constructed with temp: true, else nil. Lives at <internal_temp>/playground; removed transitively when the umbrella is wiped on process exit.

Returns:

  • (Pathname, nil)

    the LLM-visible scratch playground (writable, owned by this workspace) when constructed with temp: true, else nil. Lives at <internal_temp>/playground; removed transitively when the umbrella is wiped on process exit.



# File 'lib/pikuri/workspace/filesystem.rb', line 219

def temp
  @temp
end

#writable ⇒ Array<Pathname> (readonly)

Returns writable roots (read+write). Includes #project_root and, if temp: true, #temp. Post-realpath, deduped.

Returns:

  • (Array<Pathname>)

    writable roots (read+write). Includes #project_root and, if temp: true, #temp. Post-realpath, deduped.



# File 'lib/pikuri/workspace/filesystem.rb', line 213

def writable
  @writable
end

Class Method Details

.mint_internal_temp ⇒ Object

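Mints a fresh umbrella dir under CACHE_BASE and registers an at_exit handler that removes it. The handler checks path.exist? before calling FileUtils.remove_entry, guarding against the dir being already gone (test cleanup, manual rm).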



# File 'lib/pikuri/workspace/filesystem.rb', line 297

def self.mint_internal_temp
  FileUtils.mkdir_p(CACHE_BASE)
  path = Pathname.new(Dir.mktmpdir('workspace-', CACHE_BASE)).realpath
  at_exit { FileUtils.remove_entry(path.to_s) if path.exist? }
  path
end

.sweep_stale_internal_temps! ⇒ void

This method returns an undefined value.

Reap workspace-* umbrella dirs that have outlived INTERNAL_TEMP_STALE_SECONDS. Called once at gem load via Pikuri::Workspace so each process boot inherits a tidy CACHE_BASE. Failures (permission denied, racing concurrent sweeper) are swallowed — best-effort cleanup, the real at_exit path is the load-bearing one.



# File 'lib/pikuri/workspace/filesystem.rb', line 312

def self.sweep_stale_internal_temps!
  return unless File.directory?(CACHE_BASE)

  cutoff = Time.now - INTERNAL_TEMP_STALE_SECONDS
  Dir.children(CACHE_BASE).each do |entry|
    next unless entry.start_with?('workspace-')
    path = File.join(CACHE_BASE, entry)
    next unless File.directory?(path)
    next if File.mtime(path) > cutoff

    FileUtils.remove_entry(path)
  rescue StandardError
    # best-effort sweep; never block the host on dead state
  end
end

Instance Method Details

#env ⇒ Hash{String=>String}

Environment variables for subprocesses spawned in this workspace. Lazy, memoized, frozen.

When the constructor received env: nil (the default), the first call here runs git -C project_root config user.name + user.email once and returns GIT_AUTHOR_* / GIT_COMMITTER_* accordingly. When the constructor received an explicit hash, this returns that hash (frozen) and never shells out. Returns {} if git resolution finds no identity for #project_root or git isn’t on PATH.

Returns:

  • (Hash{String=>String})


# File 'lib/pikuri/workspace/filesystem.rb', line 277

def env
  @env ||= (@env_override || compute_git_identity_env).freeze
end

#internal_temp ⇒ Pathname

Per-workspace ephemeral umbrella. Minted lazily on first call under CACHE_BASE. Registered for at_exit removal the moment it’s minted, so anything subsequently placed inside (the playground, Code::Bash::Sandbox::Bubblewrap's overlay state) gets wiped together. Callers that want ephemeral state owned by the workspace should put it under this dir rather than minting their own siblings.

Returns:

  • (Pathname)


# File 'lib/pikuri/workspace/filesystem.rb', line 290

def internal_temp
  @internal_temp ||= Filesystem.mint_internal_temp
end

#resolve_for_read(path) ⇒ Pathname

Resolve a user-supplied path against the read-set (readable ∪ writable). Returned Pathname is absolute and may not exist on disk; the caller validates existence separately.

Parameters:

  • path (String)

Returns:

  • (Pathname)

Raises:

  • (Error)

    if the resolved path falls outside every root



# File 'lib/pikuri/workspace/filesystem.rb', line 335

def resolve_for_read(path)
  resolve(path, @readable)
end

#resolve_for_write(path) ⇒ Pathname

Resolve a user-supplied path against the write-set.

Parameters:

  • path (String)

Returns:

  • (Pathname)

Raises:

  • (Error)

    if the resolved path falls outside every writable root



# File 'lib/pikuri/workspace/filesystem.rb', line 344

def resolve_for_write(path)
  resolve(path, @writable)
end