Class: Pikuri::Code::Bash::Sandbox::Bubblewrap
Inherits: Object
- Object
- Pikuri::Code::Bash::Sandbox::Bubblewrap
Defined in: lib/pikuri/code/bash/sandbox.rb
Overview
Bubblewrap (bwrap(1)) sandbox: composes a bwrap argv from the supplied Workspace plus a curated OS-runtime baseline, so the bash subprocess sees only the project + toolchain + ephemeral temp + the few /etc files needed for TLS, DNS, timezone, and name resolution.
What’s bound, and why
- SYSTEM_ROOTS: /lib, /lib64, /bin, /sbin (often symlinks to /usr on modern distros). Not in Workspace#readable (the LLM has no business grepping /sbin/), but the subprocess needs them executable for the dynamic linker + standard utilities. /usr and /opt are not listed here because they already come in via Workspace#readable (added by Pikuri::Code::ToolchainPaths.readable).
- ETC_BASELINE: /etc/ssl, /etc/ca-certificates, /etc/pki, /etc/resolv.conf, /etc/nsswitch.conf, /etc/localtime, /etc/hosts. Allowlist (not the whole /etc!) of the files bash subprocesses commonly need: TLS handshake, DNS, timezone, hostname resolution. Nothing sensitive (no shadow, no ssh_config, no NetworkManager state).
- /tmp: when Workspace::Filesystem#temp is set, bound to the workspace temp dir (so the LLM’s reflexive /tmp writes land in a persistent dir that survives between bash calls). When no workspace temp is wired in, falls back to --tmpfs /tmp (per-call ephemeral). The host’s /tmp is never exposed. /proc (synthetic, sees only the sandbox’s own processes due to --unshare-pid) and /dev (synthetic, null/zero/random/tty only) round out the synthetic mounts.
- workspace.readable → --ro-bind each path at the same path in the sandbox, EXCEPT paths that also appear in ephemeral_overlay: (see below).
- workspace.writable → --bind (read+write) each path. The workspace temp’s host path (under ~/.cache/pikuri, not under /tmp) is bound at its host path too, so the same dir is reachable via both /tmp (LLM reflex) and the host path (advertised by the system prompt, used consistently by the file tools off the host filesystem).
- ephemeral_overlay: per-user dependency caches the toolchain mutates (~/.gradle/caches, ~/.m2/repository, ~/.cargo/registry, …). Each path is mounted as a bubblewrap overlay: the host’s real dir is the lower (read-through), and a per-session upper + workdir under <workspace.internal_temp>/overlay-<slug>/ absorb writes. Result: gradle/maven/cargo see a fully read-write view of their cache, the host’s real cache is untouched, and on process exit the umbrella (and with it every upper layer) is removed by the workspace’s single at_exit. Within one pikuri-code session writes survive across bash calls (warm cache after the first build); across sessions they don’t (so a session that gets prompt-injected into poisoning the in-sandbox view of gradle’s cache cannot propagate the damage to the host’s normal gradle invocations or to a future pikuri-code session). Note: the overlay paths are deliberately narrow subdirs (e.g. ~/.gradle/caches, not ~/.gradle) so gradle.properties / init.d / .credentials never reach the sandbox at all. See ToolchainPaths for the credential / persistence exclusion rationale.
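The bind rules above can be sketched as a small argv builder. This is a minimal illustration, not pikuri’s actual code: the helper name bind_args, its keyword parameters, and the ordering of the mounts are all assumptions; only the bwrap flags (--ro-bind, --bind, --tmpfs, --proc, --dev) are real.

```ruby
# Hypothetical helper (not part of pikuri): builds the bind-mount portion of
# a bwrap argv from plain path lists.
def bind_args(readable:, writable:, temp: nil, overlay: [])
  args = []

  # Read-only binds at the same path, skipping paths mounted as overlays.
  (readable - overlay).each { |path| args.push('--ro-bind', path, path) }

  # Read+write binds at the same path inside the sandbox.
  writable.each { |path| args.push('--bind', path, path) }

  # /tmp is the workspace temp dir when one is set, else a per-call tmpfs.
  if temp
    args.push('--bind', temp, '/tmp')
  else
    args.push('--tmpfs', '/tmp')
  end

  # Synthetic /proc and /dev round out the mounts.
  args.push('--proc', '/proc', '--dev', '/dev')
end
```

Calling bind_args(readable: ['/usr'], writable: ['/home/u/proj']) would yield the /usr read-only bind, the project read-write bind, a tmpfs at /tmp, and the two synthetic mounts, in that order.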
Concurrency contract
Each Bubblewrap instance must own its upper/workdir paths exclusively — overlayfs returns EBUSY when two live mounts share an upper or workdir. The bundled wiring guarantees this:
- One Workspace::Filesystem mints one umbrella (Workspace::Filesystem#internal_temp).
- One umbrella feeds one Bubblewrap, which derives its per-path overlay-<slug>/ subdirs from that umbrella.
- Pikuri::Code::Bash runs bash -c synchronously (Subprocess#wait), and sub-agents block their parent’s loop while running (the agent tool from pikuri-subagents runs its child’s loop synchronously in its execute closure), so two bwrap invocations spawned by the same pikuri process never overlap in time.
Two concurrent pikuri-code processes are independent — each mints its own umbrella, each gets its own overlay-<slug>/ tree, the host’s real cache (the shared lower layer) is read-only and per kernel docs may be shared across overlay mounts without restriction. A downstream host that builds something fan-out-y (e.g. N parallel shell tasks reusing one Bubblewrap) would collide on its own; pikuri itself doesn’t.
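To make the exclusivity requirement concrete, here is a rough sketch of deriving per-path upper/workdir pairs from one umbrella. The slug function, the upper/work subdirectory names, and the helper names are hypothetical (the real derivation is not shown in this documentation); --overlay-src and --overlay are the overlay options of recent bubblewrap releases, which this sketch assumes are available.

```ruby
require 'digest'
require 'fileutils'
require 'tmpdir'

# Hypothetical slug: any stable, filesystem-safe name per lower path would do.
def overlay_slug(path)
  Digest::SHA256.hexdigest(path)[0, 12]
end

# Hypothetical helper: creates the upper + workdir for one lower path under
# the given umbrella and returns the matching bwrap overlay arguments.
def overlay_args(umbrella, lower)
  base  = File.join(umbrella, "overlay-#{overlay_slug(lower)}")
  upper = File.join(base, 'upper')
  work  = File.join(base, 'work')
  FileUtils.mkdir_p([upper, work])

  # Lower layer is the host's real dir; writes land in upper; the overlay is
  # mounted at the same path the toolchain expects.
  ['--overlay-src', lower, '--overlay', upper, work, lower]
end
```

Because the upper and workdir live under the umbrella, two processes with different umbrellas never share them, while the lower path is the same in both.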
What the overlay does NOT defend
Bubblewrap as a whole is *blast-radius containment* for the bash subprocess, not a malware-resistant boundary. Prompt injection that reaches the LLM can still:
- Modify project source under project_root (the LLM legitimately needs Write access there; overlay isn’t an option without breaking the agent).
- Inject a malicious dependency in the project’s build.gradle.kts / pom.xml / package.json, which the next build will execute.
- Exfiltrate over the network: --share-net is intentional so git pull / mvn / gem install / curl work.
The overlay specifically prevents cross-project contamination via shared $HOME caches. Users who need adversarial isolation run pikuri-code inside a container / devcontainer; the container is the outer boundary, the bwrap sandbox is the inner one. See CLAUDE.md “Scope decisions” / “Workspace seam” and the matching note on Filesystem::AllowAll.
Isolation
--unshare-all --share-net: PID, mount, IPC, user, and UTS namespaces are unshared (the sandbox can’t see host processes, can’t mount on the host, can’t ptrace, …); the network namespace is kept shared because the agent’s bash routinely needs git pull, mvn, gem install, curl, etc. --die-with-parent --new-session: subprocess dies with pikuri, in its own session group (no terminal control bleed).
Failures that surface at construction
The constructor probes the workspace shape, then bwrap with a no-op invocation. Four cases raise loudly:
- Workspace lists / as writable (typically Workspace::Filesystem::AllowAll): Bubblewrap exists for filesystem containment, which is structurally meaningless when the whole filesystem is the workspace. The host should pass NONE instead.
- Workspace has temp but alias_tmp_to_temp is off: inconsistent setup. This sandbox would bind workspace.temp at /tmp inside the subprocess (so the LLM’s reflexive /tmp writes persist), but file tools running on the host would still reject /tmp/foo as outside the workspace. The LLM would write via bash and then fail to read via the file tools; fail at construction instead of letting that trap fire mid-conversation.
- bwrap not on PATH → Errno::ENOENT wrapped as RuntimeError.
- Kernel lacks user-namespace support (some hardened distros) → bwrap exits non-zero, surfaced as RuntimeError.
In every case the binary should fail at boot, not on the first bash tool call, which matches the “errors are loud” convention. The host opts out of sandboxing via --no-sandbox / --yolo.
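The last two failure cases can be illustrated with a small probe. This is a sketch under assumptions, not the real check_bwrap!: the method name, the exact no-op invocation, and the error messages are hypothetical, and the binary is a parameter only so the sketch can be exercised on a machine without bwrap.

```ruby
require 'open3'

# Hypothetical probe (the real check_bwrap! is not shown in this page): runs
# a no-op command inside a minimal sandbox and raises RuntimeError on failure.
def check_sandbox_binary!(binary = 'bwrap')
  probe = [binary, '--unshare-all', '--ro-bind', '/', '/', 'true']

  # Passing the command as separate arguments avoids the shell, so a missing
  # binary surfaces as Errno::ENOENT instead of a shell error message.
  output, status = Open3.capture2e(*probe)
  return if status.success?

  # E.g. a kernel without user-namespace support makes bwrap exit non-zero.
  raise "#{binary} probe failed (exit #{status.exitstatus}): #{output.strip}"
rescue Errno::ENOENT
  raise "#{binary} not found on PATH"
end
```

Running the probe once in the constructor means a misconfigured host is reported before any conversation starts, rather than on the first tool call.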
Constant Summary
- BWRAP_BINARY =
  'bwrap'
- SYSTEM_ROOTS =
  System-root dirs the subprocess needs that aren’t in Workspace#readable. Each is --ro-bind’d if it exists on the host; missing entries are skipped silently (older or unusual layouts).
  %w[/lib /lib64 /bin /sbin].freeze
- ETC_BASELINE =
  /etc file allowlist for the subprocess. Each is --ro-bind’d if it exists on the host. Nothing else from /etc is exposed: no shadow, no passwd beyond what /etc/hosts touches, no SSH config, no NetworkManager state.
  %w[
    /etc/ssl
    /etc/ca-certificates
    /etc/pki
    /etc/resolv.conf
    /etc/nsswitch.conf
    /etc/localtime
    /etc/hosts
  ].freeze
- DENIED_CONTAINER_SOCKETS =
  Container / VM control sockets that, if reachable from inside the sandbox, give the bash subprocess a one-step path to root-equivalent host access. The Docker daemon cheerfully honors docker run --privileged -v /:/host, so exposing /var/run/docker.sock to a sandboxed agent effectively undoes the sandbox. Same story for containerd, CRI-O, podman (rootful), buildkit, libvirt, LXD.
  The pikuri default workspace doesn’t expose /var or /run at all (none of SYSTEM_ROOTS, ETC_BASELINE, or ToolchainPaths.readable touches them), so these sockets are unreachable by default. #reject_container_socket_exposure! guards the configuration surface: a downstream binary adding the docker socket to workspace.writable “so the agent can run docker build” would unknowingly hand the LLM the keys, and we’d rather fail loud at construction.
  Rootless variants under $XDG_RUNTIME_DIR / /run/user/$UID/ are computed at class-load time. The list is not exhaustive; it covers the engines most likely to be installed on a Linux dev box. A downstream host with an unusual setup can subclass and extend.
  begin
    xdg_runtime = ENV['XDG_RUNTIME_DIR'] || "/run/user/#{Process.uid}"
    paths = %w[
      /var/run/docker.sock
      /run/docker.sock
      /var/run/containerd/containerd.sock
      /run/containerd/containerd.sock
      /var/run/crio/crio.sock
      /run/crio/crio.sock
      /run/podman/podman.sock
      /var/run/podman/podman.sock
      /run/buildkit/buildkitd.sock
      /var/run/buildkit/buildkitd.sock
      /var/run/libvirt/libvirt-sock
      /run/libvirt/libvirt-sock
      /var/lib/lxd/unix.socket
      /var/snap/lxd/common/lxd/unix.socket
    ]
    paths.concat([
      "#{xdg_runtime}/docker.sock",
      "#{xdg_runtime}/podman/podman.sock"
    ])
    paths.map { |p| Pathname.new(p) }.uniq.freeze
  end
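One plausible shape for the guard that consumes this list is sketched below. It is an assumption-laden illustration: the body of #reject_container_socket_exposure! is not shown on this page, so the helper names, the shortened DENIED list, the ancestor-matching rule, and the choice of ArgumentError are all hypothetical.

```ruby
require 'pathname'

# Shortened, hypothetical stand-in for DENIED_CONTAINER_SOCKETS.
DENIED = %w[/var/run/docker.sock /run/docker.sock].map { |p| Pathname.new(p) }.freeze

# Returns the denied sockets reachable through any of the bound paths.
def exposed_sockets(bound_paths, denied = DENIED)
  bound = bound_paths.map { |p| Pathname.new(p).cleanpath }
  denied.select do |socket|
    # A socket is exposed if it, or any ancestor directory, is bound.
    socket.ascend.any? { |ancestor| bound.include?(ancestor) }
  end
end

# Raises when the workspace would expose a control socket (error class is
# an assumption; the real guard may raise something else).
def reject_socket_exposure!(bound_paths)
  hits = exposed_sockets(bound_paths)
  return if hits.empty?

  raise ArgumentError, "workspace exposes container sockets: #{hits.join(', ')}"
end
```

The comparison here is purely lexical. A real guard would probably also need to account for symlinks such as /var/run pointing at /run, which this sketch does not resolve.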
Instance Method Summary
- #initialize(workspace:, ephemeral_overlay: []) ⇒ Bubblewrap (constructor)
  A new instance of Bubblewrap.
- #wrap(argv) ⇒ Array<String>
  bwrap + isolation flags + bind-mounts + argv, ready to hand to Subprocess.spawn.
Constructor Details
#initialize(workspace:, ephemeral_overlay: []) ⇒ Bubblewrap
Returns a new instance of Bubblewrap.
  # File 'lib/pikuri/code/bash/sandbox.rb', line 299
  def initialize(workspace:, ephemeral_overlay: [])
    @workspace = workspace
    # Resolve symlinks and de-duplicate the overlay paths.
    @ephemeral_overlay = ephemeral_overlay.map { |p| Pathname.new(p).realpath }.uniq
    # Fail at construction rather than on the first bash tool call.
    reject_unbounded_workspace!
    reject_unaliased_temp!
    reject_container_socket_exposure!
    check_bwrap!
  end
Instance Method Details
#wrap(argv) ⇒ Array<String>
Returns bwrap + isolation flags + bind-mounts + argv, ready to hand to Subprocess.spawn.
  # File 'lib/pikuri/code/bash/sandbox.rb', line 314
  def wrap(argv)
    [BWRAP_BINARY, *bwrap_args, *argv]
  end