Class: Pikuri::Code::Bash::Sandbox::Bubblewrap
Inherits: Object
- Object
- Pikuri::Code::Bash::Sandbox::Bubblewrap
Defined in: lib/pikuri/code/bash/sandbox.rb
Overview
Bubblewrap (bwrap(1)) sandbox: composes a bwrap argv from the supplied Workspace plus a curated OS-runtime baseline, so the bash subprocess sees only the project + toolchain + ephemeral temp + the few /etc files needed for TLS, DNS, timezone, and name resolution.
What’s bound, and why
- SYSTEM_ROOTS: /lib, /lib64, /bin, /sbin (often symlinks to /usr on modern distros). Not in Workspace#readable (the LLM has no business grepping /sbin/), but the subprocess needs them executable for the dynamic linker + standard utilities. /usr and /opt are not listed here because they already come in via Workspace#readable (added by Pikuri::Code::ToolchainPaths.readable).
- ETC_BASELINE: /etc/ssl, /etc/ca-certificates, /etc/pki, /etc/resolv.conf, /etc/nsswitch.conf, /etc/localtime, /etc/hosts. Allowlist (not the whole /etc!) of the files bash subprocesses commonly need: TLS handshake, DNS, timezone, hostname resolution. Nothing sensitive (no shadow, no ssh_config, no NetworkManager state).
- /tmp: when Workspace::Filesystem#temp is set, bound to the workspace temp dir (so the LLM’s reflexive /tmp writes land in a persistent dir that survives between bash calls). When no workspace temp is wired in, falls back to --tmpfs /tmp (per-call ephemeral). The host’s /tmp is never exposed. /proc (synthetic, sees only the sandbox’s own processes due to --unshare-pid) and /dev (synthetic, null/zero/random/tty only) round out the synthetic mounts.
- workspace.readable → --ro-bind each path at the same path in the sandbox, EXCEPT paths that also appear in ephemeral_overlay: (see below).
- workspace.writable → --bind (read+write) each path. The workspace temp’s host path (under ~/.cache/pikuri, not under /tmp) is bound at its host path too, so the same dir is reachable via both /tmp (LLM reflex) and the host path (advertised by the system prompt, used consistently by the file tools off the host filesystem).
- ephemeral_overlay: per-user dependency caches the toolchain mutates (~/.gradle/caches, ~/.m2/repository, ~/.cargo/registry, …). Each path is mounted as a bubblewrap overlay: the host’s real dir is the lower (read-through), and a per-session upper + workdir under <workspace.internal_temp>/overlay-<slug>/ absorb writes. Result: gradle/maven/cargo see a fully read-write view of their cache, the host’s real cache is untouched, and on process exit the umbrella (and with it every upper layer) is removed by the workspace’s Finalizers registration. Within one pikuri-code session writes survive across bash calls (warm cache after the first build); across sessions they don’t (so a session that gets prompt-injected into poisoning the in-sandbox view of gradle’s cache cannot propagate the damage to the host’s normal gradle invocations or to a future pikuri-code session). Note: the overlay paths are deliberately narrow subdirs (e.g. ~/.gradle/caches, not ~/.gradle) so gradle.properties / init.d / .credentials never reach the sandbox at all. See ToolchainPaths for the credential / persistence exclusion rationale.
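The binding rules above can be sketched as a function that turns path lists into bwrap mount flags. This is a minimal illustration, not pikuri’s actual code: the helper name mount_args, the slug scheme, and the upper / work subdirectory names are assumptions, and the overlay flags (--overlay-src, --overlay) assume a bubblewrap build with overlay support.

```ruby
# Hypothetical helper (not part of pikuri): builds bwrap mount flags
# following the binding rules described above.
def mount_args(readable:, writable:, overlays:, overlay_root:, temp: nil)
  args = []
  overlays = overlays.map(&:to_s)

  # Read-only binds, skipping any path that gets an overlay instead.
  (readable.map(&:to_s) - overlays).each { |p| args.push('--ro-bind', p, p) }

  # Read-write binds at the same path inside the sandbox.
  writable.map(&:to_s).each { |p| args.push('--bind', p, p) }

  # /tmp: the workspace temp dir if there is one, otherwise a per-call tmpfs.
  if temp
    args.push('--bind', temp.to_s, '/tmp')
  else
    args.push('--tmpfs', '/tmp')
  end

  # Synthetic /proc and /dev.
  args.push('--proc', '/proc', '--dev', '/dev')

  # One overlay per cache dir: host dir as lower layer, per-session upper + workdir.
  overlays.each do |lower|
    slug = lower.gsub(/[^A-Za-z0-9]+/, '-').sub(/\A-/, '') # assumed slug scheme
    base = File.join(overlay_root.to_s, "overlay-#{slug}")
    args.push('--overlay-src', lower,
              '--overlay', File.join(base, 'upper'), File.join(base, 'work'), lower)
  end

  args
end
```

A path listed in both readable and overlays yields only the overlay flags, which mirrors the EXCEPT rule for workspace.readable.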
Concurrency contract
Each Bubblewrap instance must own its upper/workdir paths exclusively — overlayfs returns EBUSY when two live mounts share an upper or workdir. The bundled wiring guarantees this:
- One Workspace::Filesystem mints one umbrella (Workspace::Filesystem#internal_temp).
- One umbrella feeds one Bubblewrap, which derives its per-path overlay-<slug>/ subdirs from that umbrella.
- Pikuri::Code::Bash runs bash -c synchronously (Subprocess#wait), and sub-agents block their parent’s loop while running (the agent tool from pikuri-subagents runs its child’s loop synchronously in its execute closure), so two bwrap invocations spawned by the same pikuri process never overlap in time.
Two concurrent pikuri-code processes are independent — each mints its own umbrella, each gets its own overlay-<slug>/ tree, the host’s real cache (the shared lower layer) is read-only and per kernel docs may be shared across overlay mounts without restriction. A downstream host that builds something fan-out-y (e.g. N parallel shell tasks reusing one Bubblewrap) would collide on its own; pikuri itself doesn’t.
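For a downstream host that does want parallel shell tasks, one way to honor the exclusivity rule is to give each task its own umbrella. The sketch below only illustrates the path derivation; overlay_dirs, the upper / work names, and the gradle-caches slug are hypothetical, not pikuri API.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: derives the upper/workdir pair for one overlay path
# from a given umbrella directory.
def overlay_dirs(umbrella, slug)
  base = File.join(umbrella, "overlay-#{slug}")
  { upper: File.join(base, 'upper'), work: File.join(base, 'work') }
end

# One fresh umbrella per parallel task, so no two live overlay mounts
# can share an upper or workdir.
umbrellas = Array.new(3) { Dir.mktmpdir('umbrella-') }
dirs = umbrellas.map { |u| overlay_dirs(u, 'gradle-caches') }

# Every task ends up with a distinct upper dir.
puts dirs.map { |d| d[:upper] }.uniq.size == dirs.size

# Clean up the demo directories.
umbrellas.each { |u| FileUtils.remove_entry(u) }
```

The shared lower layer (the host’s real cache) needs no such separation, since it is only read.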
What the overlay does NOT defend
Bubblewrap as a whole is *blast-radius containment* for the bash subprocess, not a malware-resistant boundary. Prompt injection that reaches the LLM can still:
- Modify project source under project_root (the LLM legitimately needs Write access there; overlay isn’t an option without breaking the agent).
- Inject a malicious dependency in the project’s build.gradle.kts / pom.xml / package.json, which the next build will execute.
- Exfiltrate over the network: --share-net is intentional so git pull / mvn / gem install / curl work.
The overlay specifically prevents cross-project contamination via shared $HOME caches. Users who need adversarial isolation run pikuri-code inside a container / devcontainer; the container is the outer boundary, the bwrap sandbox is the inner one. See CLAUDE.md “Scope decisions” / “Workspace seam” and the matching note on Filesystem::AllowAll.
Isolation
--unshare-all --share-net: PID, mount, IPC, user, and UTS namespaces are unshared (the sandbox can’t see host processes, can’t mount on the host, can’t ptrace, …); the network namespace is kept shared because the agent’s bash routinely needs git pull, mvn, gem install, curl, etc. --die-with-parent --new-session: subprocess dies with pikuri, in its own session group (no terminal control bleed).
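As a rough illustration of how these flags end up in the final command line, the sketch below prepends them to a user command. ISOLATION_FLAGS and sandboxed_argv are hypothetical names; the real class assembles its argv internally through #wrap.

```ruby
# The four isolation flags named above, in the order the text lists them.
ISOLATION_FLAGS = %w[--unshare-all --share-net --die-with-parent --new-session].freeze

# Hypothetical helper: bwrap binary, isolation flags, mount flags, then the
# command to run inside the sandbox.
def sandboxed_argv(argv, mounts: [])
  ['bwrap', *ISOLATION_FLAGS, *mounts, *argv]
end

puts sandboxed_argv(%w[bash -c true], mounts: %w[--ro-bind /usr /usr]).join(' ')
```

The order of --unshare-all and --share-net follows the source; --share-net re-enables only the network namespace.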
Failures that surface at construction
The constructor probes the workspace shape, then bwrap with a no-op invocation. Four cases raise loudly:
- Workspace lists / as writable (typically Workspace::Filesystem::AllowAll). Bubblewrap exists for filesystem containment, which is structurally meaningless when the whole filesystem is the workspace. The host should pass NONE instead.
- Workspace has temp but alias_tmp_to_temp is off. This is an inconsistent setup: this sandbox would bind workspace.temp at /tmp inside the subprocess (so the LLM’s reflexive /tmp writes persist), but file tools running on the host would still reject /tmp/foo as outside the workspace. The LLM would write via bash and then fail to read via the file tools; fail at construction instead of letting that trap fire mid-conversation.
- bwrap not on PATH → Errno::ENOENT wrapped as RuntimeError.
- Kernel lacks user-namespace support (some hardened distros) → bwrap exits non-zero, surfaced as RuntimeError.
In every case the binary should fail at boot, not on the first bash tool call, which matches the “errors are loud” convention. The host opts out of sandboxing via --no-sandbox / --yolo.
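The last two cases can be sketched as a single probe. This is a guess at the shape of such a check, not the actual check_bwrap! implementation: the method name probe_bwrap!, the exact no-op arguments, and the error messages are all assumptions.

```ruby
require 'open3'

# Hypothetical probe: runs a no-op command under bwrap and turns both
# failure modes into a RuntimeError.
def probe_bwrap!(binary = 'bwrap')
  # Assumed no-op invocation: bind / read-only, unshare everything, run `true`.
  _out, err, status = Open3.capture3(binary, '--unshare-all', '--ro-bind', '/', '/', 'true')
  return if status.success?

  # Non-zero exit, e.g. when the kernel lacks user-namespace support.
  raise "#{binary} probe failed (exit #{status.exitstatus}): #{err.strip}"
rescue Errno::ENOENT
  # The binary could not be spawned at all.
  raise "#{binary} not found on PATH; install bubblewrap or disable the sandbox"
end
```

Calling this from a constructor means a missing or unusable bwrap is reported before any bash tool call is attempted.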
Constant Summary
- BWRAP_BINARY =
    'bwrap'
- SYSTEM_ROOTS =
System-root dirs the subprocess needs that aren’t in Workspace#readable. Each is --ro-bind’d if it exists on the host; missing entries are skipped silently (older or unusual layouts).
    %w[/lib /lib64 /bin /sbin].freeze
- ETC_BASELINE =
/etc file allowlist for the subprocess. Each is --ro-bind’d if it exists on the host. Nothing else from /etc is exposed: no shadow, no passwd beyond what /etc/hosts touches, no SSH config, no NetworkManager state.
    %w[
      /etc/ssl
      /etc/ca-certificates
      /etc/pki
      /etc/resolv.conf
      /etc/nsswitch.conf
      /etc/localtime
      /etc/hosts
    ].freeze
- DENIED_CONTAINER_SOCKETS =
Container / VM control sockets that, if reachable from inside the sandbox, give the bash subprocess a one-step path to root-equivalent host access. The Docker daemon cheerfully honors docker run --privileged -v / /host, so exposing /var/run/docker.sock to a sandboxed agent effectively undoes the sandbox. Same story for containerd, CRI-O, podman (rootful), buildkit, libvirt, LXD.
The pikuri default workspace doesn’t expose /var or /run at all (none of SYSTEM_ROOTS, ETC_BASELINE, or ToolchainPaths.readable touches them), so these sockets are unreachable by default. #reject_container_socket_exposure! guards the configuration surface: a downstream binary adding the docker socket to workspace.writable “so the agent can run docker build” would unknowingly hand the LLM the keys, and we’d rather fail loud at construction.
Rootless variants under $XDG_RUNTIME_DIR / /run/user/$UID/ are computed at class-load time. The list is not exhaustive; it covers the engines most likely to be installed on a Linux dev box. A downstream host with an unusual setup can subclass and extend.
    begin
      xdg_runtime = ENV['XDG_RUNTIME_DIR'] || "/run/user/#{Process.uid}"
      paths = %w[
        /var/run/docker.sock
        /run/docker.sock
        /var/run/containerd/containerd.sock
        /run/containerd/containerd.sock
        /var/run/crio/crio.sock
        /run/crio/crio.sock
        /run/podman/podman.sock
        /var/run/podman/podman.sock
        /run/buildkit/buildkitd.sock
        /var/run/buildkit/buildkitd.sock
        /var/run/libvirt/libvirt-sock
        /run/libvirt/libvirt-sock
        /var/lib/lxd/unix.socket
        /var/snap/lxd/common/lxd/unix.socket
      ]
      paths.concat([
        "#{xdg_runtime}/docker.sock",
        "#{xdg_runtime}/podman/podman.sock"
      ])
      paths.map { |p| Pathname.new(p) }.uniq.freeze
    end
Instance Method Summary
- #initialize(workspace:, ephemeral_overlay: []) ⇒ Bubblewrap (constructor)
  A new instance of Bubblewrap.
- #wrap(argv) ⇒ Array<String>
  bwrap + isolation flags + bind-mounts + argv, ready to hand to Subprocess.spawn.
Constructor Details
#initialize(workspace:, ephemeral_overlay: []) ⇒ Bubblewrap
Returns a new instance of Bubblewrap.
    # File 'lib/pikuri/code/bash/sandbox.rb', line 299
    def initialize(workspace:, ephemeral_overlay: [])
      @workspace = workspace
      # Resolve symlinks and drop duplicates so each overlay path is mounted once.
      @ephemeral_overlay = ephemeral_overlay.map { |p| Pathname.new(p).realpath }.uniq
      # Fail at construction, not on the first bash tool call.
      reject_unbounded_workspace!
      reject_unaliased_temp!
      reject_container_socket_exposure!
      check_bwrap!
    end
Instance Method Details
#wrap(argv) ⇒ Array<String>
Returns bwrap + isolation flags + bind-mounts + argv, ready to hand to Subprocess.spawn.
    # File 'lib/pikuri/code/bash/sandbox.rb', line 314
    def wrap(argv)
      [BWRAP_BINARY, *bwrap_args, *argv]
    end