Class: Pikuri::Code::GitClone
- Inherits: Tool (Object → Tool → Pikuri::Code::GitClone)
- Defined in: lib/pikuri/code/git_clone.rb
Overview
The git_clone tool — shallow-clone a public git repository into the workspace. Instantiating Code::GitClone.new(workspace: ws) produces a tool whose Tool#to_ruby_llm_tool wiring is identical to any bundled tool’s; execute closes over the workspace and a lazily-minted Bash::Sandbox::Bubblewrap.
Why this exists
The bundled researcher persona can web_search / web_scrape / fetch, which is great for “look up one fact” but inefficient when the task is “dig through opencode’s source for how it does X.” The pattern *N pages of HTML scraping* is much worse than *one shallow clone + grep*. This tool plus GIT_REPO_RESEARCHER (the persona that wires it together with workspace-scoped read/grep/glob) is the answer.
Threat model
Git clone is not “just reading files.” Hostile upstream has a history of RCEs:
- CVE-2024-32002: submodule + symlink + case-insensitive FS escape → RCE.
- CVE-2022-39253: `--local` clone reading arbitrary host files via symlinks.
- CVE-2017-1000117: `ssh://` URL arg injection (`ssh://-oProxyCommand=…`) → arbitrary command execution.
- `.gitattributes` filter drivers, `.git/config` `core.fsmonitor` / `core.sshCommand`: code paths that run during clone / checkout.
Mitigations baked in here:
- **HTTPS/HTTP only.** VALID_SCHEMES is %w[https http]; `ssh://`, `git://`, `file://`, `ext::`, and anything else are refused at the tool layer before `git` sees the string.
- **No submodule recursion.** `--no-recurse-submodules` kills the CVE-2024-32002 class.
- **Shallow clone.** `--depth 1` skips history (fewer ref parsing edge cases, faster, smaller).
- **Bubblewrap-sandboxed subprocess.** The `git` binary runs inside Bash::Sandbox::Bubblewrap bound to the persona’s fresh temp workspace: no host ~/.ssh, no ~/.gitconfig, no other projects’ source, no container sockets. A clone-RCE blast radius is the persona’s throwaway workspace.
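The first three mitigations can be sketched in a few lines of plain Ruby. This is a minimal illustration, not the tool's actual implementation: `VALID_SCHEMES_SKETCH`, `check_clone_url`, and `clone_argv` are hypothetical names, the error wording is invented, and the `--` separator is an extra precaution assumed here rather than something this page documents.

```ruby
require 'uri'

# Hypothetical stand-in for the class constant described on this page.
VALID_SCHEMES_SKETCH = %w[https http].freeze

# Returns the parsed URI on success, or an "Error: ..." string
# (the pikuri convention) when the URL is refused.
def check_clone_url(url)
  uri = URI.parse(url)
  unless VALID_SCHEMES_SKETCH.include?(uri.scheme&.downcase)
    return "Error: refused URL scheme #{uri.scheme.inspect}"
  end
  return 'Error: URL has no host' if uri.host.nil? || uri.host.empty?

  uri
rescue URI::InvalidURIError => e
  "Error: malformed URI (#{e.message})"
end

# Builds the clone command as an argv array. An array (rather than a shell
# string) avoids shell interpolation, and `--` (an assumption of this
# sketch) stops git from parsing the URL as an option.
def clone_argv(url, target_dir)
  ['git', 'clone', '--depth', '1', '--no-recurse-submodules', '--', url, target_dir]
end
```

The scheme check runs before any subprocess is spawned, so a string such as `ssh://-oProxyCommand=…` never reaches `git` at all.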
The Bubblewrap instance is minted lazily on first execute, not at construction — the boot-time GitClone wired by bin/pikuri-code never runs (it lives in the sub-agent-only pool), and gets replaced by a fresh-workspace clone via #with_workspace the moment a git_repo_researcher session starts. Eager construction would pay the ~bwrap probe cost on every coding-agent boot for no reason.
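The lazy-minting pattern described above can be sketched with a memoized private reader. `FakeSandbox` and `LazyCloneTool` are hypothetical stand-ins written for this illustration; they do not reproduce the real Bash::Sandbox::Bubblewrap or GitClone APIs, and the probe counter only models the construction cost.

```ruby
# Hypothetical stand-in for an expensive sandbox; NOT the real
# Bash::Sandbox::Bubblewrap API.
class FakeSandbox
  @probes = 0
  class << self
    attr_accessor :probes
  end

  attr_reader :workspace

  def initialize(workspace)
    self.class.probes += 1 # models the bwrap probe cost
    @workspace = workspace
  end
end

# Hypothetical tool that defers sandbox construction to first use.
class LazyCloneTool
  def initialize(workspace:, sandbox: nil)
    @workspace = workspace
    @sandbox = sandbox # usually nil, so nothing is probed at construction
  end

  # Fresh instance for a new workspace. The sandbox is deliberately not
  # carried over, so it is re-minted against the new workspace.
  def with_workspace(workspace)
    self.class.new(workspace: workspace)
  end

  def execute
    "sandboxed in #{sandbox.workspace}"
  end

  private

  # Memoized: built on the first call, reused afterwards.
  def sandbox
    @sandbox ||= FakeSandbox.new(@workspace)
  end
end

boot = LazyCloneTool.new(workspace: '/tmp/boot')  # never executed
session = boot.with_workspace('/tmp/session-1')
session.execute
session.execute
puts FakeSandbox.probes
```

In this sketch the boot-time instance costs nothing, and the session instance pays the construction cost once no matter how many times it executes.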
Output
On success: a one-line ack with the relative path inside the workspace. The persona then uses read / grep / glob to explore the clone.
On failure: “Error: …” in the usual pikuri convention. Possible causes: refused URL scheme, malformed URI, network failure, target dir already exists, git non-zero exit.
Constant Summary
- VALID_SCHEMES =
  URL schemes accepted. `https` first (TLS) and `http` as a fallback for the rare public mirror. All other schemes are refused; see the threat-model header.
    %w[https http].freeze
- TIMEOUT_SECONDS =
  Hard cap on the subprocess timeout (seconds). Real-world shallow clones of medium repos finish in seconds; this is the ceiling for a slow network or a large repo, after which we SIGTERM.
    120
- DESCRIPTION =
    <<~DESC
      Shallow-clone a public git repository into your workspace.

      Usage:
      - URL must be `https://` (preferred) or `http://`. Any other scheme (`ssh://`, `git://`, `file://`) is refused.
      - Always cloned with `--depth 1 --no-recurse-submodules`; you get the current tip, no history, no submodules.
      - Target directory name is derived from the URL's last segment (without `.git`). If that directory already exists, the call fails — pick a different URL or work with what you cloned.
      - On success returns the relative path to the cloned repo; use `read`, `grep`, `glob` to navigate it.
      - Clones run inside a sandbox bound to your workspace — host files, SSH keys, and `~/.gitconfig` are NOT visible to the cloned repo's hooks/filters.
    DESC
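The target-directory rule in DESCRIPTION (last URL segment, minus `.git`, failing if the directory already exists) might look like the sketch below. `target_dir_name` and `resolve_target` are hypothetical helpers, and the error messages are invented for illustration; the real tool's handling of edge cases is not shown on this page.

```ruby
require 'uri'

# Hypothetical helper: derive the clone directory from the URL's last path
# segment, dropping a trailing ".git". Returns nil when no usable name exists.
def target_dir_name(url)
  segment = URI.parse(url).path.to_s.split('/').reject(&:empty?).last
  return nil if segment.nil?

  name = segment.delete_suffix('.git')
  return nil if name.empty? || name == '.' || name == '..'

  name
end

# Hypothetical helper: returns the relative directory name, or an
# "Error: ..." string when the name is unusable or already taken.
def resolve_target(workspace, url)
  name = target_dir_name(url)
  return 'Error: cannot derive a directory name from the URL' if name.nil?
  return "Error: #{name} already exists in the workspace" if File.exist?(File.join(workspace, name))

  name
end
```

Both example URLs from the `url` parameter description, with and without the `.git` suffix, map to the same directory name, `opencode`, under this rule.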
Instance Method Summary
- #initialize(workspace:, sandbox: nil) ⇒ GitClone constructor
- #with_workspace(workspace) ⇒ GitClone
  Produce a new GitClone bound to workspace.
Constructor Details
#initialize(workspace:, sandbox: nil) ⇒ GitClone
    # File 'lib/pikuri/code/git_clone.rb', line 104
    def initialize(workspace:, sandbox: nil)
      @workspace = workspace
      @sandbox = sandbox
      super(
        name: 'git_clone',
        description: DESCRIPTION,
        parameters: Pikuri::Tool::Parameters.build { |p|
          p.required_string :url,
                            'HTTPS (or HTTP) git URL to clone, e.g. ' \
                            '"https://github.com/anomalyco/opencode" or ' \
                            '"https://github.com/anomalyco/opencode.git". ' \
                            'Other schemes are refused.'
        },
        execute: ->(url:) { execute_clone(url: url) }
      )
    end
Instance Method Details
#with_workspace(workspace) ⇒ GitClone
Produce a new Pikuri::Code::GitClone bound to workspace. The sandbox is NOT carried over — the new instance lazily mints a fresh Bubblewrap from the new workspace, since a sandbox’s bind set depends on the workspace it constrains. See class header.
    # File 'lib/pikuri/code/git_clone.rb', line 128
    def with_workspace(workspace)
      self.class.new(workspace: workspace)
    end