Class: Pikuri::Tool::Grep
- Inherits: Pikuri::Tool (Object → Pikuri::Tool → Pikuri::Tool::Grep)
- Defined in: lib/pikuri/tool/grep.rb
Overview
The grep tool — content search across the workspace via ripgrep. Instantiating Tool::Grep.new(workspace: ws) produces a tool whose #to_ruby_llm_tool wiring is identical to any bundled tool’s. Same shape as Read (workspace captured by the execute closure, no confirmer — search is read-only).
ripgrep dependency
Hard dependency: Grep.check_binaries! runs in initialize and raises if rg isn’t on PATH, mirroring Bash’s posture for bash/timeout. We don’t ship a Ruby fallback: replicating rg’s Rust-regex dialect, glob handling, and .gitignore parsing would be an open-ended effort with little payoff. The failure message includes an install hint.
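The PATH check described above could look roughly like the following. This is a minimal sketch, not the body of Grep.check_binaries! (which this page does not show): the helper name find_on_path, the binaries argument, and the wording of the error message are all assumptions made for illustration.

```ruby
# Hypothetical helper: return the full path of `binary` if an executable file
# with that name exists in one of the PATH directories, otherwise nil.
def find_on_path(binary, path_env: ENV.fetch('PATH', ''))
  path_env.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?

    candidate = File.join(dir, binary)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Hypothetical stand-in for Grep.check_binaries!: raise with a hint when a
# required binary is missing. The message text is illustrative only.
def check_binaries!(binaries = %w[rg])
  missing = binaries.reject { |b| find_on_path(b) }
  return if missing.empty?

  raise "Missing required binaries: #{missing.join(', ')}. " \
        'Install ripgrep and make sure it is on PATH.'
end
```

Failing at construction time means a missing rg surfaces once, when the tool is built, rather than as an error observation on every search.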
Argv
rg --line-number --color=never --no-heading --with-filename \
--hidden --max-columns=2000 --max-columns-preview \
--sort=path \
[-i] [--glob <g>] [--files-with-matches|--count-matches] \
-- <pattern> <relative-path-or-dot>
- --no-heading + --with-filename → flat path:line:content rows regardless of whether the search target is a directory or a single file (rg defaults to suppressing the filename for single-file searches; we force it on for output consistency).
- --hidden → search dotfiles (still respects .gitignore).
- --max-columns=2000 --max-columns-preview → rg itself truncates lines longer than MAX_LINE_LENGTH bytes and appends a preview marker, sparing us per-line truncation.
- --sort=path → deterministic output (single-threaded; fine for typical repos under ~10k files). Makes specs assertable and gives the model a stable order to scan.
Subprocess runs with chdir: workspace.cwd and is always given an explicit path argument. Subprocess.spawn uses popen2e, which gives the child a piped (non-tty) stdin; rg’s default heuristic when no path argument is given and stdin is piped is to search stdin (which we then close, yielding zero matches). Passing the path argument explicitly bypasses the heuristic. Output paths come back as ./... when the path is .; the leading ./ is stripped after rg returns so the model sees clean workspace-relative paths.
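The argv layout above can be assembled along these lines. This is a sketch under stated assumptions: build_argv is a private helper whose body is not shown on this page, so only its keyword names (taken from the call in .search) are grounded; the max_columns keyword and the strip_dot_slash helper are hypothetical names introduced for illustration.

```ruby
# Sketch of an argv builder that follows the flag list above.
# `max_columns` is a hypothetical keyword standing in for MAX_LINE_LENGTH.
def build_argv(pattern:, path:, glob: nil, case_insensitive: false,
               output_mode: 'content', max_columns: 2000)
  argv = %W[
    rg --line-number --color=never --no-heading --with-filename
    --hidden --max-columns=#{max_columns} --max-columns-preview
    --sort=path
  ]
  argv << '-i' if case_insensitive
  argv.push('--glob', glob) if glob
  case output_mode
  when 'files_with_matches' then argv << '--files-with-matches'
  when 'count'              then argv << '--count-matches'
  end
  # `--` ends option parsing, so a pattern starting with `-` is not read as a flag.
  argv.push('--', pattern, path)
end

# Hypothetical post-processing step: drop the leading "./" that rg prints on
# each row when the search target is ".".
def strip_dot_slash(output)
  output.each_line.map { |line| line.delete_prefix('./') }.join
end
```

Building the command as an array rather than a shell string keeps the pattern and glob out of shell interpretation entirely, which matters when the arguments come from a model.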
Output modes
- content (default): path:line:content rows.
- files_with_matches: just file paths, one per line.
- count: path:count per file.
Use files_with_matches to scope a broad search cheaply before paying tokens for content.
Truncation
Total output is head-truncated to MAX_BYTES (head-only — grep tails usually carry less signal than the first matches; opposite bias from Bash). Cut at the last line boundary, with a marker reporting omitted bytes and the original total so the model knows how much it missed.
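A head-only cut at a line boundary can be sketched as follows. The function name head_truncate and the marker wording are hypothetical; the page only states that the marker reports the omitted bytes and the original total.

```ruby
# Keep at most `max_bytes` from the start of `text`, cutting at the last
# complete line, and append a marker describing what was dropped.
def head_truncate(text, max_bytes: 50 * 1024)
  return text if text.bytesize <= max_bytes

  # Work on a binary copy so character indexes and byte offsets coincide.
  head = text.b.byteslice(0, max_bytes)
  cut = head.rindex("\n")
  # If no newline fits in the budget, keep nothing rather than a partial line.
  kept = cut ? head.byteslice(0, cut + 1) : ''.dup
  kept = kept.force_encoding(text.encoding)

  omitted = text.bytesize - kept.bytesize
  "#{kept}[output truncated: #{omitted} of #{text.bytesize} bytes omitted]\n"
end

puts head_truncate("aaaa\nbbbb\ncccc\n", max_bytes: 12)
# prints:
# aaaa
# bbbb
# [output truncated: 5 of 15 bytes omitted]
```

Cutting at the newline rather than at the raw byte limit avoids handing the model a half row that looks like a real match.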
Exit codes
- 0 → matches; format with footer.
- 1 → no matches; return “No matches for pattern ‘…’”.
- 2 → rg error (bad regex, missing path); return “Error: ripgrep: …”.
Refusals
All returned as “Error: …” observations:
- Empty pattern → fast reject.
- Unknown output_mode → enum error listing valid values.
- Path outside the workspace → caught from Workspace::Error.
- Nonexistent path → “Error: path not found: <path>”.
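The four refusals above can be expressed as one pre-flight check that returns an error string or nil. This is a sketch only: resolve_inside is a hypothetical stand-in for Workspace#resolve_for_read (whose implementation is not shown here), it raises ArgumentError where the real tool rescues Workspace::Error, and it does not resolve symlinks. The first, second, and fourth messages are copied from .search; the "path escapes workspace" wording is an assumption.

```ruby
require 'pathname'

# Hypothetical containment check: expand `path` against `root` and reject
# anything that lands outside it. Symlinks are not resolved in this sketch.
def resolve_inside(root, path)
  root = Pathname.new(root).expand_path
  resolved = Pathname.new(path).expand_path(root)
  inside = resolved == root || resolved.to_s.start_with?("#{root}#{File::SEPARATOR}")
  raise ArgumentError, "path escapes workspace: #{path}" unless inside

  resolved
end

# Return an "Error: ..." string when the request should be refused, else nil.
def refusal_for(root:, pattern:, output_mode: 'content', path: nil)
  modes = %w[content files_with_matches count]
  return 'Error: empty pattern.' if pattern.empty?
  unless modes.include?(output_mode)
    return "Error: output_mode must be one of #{modes.join(', ')}, " \
           "got #{output_mode.inspect}."
  end
  return nil if path.nil?

  resolved = resolve_inside(root, path)
  return "Error: path not found: #{path}" unless resolved.exist?

  nil
rescue ArgumentError => e
  "Error: #{e.message}"
end
```

Returning refusals as ordinary strings rather than raising keeps every failure in the same observation channel the model already reads.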
Constant Summary
- MAX_BYTES = 50 * 1024
Hard byte cap on combined rg output. Same value as Read::MAX_BYTES so the two file-touching tools share a budget shape.
- MAX_BYTES_LABEL = "#{MAX_BYTES / 1024} KB"
Human-readable form of MAX_BYTES for the truncation marker.
- MAX_LINE_LENGTH = 2000
Per-line cap passed to rg’s --max-columns. Long lines are truncated by rg itself with a preview marker.
- OUTPUT_MODES = %w[content files_with_matches count].freeze
Valid output_mode values.
- DEFAULT_OUTPUT_MODE = 'content'
Default output_mode.
- DESCRIPTION =
Description shown to the LLM. opencode-shape (summary + Usage: bullets). Per-parameter constraints live in parameter descriptions.
<<~DESC
  Search file contents for a regex pattern across the workspace.
  Usage:
  - Wraps `ripgrep` — regex syntax is rg's Rust-regex dialect (mostly PCRE-compatible; no lookbehind).
  - Default search root is the workspace root; pass `path` to narrow to a file or subdirectory.
  - Respects `.gitignore` — for unfiltered search use bash `rg --no-ignore <pattern>`.
  - Use `glob` to filter by filename, e.g. `"*.rb"` or `"src/**/*.{ts,tsx}"`.
  - `output_mode` controls verbosity: `content` (default, file:line:text), `files_with_matches` (paths only), `count` (matches per file).
  - Use `files_with_matches` first to scope a broad search, then `content` (or `read`) to investigate — saves tokens.
  - Output is truncated to #{MAX_BYTES_LABEL}; refine the pattern or narrow `path` if the response ends in a truncation marker.
  - Long lines are truncated to #{MAX_LINE_LENGTH} chars with a preview marker; use `read` to see full lines.
DESC
Constants inherited from Pikuri::Tool
CALCULATOR, FETCH, WEB_SCRAPE, WEB_SEARCH
Instance Attribute Summary
Attributes inherited from Pikuri::Tool
#description, #execute, #name, #parameters
Class Method Summary
- .search(workspace:, pattern:, path:, glob:, case_insensitive:, output_mode:) ⇒ String
Validate inputs, resolve the path against the workspace, spawn rg, and render the observation.
Instance Method Summary
- #initialize(workspace:) ⇒ Grep (constructor)
Methods inherited from Pikuri::Tool
Constructor Details
#initialize(workspace:) ⇒ Grep
# File 'lib/pikuri/tool/grep.rb', line 124
def initialize(workspace:)
  Grep.send(:check_binaries!)
  super(
    name: 'grep',
    description: DESCRIPTION,
    parameters: Parameters.build { |p|
      p.required_string :pattern,
                        'Regex pattern to search for (rg Rust-regex ' \
                        'dialect), e.g. "def\s+\w+" or "TODO".'
      p.optional_string :path,
                        'File or directory to search. Relative paths ' \
                        'resolve against the workspace root. Defaults ' \
                        'to the workspace root, e.g. "lib/" or "lib/foo.rb".'
      p.optional_string :glob,
                        'Filename glob to filter files, e.g. "*.rb" ' \
                        'or "src/**/*.{ts,tsx}".'
      p.optional_boolean :case_insensitive,
                         'Match case-insensitively. Defaults to false, e.g. true.'
      p.optional_string :output_mode,
                        "One of #{OUTPUT_MODES.join(', ')}. Defaults to " \
                        "#{DEFAULT_OUTPUT_MODE}, e.g. \"files_with_matches\"."
    },
    execute: lambda { |pattern:, path: nil, glob: nil, case_insensitive: false, output_mode: DEFAULT_OUTPUT_MODE|
      Grep.search(workspace: workspace, pattern: pattern, path: path, glob: glob,
                  case_insensitive: case_insensitive, output_mode: output_mode)
    }
  )
end
Class Method Details
.search(workspace:, pattern:, path:, glob:, case_insensitive:, output_mode:) ⇒ String
Validate inputs, resolve the path against the workspace, spawn rg, and render the observation. Returns either the formatted results, a “no matches” string, or “Error: …”.
# File 'lib/pikuri/tool/grep.rb', line 166
def self.search(workspace:, pattern:, path:, glob:, case_insensitive:, output_mode:)
  return 'Error: empty pattern.' if pattern.empty?
  unless OUTPUT_MODES.include?(output_mode)
    return "Error: output_mode must be one of #{OUTPUT_MODES.join(', ')}, " \
           "got #{output_mode.inspect}."
  end

  # Default to the workspace root; narrow to a workspace-relative path if given.
  search_target = '.'
  if path
    resolved = workspace.resolve_for_read(path)
    return "Error: path not found: #{path}" unless resolved.exist?

    rel = resolved.relative_path_from(workspace.cwd).to_s
    search_target = rel
  end

  argv = build_argv(pattern: pattern, glob: glob, case_insensitive: case_insensitive,
                    output_mode: output_mode, path: search_target)
  result = Pikuri::Subprocess.spawn(*argv, chdir: workspace.cwd.to_s).wait
  exit_code = result.status.exitstatus
  case exit_code
  when 0
    format_output(result.output, output_mode: output_mode, pattern: pattern, path: path)
  when 1
    # Renders the "No matches for pattern '…'" observation.
    (pattern: pattern, path: path)
  else
    # popen2e merges stderr into the combined output stream.
    stderr = result.output.strip
    stderr = "exited #{exit_code}" if stderr.empty?
    "Error: ripgrep: #{stderr}"
  end
rescue Tool::Workspace::Error => e
  "Error: #{e.message}"
end