Class: Pikuri::Tool::Glob

Inherits:
Pikuri::Tool
Defined in:
lib/pikuri/tool/glob.rb

Overview

The glob tool lists files matching a glob pattern via rg --files, sorted by modification time (newest first). Instantiating Tool::Glob.new(workspace: ws) produces a tool whose #to_ruby_llm_tool wiring is identical to any bundled tool’s. It has the same shape as Grep: the workspace is captured by the execute closure, and there is no confirmer because the tool is read-only.

Why a separate tool from Grep

The unique capability is the mtime-descending sort: “what’s been touched recently” is a common navigation move, and Grep can’t express it. The rest (filtering by name, defaulting to a listing of all matching files) is theoretically reachable through Grep with pattern=".", but Glob avoids that hack and keeps Read / Grep / Glob as three clean roles: read one file, search content, list files by name.

ripgrep dependency

Hard dependency: Glob.check_binaries! runs in initialize and raises if rg isn’t on PATH. Each tool owns its own probe so construction order doesn’t matter — Glob doesn’t lean on Grep’s check.
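
A construction-time probe of this kind could look roughly like the sketch below. It is not the actual body of check_binaries!, which is not shown on this page; binary_on_path? and the path_env keyword are hypothetical names introduced for illustration.

```ruby
# Sketch only: a PATH probe in the spirit of Glob.check_binaries!.
# binary_on_path? and the path_env keyword are hypothetical helpers,
# not confirmed parts of the Pikuri source.
def binary_on_path?(name, path_env: ENV.fetch('PATH', ''))
  path_env.split(File::PATH_SEPARATOR).reject(&:empty?).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

# Raises RuntimeError (the default for a bare `raise`) when any binary is missing.
def check_binaries!(names = ['rg'], path_env: ENV.fetch('PATH', ''))
  missing = names.reject { |n| binary_on_path?(n, path_env: path_env) }
  raise "missing required binaries: #{missing.join(', ')}" unless missing.empty?
end
```

Calling check_binaries! from each tool's initialize keeps the failure at construction time, whichever tool happens to be built first.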

Argv & filter pipeline

rg --files --color=never --hidden --glob '!.git/*' \
   -- <relative-path-or-dot>
# …then filter the result list in Ruby with File.fnmatch?

Why not pass the user pattern as --glob to rg? Because rg’s --glob documentation says *“This always overrides any other ignore logic”*, so --glob '**/*.rb' would re-include .gitignore’d Ruby files and break our gitignore-respect promise. Instead, we let rg produce the full gitignore-respecting file list and filter it down to the user’s pattern in Ruby with File.fnmatch?(pattern, p, FNM_PATHNAME | FNM_EXTGLOB | FNM_DOTMATCH). The three flags together cover the common rg glob cases: ** recursion (FNM_PATHNAME), {a,b} alternation (FNM_EXTGLOB), and dotfile inclusion (FNM_DOTMATCH, matching rg’s --hidden behavior). The .git/ exclusion stays on the rg side so its contents never even reach the Ruby filter.

  • --hidden → search dotfiles (still respects .gitignore).

  • No --sort flag: we re-sort by mtime in Ruby on the way out.

  • Output paths come back as ./... when the search path is .; the leading ./ is stripped post-rg so the model sees clean workspace-relative paths.
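
The pipeline above can be sketched as follows. The argv mirrors the invocation documented in this section; clean_paths and filter_paths are illustrative names that do not appear in the source, and the sample listing stands in for real rg output.

```ruby
# Sketch only: clean_paths and filter_paths are hypothetical helper names,
# and build_argv's body is inferred from the documented argv.
FNMATCH_FLAGS = File::FNM_PATHNAME | File::FNM_EXTGLOB | File::FNM_DOTMATCH

# The user pattern is deliberately absent from the argv, so rg's
# .gitignore handling stays in force.
def build_argv(path:)
  ['rg', '--files', '--color=never', '--hidden', '--glob', '!.git/*', '--', path]
end

# Split rg's stdout into paths and strip the leading "./".
def clean_paths(rg_output)
  rg_output.each_line(chomp: true).reject(&:empty?).map { |p| p.delete_prefix('./') }
end

# Apply the user's glob in Ruby, after rg has done the ignore filtering.
def filter_paths(paths, pattern)
  paths.select { |p| File.fnmatch?(pattern, p, FNMATCH_FLAGS) }
end

# Stand-in for what `rg --files ... -- .` might print.
sample = "./lib/pikuri/tool/glob.rb\n./spec/glob_spec.rb\n./README.md\n./.rubocop.yml\n"
paths = clean_paths(sample)
puts filter_paths(paths, '**/*.rb')    # recursion via FNM_PATHNAME
puts filter_paths(paths, '*.{md,yml}') # alternation plus dotfile matching
```

Because FNM_PATHNAME is set, a bare * does not cross a / boundary, so '*.{md,yml}' only matches files directly under the search root.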

Sort

mtime-descending in Ruby after rg returns, with path-ascending as a tiebreaker for files with equal mtimes (the common case in fresh checkouts). Cost: one stat per result. Broad patterns can make this expensive, but in practice rg’s .gitignore filter keeps result sets bounded; if real friction shows up later we can cap pre-sort.
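
A minimal sketch of that ordering, assuming workspace-relative paths and a root directory to stat against. sort_newest_first is a hypothetical name, and the real implementation may differ in how it handles files that vanish between listing and stat.

```ruby
# Sketch only: sort_newest_first is an illustrative helper, not the
# method used in the Pikuri source.
def sort_newest_first(paths, root:)
  paths
    .map { |rel| [rel, File.mtime(File.join(root, rel))] } # one stat per result
    .sort { |(path_a, mtime_a), (path_b, mtime_b)|
      # Newest first; equal mtimes fall back to ascending path order.
      [mtime_b, path_a] <=> [mtime_a, path_b]
    }
    .map(&:first)
end
```

Comparing Time objects directly, rather than converting them to floats, avoids losing sub-second precision when deciding whether two mtimes are equal.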

Truncation

Total output is head-truncated to MAX_BYTES after the mtime sort, so the rows that survive are the newest. This matches Grep’s budget and head bias.
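
One way to head-truncate an already-sorted listing is sketched below. It assumes truncation happens on line boundaries and that each row costs its byte size plus one newline; neither detail is confirmed by this page, and head_truncate is a hypothetical name.

```ruby
# Sketch only: head_truncate and its line-boundary behavior are assumptions.
MAX_BYTES = 50 * 1024

# Keeps rows from the front (the newest, after the mtime sort) until the
# byte budget is spent. Returns the kept rows and the number dropped.
def head_truncate(lines, max_bytes: MAX_BYTES)
  kept = []
  used = 0
  lines.each do |line|
    cost = line.bytesize + 1 # +1 for the joining newline
    break if used + cost > max_bytes

    kept << line
    used += cost
  end
  [kept, lines.length - kept.length]
end
```

The dropped count is what a caller would need in order to decide whether to append a truncation marker.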

Exit codes

  • 0 → at least one file; format with footer.

  • 1 → no files; return “No files match pattern ‘…’”.

  • 2 → rg error (bad path, bad glob); return “Error: ripgrep: …”.

Refusals

All returned as “Error: …” observations:

  • Empty pattern → fast reject.

  • path is a regular file → fast reject pointing at the read tool.

  • path not found → “Error: path not found: <path>”.

  • path outside the workspace → caught from Workspace::Error.

Constant Summary

MAX_BYTES =

Hard byte cap on combined rg output. Same value as Pikuri::Tool::Grep::MAX_BYTES so the two file-touching tools share a budget shape. Re-declared here rather than referenced cross-file because Zeitwerk’s eager-load order isn’t guaranteed between siblings.

Returns:

  • (Integer)

50 * 1024
MAX_BYTES_LABEL =

Human-readable form of MAX_BYTES for the truncation marker.

Returns:

  • (String)

"#{MAX_BYTES / 1024} KB"
DESCRIPTION =

Description shown to the LLM. opencode-shape (summary + Usage: bullets). Per-parameter constraints live in parameter descriptions.

Returns:

  • (String)
<<~DESC
  List files matching a glob pattern, sorted by modification time (newest first).

  Usage:
  - `.gitignore` is respected; for unfiltered listing use bash `rg --no-ignore --files -g <pattern>`.
  - Glob syntax: `**` matches any number of directories, `*` matches any filename chars (not `/`), `{a,b}` is alternation.
  - Default search root is the workspace root; pass `path` to narrow to a subdirectory.
  - Use `glob` to find files by name; use `grep` to find files by content.
  - Output is sorted by mtime descending — recently-touched files come first, so broad patterns still surface relevant files near the top.
  - Output is truncated to #{MAX_BYTES_LABEL}; refine the pattern or narrow `path` if the response ends in a truncation marker.
DESC
FNMATCH_FLAGS =

Flags for File.fnmatch?: FNM_PATHNAME for ** recursion and path-aware / matching, FNM_EXTGLOB for {a,b} alternation, FNM_DOTMATCH to match dotfiles (rg does this when --hidden is set).

Returns:

  • (Integer)

File::FNM_PATHNAME | File::FNM_EXTGLOB | File::FNM_DOTMATCH

Constants inherited from Pikuri::Tool

CALCULATOR, FETCH, WEB_SCRAPE, WEB_SEARCH

Instance Attribute Summary

Attributes inherited from Pikuri::Tool

#description, #execute, #name, #parameters

Class Method Summary

Instance Method Summary

Methods inherited from Pikuri::Tool

#run, #to_ruby_llm_tool

Constructor Details

#initialize(workspace:) ⇒ Glob

Parameters:

  • workspace (Tool::Workspace)

    captured for path resolution and as chdir for rg. All path arguments route through workspace.resolve_for_read.

Raises:

  • (RuntimeError)

    if rg isn’t on PATH; fail-loud at construction rather than the first tool call.



# File 'lib/pikuri/tool/glob.rb', line 121

def initialize(workspace:)
  Glob.send(:check_binaries!)
  super(
    name: 'glob',
    description: DESCRIPTION,
    parameters: Parameters.build { |p|
      p.required_string :pattern,
                        'Glob pattern (** matches any number of ' \
                        'directories; {a,b} alternation), e.g. ' \
                        '"**/*.rb" or "lib/**/*_spec.rb".'
      p.optional_string :path,
                        'Directory to search in. Relative paths resolve ' \
                        'against the workspace root. Defaults to the ' \
                        'workspace root, e.g. "lib/" or "spec/".'
    },
    execute: lambda { |pattern:, path: nil|
      Glob.search(workspace: workspace, pattern: pattern, path: path)
    }
  )
end

Class Method Details

.search(workspace:, pattern:, path:) ⇒ String

Validate inputs, resolve the path against the workspace, spawn rg, mtime-sort, head-truncate, render. Returns either the formatted listing, a “no files match” message, or “Error: …”.

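Parameters:

  • workspace (Tool::Workspace)

    used for path resolution and as the working directory for rg.

  • pattern (String)

    glob pattern to match; an empty string is rejected.

  • path (String, nil)

    optional directory to search, resolved against the workspace; nil searches the workspace root.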

Returns:

  • (String)


# File 'lib/pikuri/tool/glob.rb', line 151

def self.search(workspace:, pattern:, path:)
  return 'Error: empty pattern.' if pattern.empty?

  search_target = '.'
  if path
    resolved = workspace.resolve_for_read(path)
    return "Error: path not found: #{path}" unless resolved.exist?
    if resolved.file?
      return "Error: #{path} is a file, not a directory; use the read tool to view it."
    end

    rel = resolved.relative_path_from(workspace.cwd).to_s
    search_target = rel
  end

  argv = build_argv(path: search_target)
  result = Pikuri::Subprocess.spawn(*argv, chdir: workspace.cwd.to_s).wait
  exit_code = result.status.exitstatus

  case exit_code
  when 0
    format_output(result.output, workspace: workspace,
                  pattern: pattern, path: path)
  when 1
    no_match_message(pattern: pattern, path: path)
  else
    stderr = result.output.strip
    stderr = "exited #{exit_code}" if stderr.empty?
    "Error: ripgrep: #{stderr}"
  end
rescue Tool::Workspace::Error => e
  "Error: #{e.message}"
end