Class: SkillBench::Tools::RunCommand

Inherits:
Object
  • Object
show all
Defined in:
lib/skill_bench/tools/run_command.rb

Overview

Handles executing a shell command within the working directory.

Class Method Summary collapse

Class Method Details

.call(command, working_dir_path, container_id = nil) ⇒ String

Executes a shell command within the working directory (host or container).

Tokenizes the command string before execution so that arguments are passed directly to the OS without shell interpretation, preventing shell injection.

Parameters:

  • command (String)

    The command to run (e.g. “rspec spec/models”).

  • working_dir_path (Pathname)

    The host directory (ignored if container_id present).

  • container_id (String, nil) (defaults to: nil)

    The Docker container ID for isolated execution.

Returns:

  • (String)

    A formatted string containing the exit status, STDOUT, and STDERR.

Raises:

  • (Timeout::Error)

    Internally rescued; returns a timeout message string.



42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
# File 'lib/skill_bench/tools/run_command.rb', line 42

def self.call(command, working_dir_path, container_id = nil)
  argv = command.shellsplit
  return 'Error: Empty command.' if argv.empty?

  base_cmd = argv.first
  return "Error: Command '#{base_cmd}' is blocked for security reasons." if Constants::Tools::DANGEROUS_COMMANDS.include?(base_cmd)

  allowed = SkillBench::Config.allowed_commands
  return 'Error: No allowed commands configured. Set allowed_commands in skill-bench.json or use --mode mock.' if allowed.nil?
  return "Error: Command '#{base_cmd}' is not permitted." unless allowed.include?(base_cmd)

  max_time = SkillBench::Config.max_execution_time
  Timeout.timeout(max_time) do
    stdout_str, stderr_str, status = if container_id
                                       docker_cmd = ['docker', 'exec', '-w', '/sandbox', container_id] + argv
                                       Open3.capture3(*docker_cmd)
                                     else
                                       Open3.capture3(*argv, chdir: working_dir_path.to_s)
                                     end
    <<~RESULT
      Exit Status: #{status.exitstatus}
      STDOUT:
      #{stdout_str}
      STDERR:
      #{stderr_str}
    RESULT
  end
rescue Timeout::Error
  "Error: Command execution timed out after #{max_time} seconds."
end

.definitionHash

Returns The tool definition for the LLM API.

Returns:

  • (Hash)

    The tool definition for the LLM API.



14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
# File 'lib/skill_bench/tools/run_command.rb', line 14

def self.definition
  {
    type: 'function',
    function: {
      name: 'run_command',
      description: 'Execute a shell command (e.g., rspec).',
      parameters: {
        type: 'object',
        properties: {
          command: { type: 'string', description: 'The shell command to run.' }
        },
        required: ['command'],
        additionalProperties: false
      }
    }
  }
end