Class: SkillBench::Tools::RunCommand
- Inherits:
-
Object
- Object
- SkillBench::Tools::RunCommand
- Defined in:
- lib/skill_bench/tools/run_command.rb
Overview
Handles executing a shell command within the working directory.
Constant Summary collapse
- DANGEROUS_COMMANDS =
Commands that are always blocked even if listed in allowed_commands, because they can be used to escape the sandbox or execute arbitrary code.
%w[ bash sh zsh fish dash ksh csh tcsh python python3 python2 ruby perl node php lua tcl wish curl wget nc ncat socat eval exec sudo su doas chmod chown mount umount dd mkfs fdisk parted insmod rmmod modprobe systemctl service passwd useradd userdel groupadd groupdel ].freeze
Class Method Summary collapse
-
.call(command, working_dir_path, container_id = nil) ⇒ String
Executes a shell command within the working directory (host or container).
-
.definition ⇒ Hash
The tool definition for the LLM API.
Class Method Details
.call(command, working_dir_path, container_id = nil) ⇒ String
Executes a shell command within the working directory (host or container).
Tokenizes the command string before execution so that arguments are passed directly to the OS without shell interpretation, preventing shell injection.
57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 |
# File 'lib/skill_bench/tools/run_command.rb', line 57 def self.call(command, working_dir_path, container_id = nil) argv = command.shellsplit return 'Error: Empty command.' if argv.empty? base_cmd = argv.first return "Error: Command '#{base_cmd}' is blocked for security reasons." if DANGEROUS_COMMANDS.include?(base_cmd) allowed = SkillBench::Config.allowed_commands return 'Error: No allowed commands configured. Set allowed_commands in skill-bench.json or use --mode mock.' if allowed.nil? return "Error: Command '#{base_cmd}' is not permitted." unless allowed.include?(base_cmd) max_time = SkillBench::Config.max_execution_time Timeout.timeout(max_time) do stdout_str, stderr_str, status = if container_id docker_cmd = ['docker', 'exec', '-w', '/sandbox', container_id] + argv Open3.capture3(*docker_cmd) else Open3.capture3(*argv, chdir: working_dir_path.to_s) end <<~RESULT Exit Status: #{status.exitstatus} STDOUT: #{stdout_str} STDERR: #{stderr_str} RESULT end rescue Timeout::Error "Error: Command execution timed out after #{max_time} seconds." end |
.definition ⇒ Hash
Returns The tool definition for the LLM API.
29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 |
# File 'lib/skill_bench/tools/run_command.rb', line 29 def self.definition { type: 'function', function: { name: 'run_command', description: 'Execute a shell command (e.g., rspec).', parameters: { type: 'object', properties: { command: { type: 'string', description: 'The shell command to run.' } }, required: ['command'], additionalProperties: false } } } end |