Class: SkillBench::Agent::Runner

Inherits:

Object
SkillBench::Agent::Runner

Defined in: lib/skill_bench/agent/runner.rb

Overview

Executes a single scenario (baseline or context-hydrated) inside an isolated sandbox. Generates the system prompt and runs the agent.

Class Method Summary

.call(params) ⇒ Array<String, String>

Executes the agent run scenario.

Instance Method Summary

#call ⇒ Array<String, String>

Runs the evaluation scenario and captures the results.
#initialize(params) ⇒ Runner (constructor)

A new instance of Runner.

Constructor Details

#initialize(params) ⇒ `Runner`

Returns a new instance of Runner.

Parameters:

params (Hash) —

The configuration parameters for the run.

# File 'lib/skill_bench/agent/runner.rb', line 27

def initialize(params)
  @mode = validate_mode(params.fetch(:mode))
  @full_eval_path = params.fetch(:full_eval_path)
  @task_content = params.fetch(:task_content)
  @client_params = params.fetch(:client_params, {})

  @source_path = params[:source_path]
  @base_path = params[:base_path]
end
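
The constructor reads its keys in three different ways, and the difference decides which parameters are mandatory. The following is a minimal standalone sketch of that behavior using plain Ruby `Hash` methods; the hash values are made up for illustration and are not real SkillBench configuration.

```ruby
# Illustrative values only; not taken from a real SkillBench run.
params = { mode: :baseline, task_content: "Fix the failing spec" }

# Hash#fetch without a default raises KeyError when the key is absent,
# which is how :mode, :full_eval_path and :task_content become mandatory.
missing_key_error =
  begin
    params.fetch(:full_eval_path)
    nil
  rescue KeyError => e
    e
  end

# Hash#fetch with a default returns that default instead of raising,
# so :client_params falls back to an empty hash.
client_params = params.fetch(:client_params, {})

# Hash#[] returns nil for an absent key, so :source_path and :base_path
# are simply left unset when they are not supplied.
source_path = params[:source_path]

puts missing_key_error.class
puts client_params.inspect
puts source_path.inspect
```

Note that the constructor shown above does not itself check that `:source_path` and `:base_path` are present; whether `validate_mode` or a later step enforces that for `:context` mode is not visible on this page.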

Class Method Details

.call(params) ⇒ `Array<String, String>`

Executes the agent run scenario.

Parameters:

params (Hash) —

The configuration parameters for the run.

Options Hash (params):

:mode (Symbol) —

The mode to run in (`:baseline` or `:context`).
:full_eval_path (Pathname) —

The path to the evaluation directory.
:task_content (String) —

The task description.
:client_params (Hash) —

Parameters for the LLM client.
:source_path (String) —

Required if mode is `:context`.
:base_path (Pathname) —

Required if mode is `:context`.

Returns:

(Array<String, String>) —

The agent’s final answer and the git diff.



# File 'lib/skill_bench/agent/runner.rb', line 22

def self.call(params)
  new(params).call
end
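
As a rough illustration of how a caller might use this entry point, the sketch below mirrors only the documented interface: a class-level `.call` that delegates to `new(params).call` and returns a two-element array. `StandInRunner` and its return values are hypothetical stand-ins, since the body of the real `#call` is not shown here.

```ruby
# Hypothetical stand-in for SkillBench::Agent::Runner. It copies the
# documented shape (class-level .call delegating to an instance) but
# does not run an agent or a sandbox.
class StandInRunner
  def self.call(params)
    new(params).call
  end

  def initialize(params)
    @mode = params.fetch(:mode)
    @task_content = params.fetch(:task_content)
  end

  # Returns a two-element array, like the documented [answer, diff] pair.
  # The strings are placeholders, not real agent output.
  def call
    ["answer for #{@task_content} (#{@mode})", ""]
  end
end

params = { mode: :baseline, task_content: "Fix the failing spec" }

# Destructure the pair into the final answer and the git diff.
answer, diff = StandInRunner.call(params)

puts answer
puts diff.empty? ? "(no diff)" : diff
```

Keeping `.call` as a one-line delegate means callers never hold a `Runner` instance, so each run starts from a freshly validated set of parameters.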

Instance Method Details

#call ⇒ `Array<String, String>`

Runs the evaluation scenario and captures the results.