Class: SkillBench::Runner Deprecated

Inherits:
Object
  • Object
Defined in:
lib/skill_bench/runner.rb

Overview

Deprecated.

Orchestrates the entire evaluation process. Compares how an AI coding agent performs with and without contextual skills.

Constructor Details

#initialize(params) ⇒ Runner

Returns a new instance of Runner.

Parameters:

  • params (Hash)

    The configuration for the evaluation.



# File 'lib/skill_bench/runner.rb', line 27

def initialize(params)
  @eval_folder_path = params[:eval_folder_path]
  @skill_path = params[:skill_path]
  # Coerce to Pathname so a String :base_path also supports #join in #call.
  @base_path = Pathname.new(params[:base_path] || Dir.pwd)
  @client_params = params[:client_params] || {}
end

Class Method Details

.call(params) ⇒ Hash

Initiates a full evaluation run.

Parameters:

  • params (Hash)

    The configuration for the evaluation.

Options Hash (params):

  • :eval_folder_path (String)

    The path to the evaluation directory containing task and criteria.

  • :skill_path (String)

    Optional override for the source directory being tested.

  • :base_path (String, Pathname) (defaults to: Pathname.new(Dir.pwd))

    The base path for relative file resolution.

  • :client_params (Hash) (defaults to: {})

    Parameters to pass to the LLM client.

Returns:

  • (Hash)

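    A result hash with the keys :success (Boolean) and :response. On a completed run, :response holds :source_path and :tasks, an array with one result per task containing the judge scores and diffs. On failure, :response holds an :error hash with a :message.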

Raises:

  • (ArgumentError)

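    If the eval path does not match a supported source-path convention. Note that #call rescues StandardError, so when the run goes through .call this error is logged and returned as a failure hash rather than propagated to the caller.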



# File 'lib/skill_bench/runner.rb', line 22

def self.call(params)
  new(params).call
end
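
A caller would typically branch on :success before reading the payload. The sketch below assumes only the envelope shapes visible in the #call source (an :error hash with a :message on failure, a :tasks array otherwise). The summarize helper and the sample hashes are hypothetical and not part of SkillBench.

```ruby
# Hypothetical helper: turns a Runner-style result envelope into a one-line summary.
def summarize(result)
  response = result[:response] || {}
  if response.key?(:error)
    "failed: #{response.dig(:error, :message)}"
  else
    tasks = response.fetch(:tasks, [])
    passed = tasks.count { |t| t[:success] }
    status = result[:success] ? 'ok' : 'partial failure'
    "#{status}: #{passed}/#{tasks.size} tasks succeeded (source: #{response[:source_path]})"
  end
end

# Sample envelopes shaped like those built in Runner#call (all values are made up).
failure = { success: false, response: { error: { message: 'Evaluation path /tmp/x does not exist' } } }
batch = {
  success: false,
  response: {
    source_path: 'multiple (batch run)',
    tasks: [
      { success: true, response: {} },
      { success: false, response: { error: { message: 'boom' } } }
    ]
  }
}

puts summarize(failure)
puts summarize(batch)
```

Checking for :error first keeps the early-exit failures (missing path, no task.md) separate from batch runs in which only some tasks failed.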

.discover_task_dirs(root_path) ⇒ Array<Pathname>

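Finds all directories containing a task.md file, starting from root_path. If root_path itself contains task.md, only root_path is returned; otherwise the tree is searched recursively and the matching directories are returned deduplicated and sorted.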

Parameters:

  • root_path (Pathname)

    The root directory to search.

Returns:

  • (Array<Pathname>)

    A list of task directory paths.



# File 'lib/skill_bench/runner.rb', line 81

def self.discover_task_dirs(root_path)
  if File.exist?(root_path.join('task.md'))
    [root_path]
  else
    Dir.glob(root_path.join('**/task.md')).map { |f| Pathname.new(f).parent }.uniq.sort
  end
end
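
To see both branches of the lookup, the logic can be exercised against a throwaway directory tree. This is a minimal sketch: discover_task_dirs is copied here as a standalone method so the example runs without the gem, and the folder layout and the demo_discovery helper are hypothetical.

```ruby
require 'pathname'
require 'tmpdir'
require 'fileutils'

# Standalone copy of the lookup logic shown above, for experimentation only.
def discover_task_dirs(root_path)
  if File.exist?(root_path.join('task.md'))
    [root_path]
  else
    Dir.glob(root_path.join('**/task.md')).map { |f| Pathname.new(f).parent }.uniq.sort
  end
end

# Builds a temporary tree and returns the discovered directories (relative to
# the root) before and after a task.md is placed in the root itself.
def demo_discovery
  Dir.mktmpdir do |dir|
    root = Pathname.new(dir)
    # Hypothetical layout: two nested tasks and one unrelated folder.
    %w[evals/a evals/b/deep notes].each { |d| FileUtils.mkdir_p(root.join(d)) }
    FileUtils.touch(root.join('evals/a/task.md'))
    FileUtils.touch(root.join('evals/b/deep/task.md'))
    nested = discover_task_dirs(root).map { |p| p.relative_path_from(root).to_s }

    # Once the root holds a task.md, the nested tasks are no longer returned.
    FileUtils.touch(root.join('task.md'))
    single = discover_task_dirs(root).map { |p| p.relative_path_from(root).to_s }

    [nested, single]
  end
end

nested, single = demo_discovery
p nested
p single
```

The second result illustrates that a task.md at the root short-circuits the recursive search, so a single-task folder is never treated as a batch.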

Instance Method Details

#call ⇒ Hash

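Executes the baseline and context-hydrated evaluations for every task directory found under the evaluation path, then scores them. Tasks run in parallel on up to four threads, and the overall :success is true only if every task succeeded.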

Returns:

  • (Hash)

    The final evaluation result.



# File 'lib/skill_bench/runner.rb', line 37

def call
  full_path = @base_path.join(@eval_folder_path)

  return { success: false, response: { error: { message: "Evaluation path #{full_path} does not exist" } } } unless full_path.exist?

  task_dirs = self.class.discover_task_dirs(full_path)
  if task_dirs.empty?
    return { success: false,
             response: { error: { message: "No task.md found in #{full_path} or its subdirectories" } } }
  end

  results = Parallel.map(task_dirs, in_threads: 4) do |task_dir|
    task_result = Task::Evaluator.call(
      full_eval_path: task_dir,
      base_path: @base_path,
      skill_path: @skill_path,
      client_params: @client_params
    )
    # Normalize to uniform envelope
    if task_result.key?(:success)
      task_result
    else
      { success: true, response: task_result }
    end
  end

  overall_success = results.all? { |task_result| task_result[:success] }

  {
    success: overall_success,
    response: {
      source_path: @skill_path || 'multiple (batch run)',
      tasks: results
    }
  }
rescue StandardError => e
  SkillBench::ErrorLogger.log_error(e, 'Runner Error')
  { success: false, response: { error: { message: e.message } } }
end
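
The normalization and aggregation steps in #call can be tried in isolation. The sketch below is an approximation under stated assumptions: it swaps Parallel.map for one plain Thread per task so that it needs only the standard library, and it replaces Task::Evaluator with a hypothetical lambda. The normalize and run_all names are illustrative and do not exist in SkillBench.

```ruby
# Wraps a bare payload in the { success:, response: } envelope;
# hashes that already carry :success are passed through unchanged.
def normalize(task_result)
  task_result.key?(:success) ? task_result : { success: true, response: task_result }
end

# Runs a stand-in evaluator once per task and aggregates as Runner#call does.
# Uses one plain Thread per task in place of Parallel.map(..., in_threads: 4).
def run_all(task_names, evaluator)
  threads = task_names.map { |name| Thread.new { normalize(evaluator.call(name)) } }
  results = threads.map(&:value) # Thread#value joins and preserves input order
  { success: results.all? { |r| r[:success] }, response: { tasks: results } }
end

# Hypothetical evaluator: one task returns a bare payload, one a failure envelope.
fake_evaluator = lambda do |name|
  if name == 'broken'
    { success: false, response: { error: { message: 'judge unavailable' } } }
  else
    { score: 0.9, task: name }
  end
end

outcome = run_all(%w[simple broken], fake_evaluator)
p outcome[:success]
p outcome[:response][:tasks].first
```

Because the aggregate uses all?, a single failing task marks the whole run as unsuccessful while the per-task results remain available under :tasks.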