Class: SkillBench::Task::Evaluator Deprecated

Inherits:

Object

Object
SkillBench::Task::Evaluator

show all

Defined in:: lib/skill_bench/task/evaluator.rb

Overview

Deprecated.

Evaluates a single task by running baseline and context-hydrated evaluations. Orchestrates Agent::Runner calls and Judge::Judge scoring.

Class Method Summary collapse

.call(full_eval_path:, base_path:, skill_path: nil, client_params: {}) ⇒ Hash

Evaluates a single task.

Instance Method Summary collapse

#call ⇒ Hash

Executes the task evaluation.
#initialize(full_eval_path:, base_path:, skill_path:, client_params:) ⇒ Evaluator constructor

A new instance of Evaluator.

Constructor Details

#initialize(full_eval_path:, base_path:, skill_path:, client_params:) ⇒ `Evaluator`

Returns a new instance of Evaluator.

Parameters:

full_eval_path (Pathname) —

The path to the evaluation directory.
base_path (Pathname) —

The base path for relative file resolution.
skill_path (String, nil) —

Optional override for the source directory.
client_params (Hash) —

Parameters to pass to the LLM client.

# File 'lib/skill_bench/task/evaluator.rb', line 33

def initialize(full_eval_path:, base_path:, skill_path:, client_params:)
  @full_eval_path = full_eval_path
  @base_path = base_path
  @skill_path = skill_path
  @client_params = client_params
end

Class Method Details

.call(full_eval_path:, base_path:, skill_path: nil, client_params: {}) ⇒ `Hash`

Evaluates a single task.

Parameters:

full_eval_path (Pathname) —

The path to the evaluation directory.
base_path (Pathname) —

The base path for relative file resolution.
skill_path (String, nil) (defaults to: nil) —

Optional override for the source directory.
client_params (Hash) (defaults to: {}) —

Parameters to pass to the LLM client.

Returns:

(Hash) —

The result of the task evaluation.



25
26
27

# File 'lib/skill_bench/task/evaluator.rb', line 25

def self.call(full_eval_path:, base_path:, skill_path: nil, client_params: {})
  new(full_eval_path:, base_path:, skill_path:, client_params:).call
end

Instance Method Details

#call ⇒ `Hash`

Executes the task evaluation.