Class: SkillBench::Judge::Prompt

Inherits: Object

Defined in: lib/skill_bench/judge/prompt.rb

Overview

Builds structured prompts for the LLM judge.

Assembles task description, evaluation criteria, skill context, and agent output into a single prompt for blind scoring.

Class Method Summary

.call(task:, criteria:, skill_context:, agent_output:) ⇒ Hash

Builds the judge prompt.

Instance Method Summary

#call ⇒ Hash

Assembles and returns the judge prompt.

#initialize(task:, criteria:, skill_context:, agent_output:) ⇒ Prompt (constructor)

Returns a new instance of Prompt.

Constructor Details

#initialize(task:, criteria:, skill_context:, agent_output:) ⇒ `Prompt`

Returns a new instance of Prompt.

Parameters:

task (String): The task description.
criteria (SkillBench::Criteria): The evaluation criteria.
skill_context (String, nil): The skill context XML (nil for baseline runs).
agent_output (String): The agent output.

# File 'lib/skill_bench/judge/prompt.rb', line 25

def initialize(task:, criteria:, skill_context:, agent_output:)
  @task = task
  @criteria = criteria
  @skill_context = skill_context
  @agent_output = agent_output
end

Class Method Details

.call(task:, criteria:, skill_context:, agent_output:) ⇒ `Hash`

Builds the judge prompt.

Parameters:

task (String): The task description from task.md.
criteria (SkillBench::Criteria): The evaluation criteria with dimensions.
skill_context (String, nil): XML-wrapped skill context (nil for baseline runs).
agent_output (String): Git diff and agent summary.

Returns:

(Hash): Service response containing either the prompt or an error.



# File 'lib/skill_bench/judge/prompt.rb', line 17

def self.call(task:, criteria:, skill_context:, agent_output:)
  new(task:, criteria:, skill_context:, agent_output:).call
end
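
The one-line body above follows a common Ruby service-object pattern: the class-level .call builds an instance and immediately invokes its #call. The bare `task:` arguments use the hash shorthand available from Ruby 3.1, where `task:` means `task: task`. The sketch below shows the same pattern in isolation; EchoService and its two keywords are hypothetical stand-ins, not part of SkillBench.

```ruby
# Hypothetical stand-in, not SkillBench source: it only illustrates
# the .call -> new(...).call delegation used by the class method.
class EchoService
  def self.call(task:, agent_output:)
    # Ruby 3.1+ shorthand: `task:` is equivalent to `task: task`.
    new(task:, agent_output:).call
  end

  def initialize(task:, agent_output:)
    @task = task
    @agent_output = agent_output
  end

  # Returns the stored inputs so the delegation is easy to observe.
  def call
    { task: @task, agent_output: @agent_output }
  end
end

result = EchoService.call(task: "Add a login form", agent_output: "diff --git ...")
```

Callers then need only one entry point, while the instance keeps the inputs available to any private helper methods.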

Instance Method Details

#call ⇒ `Hash`

Assembles and returns the judge prompt.
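
The body of #call is not shown on this page. As a rough sketch only, assembling the four inputs into one prompt might look like the following. The class name JudgePromptSketch, the section labels, the :success, :response, and :error keys, and the use of a plain String for criteria are all assumptions made for illustration; the real service response may be shaped differently.

```ruby
# Hypothetical sketch, not the SkillBench implementation. Section labels
# and the response hash keys are assumptions made for illustration.
class JudgePromptSketch
  def initialize(task:, criteria:, skill_context:, agent_output:)
    @task = task
    @criteria = criteria
    @skill_context = skill_context
    @agent_output = agent_output
  end

  def call
    # Assumed error path: refuse to build a prompt without a task.
    return { success: false, error: "task is empty" } if @task.to_s.strip.empty?

    sections = [
      "## Task\n#{@task}",
      "## Criteria\n#{@criteria}",
      # Baseline runs pass nil, so the skill section is dropped by compact.
      @skill_context && "## Skill context\n#{@skill_context}",
      "## Agent output\n#{@agent_output}"
    ].compact

    { success: true, response: { prompt: sections.join("\n\n") } }
  end
end

inputs = {
  task: "Add a login form",
  criteria: "Correctness: 1-5",
  agent_output: "diff --git a/app.rb b/app.rb"
}

baseline = JudgePromptSketch.new(**inputs, skill_context: nil).call
skilled  = JudgePromptSketch.new(**inputs, skill_context: "<skill>forms</skill>").call
failed   = JudgePromptSketch.new(**inputs.merge(task: ""), skill_context: nil).call
```

In this sketch the baseline prompt simply omits the skill section, so a judge reading the prompt text has no explicit marker telling it which kind of run produced the output.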