Class: SkillBench::Judge::Prompt

Inherits:
Object
  • Object
show all
Defined in:
lib/skill_bench/judge/prompt.rb

Overview

Builds structured prompts for the LLM judge.

Assembles task description, evaluation criteria, skill context, and agent output into a single prompt for blind scoring.

Class Method Summary collapse

Instance Method Summary collapse

Constructor Details

#initialize(task:, criteria:, skill_context:, agent_output:) ⇒ Prompt

Returns a new instance of Prompt.

Parameters:

  • task (String)

    The task description.

  • criteria (SkillBench::Criteria)

    The eval criteria.

  • skill_context (String, nil)

    The skill context XML (nil for baseline runs).

  • agent_output (String)

    The agent output.



25
26
27
28
29
30
# File 'lib/skill_bench/judge/prompt.rb', line 25

def initialize(task:, criteria:, skill_context:, agent_output:)
  @task = task
  @criteria = criteria
  @skill_context = skill_context
  @agent_output = agent_output
end

Class Method Details

.call(task:, criteria:, skill_context:, agent_output:) ⇒ Hash

Builds the judge prompt.

Parameters:

  • task (String)

    The task description from task.md.

  • criteria (SkillBench::Criteria)

    The eval criteria with dimensions.

  • skill_context (String, nil)

    XML-wrapped skill context (nil for baseline runs).

  • agent_output (String)

    Git diff and agent summary.

Returns:

  • (Hash)

    Service response with prompt or error.



17
18
19
# File 'lib/skill_bench/judge/prompt.rb', line 17

def self.call(task:, criteria:, skill_context:, agent_output:)
  new(task:, criteria:, skill_context:, agent_output:).call
end

Instance Method Details

#callHash

Assembles and returns the judge prompt.

Returns:

  • (Hash)

    Service response with prompt or error.



35
36
37
38
39
40
41
42
43
44
45
46
# File 'lib/skill_bench/judge/prompt.rb', line 35

def call
  return missing_task_result if task.nil? || task.strip.empty?
  return missing_criteria_result if criteria.nil?
  return missing_agent_output_result if agent_output.nil? || agent_output.to_s.strip.empty?
  return missing_skill_context_result unless valid_skill_context?

  prompt = assemble_prompt
  { success: true, response: { prompt: prompt } }
rescue StandardError => e
  SkillBench::ErrorLogger.log_error(e, 'Judge::Prompt Build Error')
  { success: false, response: { error: { message: e.message } } }
end