Class: Riffer::Evals::Evaluator

Inherits:
Object
Defined in:
lib/riffer/evals/evaluator.rb

Overview

Base class for all evaluators. Set instructions and the base class calls the judge automatically; override #evaluate for custom logic. See examples/evaluators/ for reference implementations.

class MyEvaluator < Riffer::Evals::Evaluator
  instructions "Assess medical accuracy of the response..."
  higher_is_better true
  judge_model "anthropic/claude-opus-4-5-20251101"
end

Class Method Details

.higher_is_better(value = nil) ⇒ Object

Gets or sets whether higher scores are better. Returns true when no value has been set.

Type signature: (?bool?) -> bool



# File 'lib/riffer/evals/evaluator.rb', line 34

def higher_is_better(value = nil)
  if value.nil?
    current = @higher_is_better
    return true if current.nil?
    return current
  end
  @higher_is_better = value
end
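
The getter/setter behavior above can be tried in isolation. The following is a minimal sketch that copies the method body shown here into a hypothetical stand-in class (DemoEvaluator and LatencyEvaluator are illustrative names, not part of Riffer). It shows that an unset value reads as true and that an explicit false is kept rather than treated as unset.

```ruby
# Hypothetical stand-in that reuses the method body shown above;
# it is not the real Riffer::Evals::Evaluator.
class DemoEvaluator
  def self.higher_is_better(value = nil)
    if value.nil?
      current = @higher_is_better
      return true if current.nil?
      return current
    end
    @higher_is_better = value
  end
end

# Hypothetical subclass for a metric where lower scores are better.
class LatencyEvaluator < DemoEvaluator
  higher_is_better false
end

DemoEvaluator.higher_is_better     # unset, so the getter falls back to true
LatencyEvaluator.higher_is_better  # explicit false is stored and returned
```

Because only nil is treated as "no argument", false can be passed as a real value without being mistaken for a read.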

.instructions(value = nil) ⇒ Object

Gets or sets the evaluation instructions (criteria and scoring rubric).

Type signature: (?String?) -> String?



# File 'lib/riffer/evals/evaluator.rb', line 25

def instructions(value = nil)
  return @instructions if value.nil?
  @instructions = value.to_s
end

.judge_model(value = nil) ⇒ Object

Gets or sets the judge model for LLM-as-judge evaluations.

Type signature: (?String?) -> String?



# File 'lib/riffer/evals/evaluator.rb', line 47

def judge_model(value = nil)
  return @judge_model if value.nil?
  @judge_model = value.to_s
end
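
As a rough illustration of how .instructions and .judge_model behave, the sketch below reuses the two method bodies shown above in a hypothetical stand-in class (SettingsDemo, ToneEvaluator, and StrictToneEvaluator are illustrative names, and the model string is a placeholder). It assumes nothing beyond the source shown here; in particular, if the real class copies settings to subclasses through code not shown on this page, the subclass result would differ.

```ruby
# Hypothetical stand-in reusing the two method bodies shown above;
# it is not the real Riffer::Evals::Evaluator.
class SettingsDemo
  def self.instructions(value = nil)
    return @instructions if value.nil?
    @instructions = value.to_s
  end

  def self.judge_model(value = nil)
    return @judge_model if value.nil?
    @judge_model = value.to_s
  end
end

class ToneEvaluator < SettingsDemo
  instructions :be_polite            # a Symbol is coerced with to_s
  judge_model "example/judge-model"  # placeholder, not a real model identifier
end

# The values are stored in instance variables on each class object, so with
# only the code shown here a subclass starts out unset.
class StrictToneEvaluator < ToneEvaluator; end

ToneEvaluator.instructions        # "be_polite"
StrictToneEvaluator.instructions  # nil
SettingsDemo.judge_model          # nil
```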

Instance Method Details

#evaluate(input:, output:, ground_truth: nil, messages: []) ⇒ Object

Evaluates an input/output pair. The default implementation calls the judge with the class-level instructions; override it for custom logic (e.g. rule-based evaluators).

Type signature: (input: String | Array[Hash[Symbol, untyped] | Riffer::Messages::Base], output: String, ?ground_truth: String?, ?messages: Array) -> Riffer::Evals::Result

Raises:

  • (NotImplementedError) when the default implementation runs and the class has not set instructions


# File 'lib/riffer/evals/evaluator.rb', line 58

def evaluate(input:, output:, ground_truth: nil, messages: [])
  instr = self.class.instructions
  raise NotImplementedError, "#{self.class} must set instructions or implement #evaluate" unless instr

  evaluation = judge.evaluate(
    instructions: instr,
    input: format_input(input),
    output: output,
    ground_truth: ground_truth
  )

  result(score: evaluation[:score], reason: evaluation[:reason])
end
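
A rule-based evaluator that overrides #evaluate might look like the sketch below. To keep it runnable without the gem, it uses hypothetical stand-ins: BaseEvaluator takes the place of Riffer::Evals::Evaluator, DemoResult takes the place of Riffer::Evals::Result, and the result helper is assumed from the call result(score:, reason:) in the source above, whose definition is not shown. The 0.0 to 1.0 score range is also an assumption.

```ruby
# Hypothetical stand-ins so the sketch runs without the gem. In real code the
# evaluator would subclass Riffer::Evals::Evaluator instead of BaseEvaluator.
DemoResult = Struct.new(:score, :reason, keyword_init: true)

class BaseEvaluator
  # Assumed helper: the source calls result(score:, reason:) but does not show it.
  def result(score:, reason:)
    DemoResult.new(score: score, reason: reason)
  end
end

# Rule-based evaluator: it overrides #evaluate, so no instructions or judge
# model are needed and the NotImplementedError path is never reached.
class ExactMatchEvaluator < BaseEvaluator
  def evaluate(input:, output:, ground_truth: nil, messages: [])
    return result(score: 0.0, reason: "no ground truth provided") if ground_truth.nil?

    # Compare case-insensitively after trimming surrounding whitespace.
    matched = output.strip.casecmp?(ground_truth.strip)
    result(
      score: matched ? 1.0 : 0.0,
      reason: matched ? "output matches ground truth" : "output differs from ground truth"
    )
  end
end

verdict = ExactMatchEvaluator.new.evaluate(
  input: "What is the capital of France?",
  output: " paris ",
  ground_truth: "Paris"
)
verdict.score   # 1.0
verdict.reason  # "output matches ground truth"
```

Keeping the same keyword arguments as the base method means the override can be called wherever the default judge-backed evaluator would be.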