Class: Riffer::Evals::ScenarioResult

Inherits:
Object
Defined in:
lib/riffer/evals/scenario_result.rb

Overview

Represents the result of evaluating a single scenario.

Contains the input, output, ground truth, and individual evaluator results.

scenario_result = Riffer::Evals::ScenarioResult.new(
  input: "What is Ruby?",
  output: "A programming language.",
  ground_truth: "A programming language",
  results: [result1, result2]
)

scenario_result.scores  # => { MyEvaluator => 0.85 }

Constructor Details

#initialize(input:, output:, ground_truth:, results:, messages: []) ⇒ ScenarioResult

Initializes a new scenario result.

Type signature (RBS): (input: String, output: String, ground_truth: String?, results: Array, ?messages: Array) -> void

# File 'lib/riffer/evals/scenario_result.rb', line 37

def initialize(input:, output:, ground_truth:, results:, messages: [])
  @input = input
  @output = output
  @ground_truth = ground_truth
  @results = results
  @messages = messages
end
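
The constructor's keyword handling can be sketched as follows. This is a minimal illustration, not the library class itself: `SketchScenarioResult` is a hypothetical stand-in that copies the signature shown above. It suggests that `ground_truth:` must always be passed (though it may be `nil`), and that the `messages: []` default is evaluated on each call, so instances do not share one array.

```ruby
# Hypothetical stand-in that mirrors the constructor signature shown above.
class SketchScenarioResult
  attr_reader :input, :output, :ground_truth, :results, :messages

  def initialize(input:, output:, ground_truth:, results:, messages: [])
    @input = input
    @output = output
    @ground_truth = ground_truth
    @results = results
    @messages = messages
  end
end

# ground_truth is a required keyword, but nil is an acceptable value.
first = SketchScenarioResult.new(
  input: "What is Ruby?", output: "A programming language.",
  ground_truth: nil, results: []
)
second = SketchScenarioResult.new(
  input: "What is RBS?", output: "A type signature language.",
  ground_truth: nil, results: []
)

# Ruby evaluates the default expression per call, so each instance
# gets its own empty array.
first.messages << "hello"
second.messages.empty? # => true

# Omitting ground_truth: entirely raises ArgumentError.
begin
  SketchScenarioResult.new(input: "x", output: "y", results: [])
rescue ArgumentError => e
  warn e.message
end
```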

Instance Attribute Details

#ground_truth ⇒ Object (readonly)

The ground truth used during evaluation.

# File 'lib/riffer/evals/scenario_result.rb', line 25

def ground_truth
  @ground_truth
end

#input ⇒ Object (readonly)

The input that was evaluated.

# File 'lib/riffer/evals/scenario_result.rb', line 19

def input
  @input
end

#messages ⇒ Object (readonly)

The full message history from the agent conversation.

# File 'lib/riffer/evals/scenario_result.rb', line 31

def messages
  @messages
end

#output ⇒ Object (readonly)

The agent output for this scenario.

# File 'lib/riffer/evals/scenario_result.rb', line 22

def output
  @output
end

#results ⇒ Object (readonly)

Individual evaluation results.

# File 'lib/riffer/evals/scenario_result.rb', line 28

def results
  @results
end

Instance Method Details

#scores ⇒ Object

Returns scores keyed by evaluator class.

Type signature (RBS): () -> Hash[singleton(Riffer::Evals::Evaluator), Float]

# File 'lib/riffer/evals/scenario_result.rb', line 49

def scores
  results.each_with_object({}) do |result, hash|
    hash[result.evaluator] = result.score
  end
end
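
Because the hash is keyed by evaluator class, two results from the same evaluator collapse into one entry, with the later score replacing the earlier one. The sketch below illustrates this under stated assumptions: `SketchExactMatch`, `SketchTone`, and `SketchScore` are hypothetical stand-ins, and `sketch_scores` copies the method body shown above but takes the results as an argument.

```ruby
# Hypothetical stand-ins for evaluator classes and a result object.
class SketchExactMatch; end
class SketchTone; end
SketchScore = Struct.new(:evaluator, :score)

# Mirrors the body of #scores shown above, taking the results as an argument.
def sketch_scores(results)
  results.each_with_object({}) do |result, hash|
    hash[result.evaluator] = result.score
  end
end

results = [
  SketchScore.new(SketchExactMatch, 1.0),
  SketchScore.new(SketchTone, 0.9),
  SketchScore.new(SketchExactMatch, 0.5) # same evaluator again
]

scores = sketch_scores(results)
scores[SketchExactMatch] # => 0.5 (the later result replaced 1.0)
scores[SketchTone]       # => 0.9
scores.size              # => 2
```

If a scenario may be scored more than once by the same evaluator, reading `results` directly preserves every score.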

#to_h ⇒ Object

Returns a hash representation of the scenario result.

Type signature (RBS): () -> Hash[Symbol, untyped]

# File 'lib/riffer/evals/scenario_result.rb', line 59

def to_h
  {
    input: input,
    output: output,
    ground_truth: ground_truth,
    scores: scores.transform_keys(&:name),
    results: results.map(&:to_h),
    messages: messages.map(&:to_h)
  }
end
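
The shape of the returned hash can be sketched as follows. All names here are hypothetical stand-ins (`SketchRelevance`, `SketchEvalResult`, `SketchMessage`, `sketch_scenario_hash`), since the real evaluator, result, and message classes are not shown on this page. The point of interest is `transform_keys(&:name)`, which swaps the class keys in `scores` for their string names so the hash can be serialized.

```ruby
require "json"

# Hypothetical stand-ins; the real classes are not shown on this page.
class SketchRelevance; end

SketchEvalResult = Struct.new(:evaluator, :score) do
  # Assumed shape of a result hash, for illustration only.
  def to_h
    { evaluator: evaluator.name, score: score }
  end
end

SketchMessage = Struct.new(:role, :content)

# Mirrors the body of #to_h shown above, taking the attributes as arguments.
def sketch_scenario_hash(input:, output:, ground_truth:, results:, messages: [])
  scores = results.each_with_object({}) { |r, h| h[r.evaluator] = r.score }
  {
    input: input,
    output: output,
    ground_truth: ground_truth,
    scores: scores.transform_keys(&:name), # Class keys become String names
    results: results.map(&:to_h),
    messages: messages.map(&:to_h)
  }
end

hash = sketch_scenario_hash(
  input: "What is Ruby?",
  output: "A programming language.",
  ground_truth: "A programming language",
  results: [SketchEvalResult.new(SketchRelevance, 0.85)],
  messages: [SketchMessage.new("user", "What is Ruby?")]
)

hash[:scores] # => {"SketchRelevance" => 0.85}
puts JSON.generate(hash)
```

Note that `Class#name` returns `nil` for anonymous classes, so evaluators created with `Class.new` and never assigned to a constant would all map to the same `nil` key.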