Module: Riffer::Evals::EvaluatorRunner
- Extended by: EvaluatorRunner
- Included in: EvaluatorRunner
- Defined in: lib/riffer/evals/evaluator_runner.rb
Overview
Orchestrates running evaluators against an agent across multiple scenarios.
result = Riffer::Evals::EvaluatorRunner.run(
  agent: MyAgent,
  scenarios: [
    { input: "What is Ruby?", ground_truth: "A programming language" },
    { input: "What is Python?" }
  ],
  evaluators: [AnswerRelevancyEvaluator]
)
result.scores # => { AnswerRelevancyEvaluator => 0.85 }
Instance Method Summary
- #run(agent:, scenarios:, evaluators:, context: nil) ⇒ Object
  Runs evaluators against an agent for the given scenarios.
Instance Method Details
#run(agent:, scenarios:, evaluators:, context: nil) ⇒ Object
Runs evaluators against an agent for the given scenarios. Raises Riffer::ArgumentError on an invalid agent or evaluator.
Type signature (RBS): (agent: singleton(Riffer::Agent), scenarios: Array[Hash[Symbol, untyped]], evaluators: Array, ?context: Hash[Symbol, untyped]?) -> Riffer::Evals::RunResult
# File 'lib/riffer/evals/evaluator_runner.rb', line 24

def run(agent:, scenarios:, evaluators:, context: nil)
  validate_agent!(agent)
  validate_evaluators!(evaluators)

  scenario_results = scenarios.map do |scenario|
    run_scenario(agent: agent, scenario: scenario, evaluators: evaluators, context: context)
  end

  Riffer::Evals::RunResult.new(scenario_results: scenario_results)
end
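The method above validates its arguments first, then maps each scenario to a result, and finally wraps the results in a single object. The following is a minimal, self-contained sketch of that shape, not Riffer's actual code. SketchRunner, InvalidArgument, and the bodies of validate_agent!, validate_evaluators!, and run_scenario are hypothetical stand-ins, because the source does not show how the real helpers work. The sketch also assumes that agents and evaluators are plain callables, which may differ from what Riffer requires.

```ruby
# Self-contained sketch of the validate-then-map flow. Everything here is a
# hypothetical stand-in; none of it is Riffer's actual implementation.
module SketchRunner
  # Stand-in for Riffer::ArgumentError.
  class InvalidArgument < ::ArgumentError; end

  # Stand-in for Riffer::Evals::RunResult.
  RunResult = Struct.new(:scenario_results, keyword_init: true)

  module_function

  def run(agent:, scenarios:, evaluators:, context: nil)
    # Validate everything up front so no scenario runs with bad input.
    validate_agent!(agent)
    validate_evaluators!(evaluators)

    scenario_results = scenarios.map do |scenario|
      run_scenario(agent: agent, scenario: scenario, evaluators: evaluators, context: context)
    end

    RunResult.new(scenario_results: scenario_results)
  end

  # Assumption: an agent is anything callable with (input, context).
  def validate_agent!(agent)
    raise InvalidArgument, "agent must be callable" unless agent.respond_to?(:call)
  end

  # Assumption: every evaluator is callable as well.
  def validate_evaluators!(evaluators)
    evaluators.each do |evaluator|
      next if evaluator.respond_to?(:call)

      raise InvalidArgument, "invalid evaluator: #{evaluator.inspect}"
    end
  end

  # Assumption: one result hash per scenario, holding a score per evaluator.
  def run_scenario(agent:, scenario:, evaluators:, context:)
    output = agent.call(scenario[:input], context)
    scores = evaluators.to_h do |evaluator|
      [evaluator, evaluator.call(output: output, ground_truth: scenario[:ground_truth])]
    end
    { input: scenario[:input], output: output, scores: scores }
  end
end

# Toy agent and evaluator, both plain lambdas.
upcase_agent = ->(input, _context) { input.upcase }
exact_match  = ->(output:, ground_truth:) { output == ground_truth ? 1.0 : 0.0 }

result = SketchRunner.run(
  agent: upcase_agent,
  scenarios: [
    { input: "ruby", ground_truth: "RUBY" },
    { input: "python", ground_truth: "Python" }
  ],
  evaluators: [exact_match]
)

# Collect the per-scenario score for the single evaluator.
scores = result.scenario_results.map { |r| r[:scores][exact_match] }
```

Because validation runs before the map, an invalid agent or evaluator in this sketch raises before any scenario executes, so a failed run never leaves a partial set of results behind.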