Class: Riffer::Evals::Judge
- Inherits: Object
- Defined in: lib/riffer/evals/judge.rb
Overview
Executes LLM-as-judge evaluations, using tool calling internally to get structured output from the judge model.
Defined Under Namespace
Classes: EvaluationTool
Instance Attribute Summary
- #model ⇒ Object (readonly)
  The model string (provider/model format).
Instance Method Summary
- #evaluate(instructions:, input:, output:, ground_truth: nil) ⇒ Object
  Evaluates an input/output pair using the configured LLM.
- #initialize(model:, provider_options: {}) ⇒ Judge (constructor)
  Raises Riffer::ArgumentError unless model is in “provider/model” format.
Constructor Details
#initialize(model:, provider_options: {}) ⇒ Judge
Raises Riffer::ArgumentError unless model is in “provider/model” format.
Type signature: (model: String, ?provider_options: Hash[Symbol, untyped]) -> void
# File 'lib/riffer/evals/judge.rb', line 37
def initialize(model:, provider_options: {})
  provider_name, model_name = model.split("/", 2)
  unless [provider_name, model_name].all? { |part| part.is_a?(String) && !part.strip.empty? }
    raise Riffer::ArgumentError, "Invalid model string: #{model}"
  end
  @model = model
  @provider_options = provider_options
end
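To see which strings pass the constructor's check, the same split-and-validate logic can be pulled into a standalone method. This is a minimal sketch: `valid_model_string?` is a hypothetical helper, not part of Riffer, and the provider and model names are made up for illustration.

```ruby
# Hypothetical helper mirroring the check in Judge#initialize.
# Not part of Riffer; it only illustrates the split-and-validate logic.
def valid_model_string?(model)
  # Split on the first "/" only, so the model part may itself contain slashes.
  provider_name, model_name = model.split("/", 2)
  [provider_name, model_name].all? { |part| part.is_a?(String) && !part.strip.empty? }
end

# "acme" and "judge-v1" are made-up names used only for illustration.
puts valid_model_string?("acme/judge-v1")      # provider and model both present
puts valid_model_string?("acme/team/judge-v1") # extra slashes stay in the model part
puts valid_model_string?("acme")               # no "/" separator
puts valid_model_string?("acme/")              # empty model part
puts valid_model_string?(" /judge-v1")         # blank provider part
```

Because the split uses a limit of 2, only the first slash separates provider from model. A missing, empty, or whitespace-only part on either side fails the check.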
Instance Attribute Details
#model ⇒ Object (readonly)
The model string (provider/model format).
# File 'lib/riffer/evals/judge.rb', line 32
def model
  @model
end
Instance Method Details
#evaluate(instructions:, input:, output:, ground_truth: nil) ⇒ Object
Evaluates an input/output pair using the configured LLM.
Type signature: (instructions: String, input: String, output: String, ?ground_truth: String?) -> Hash[Symbol, untyped]
# File 'lib/riffer/evals/judge.rb', line 50
def evaluate(instructions:, input:, output:, ground_truth: nil)
  = (instructions)
  = (input: input, output: output, ground_truth: ground_truth)
  response = provider_instance.generate_text(
    system: ,
    prompt: ,
    model: model_name,
    tools: [EvaluationTool]
  )
  parse_tool_response(response)
end
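The flow in #evaluate (build a system and user message, call the provider with a single tool, parse the tool response into a Hash) can be sketched without Riffer. Everything below is an assumption made for illustration: `StubProvider`, `SketchJudge`, the prompt layout, the `:tool_arguments` response shape, and the `:score` and `:reasoning` keys are all hypothetical. None of them are taken from Riffer's actual implementation.

```ruby
require "json"

# Stand-in for a provider; a real provider would call an LLM API.
# The response shape (a Hash with a :tool_arguments JSON string) is an assumption.
class StubProvider
  attr_reader :last_request

  def generate_text(system:, prompt:, model:, tools:)
    @last_request = {system: system, prompt: prompt, model: model, tools: tools}
    {tool_arguments: JSON.generate({score: 0.9, reasoning: "Matches the ground truth."})}
  end
end

# Hypothetical judge that follows the same outline as Judge#evaluate.
class SketchJudge
  def initialize(provider:, model_name:)
    @provider = provider
    @model_name = model_name
  end

  def evaluate(instructions:, input:, output:, ground_truth: nil)
    # Assemble the user prompt; the ground truth is included only when given.
    sections = ["Input:\n#{input}", "Output:\n#{output}"]
    sections << "Ground truth:\n#{ground_truth}" unless ground_truth.nil?

    response = @provider.generate_text(
      system: instructions,
      prompt: sections.join("\n\n"),
      model: @model_name,
      tools: [:evaluation_tool] # placeholder for a real tool definition
    )

    # Parse the tool-call arguments into a symbol-keyed Hash.
    JSON.parse(response.fetch(:tool_arguments), symbolize_names: true)
  end
end

provider = StubProvider.new
judge = SketchJudge.new(provider: provider, model_name: "judge-v1")
result = judge.evaluate(
  instructions: "Rate factual accuracy from 0 to 1.",
  input: "What is the capital of France?",
  output: "Paris",
  ground_truth: "Paris"
)
puts result.inspect
```

Routing the verdict through a tool call means the judge model returns arguments that match a schema, so the caller parses JSON instead of scraping free-form text.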