Class: LlmConductor::Eval::Spec
- Inherits: Object
- Defined in: lib/llm_conductor/eval/spec.rb
Overview
The public extension seam. Subclass (or duck-type) this to describe how to evaluate one LLM-powered feature: how to turn a caller-supplied input into a prompt payload, how to parse the output, and what the judge should grade.
The engine itself is generic and feature-agnostic; everything feature-specific lives here. Unlike the Rails prototype’s Feature::Base, there is no select_cases — selecting which inputs to evaluate is the caller’s job, done before calling LlmConductor::Eval.run and passed via inputs:. The engine never queries a database.
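As a rough illustration of the duck-typing route, the sketch below defines a standalone class that answers the messages listed on this page without inheriting from LlmConductor::Eval::Spec. The feature (`SentimentSpec`), the input Hash shape (`:id`, `:title`, `:text`), the JSON field names, and the use of a Symbol for the dimension `key:` are all assumptions made for the example, not part of the gem.

```ruby
require "json"

# Hypothetical duck-typed spec for a made-up "sentiment" feature.
# It only mirrors the method names documented on this page.
class SentimentSpec
  # nil means build_data returns a full prompt String.
  def prompt_type
    nil
  end

  def input_id(input)
    input.fetch(:id)
  end

  # Falls back to the id when the input has no title.
  def input_label(input)
    input.fetch(:title, input_id(input).to_s)
  end

  def build_data(input)
    "Classify the sentiment of this review as JSON: #{input.fetch(:text)}"
  end

  # Returns a Hash, or nil when the text is not a JSON object.
  def parse(raw)
    parsed = JSON.parse(raw)
    parsed.is_a?(Hash) ? parsed : nil
  rescue JSON::ParserError, TypeError
    nil
  end

  def vendor_params(vendor:, input_id:)
    {}
  end

  def output_summary(parsed)
    { score: parsed && parsed["score"], bucket: parsed && parsed["label"] }
  end

  def judge_rubric_excerpt
    "Label each review positive, neutral, or negative and give a 0-100 score."
  end

  def judge_dimensions
    [{ key: :accuracy, description: "Does the label match the review text?" }]
  end

  def extra_columns(_parsed)
    {}
  end
end
```

Subclassing the real base class would instead inherit the documented defaults for #parse, #input_label, #extra_columns and #vendor_params, leaving only the methods that raise NotImplementedError to fill in.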
Instance Method Summary
-
#build_data(_input) ⇒ Object
Build the prompt payload for one input.
-
#extra_columns(_parsed) ⇒ Object
Extra per-row CSV columns beyond the base set.
-
#input_id(_input) ⇒ Object
Stable id for an input (was record.id).
-
#input_label(input) ⇒ Object
Human label for an input (was record.name).
-
#judge_dimensions ⇒ Object
A list of { key:, description: } entries: the dimensions the judge scores, 0-100 each.
-
#judge_rubric_excerpt ⇒ Object
Text inlined into the judge prompt describing the rubric the candidate was asked to follow.
-
#output_summary(_parsed) ⇒ Object
{ score: Numeric|nil, bucket: String|nil } — powers CSV columns and the bucket-disagreement detection.
-
#parse(raw) ⇒ Object
Parse the LLM’s raw text into a Hash, or nil on failure.
-
#prompt_type ⇒ Object
Symbol passed to LlmConductor.generate as type: (must match a registered prompt).
-
#vendor_params(vendor:, input_id:) ⇒ Object
Vendor-specific generation params (e.g. a deterministic Ollama seed).
Instance Method Details
#build_data(_input) ⇒ Object
Build the prompt payload for one input (was build_data(record) in the Rails prototype). When #prompt_type returns a Symbol, the result is passed as data:; when it returns nil, the result must be a full prompt String, which is passed as prompt:.
    # File 'lib/llm_conductor/eval/spec.rb', line 37
    def build_data(_input)
      raise NotImplementedError
    end
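The data:/prompt: rule described above can be sketched as a small dispatch. `generation_kwargs`, `TypedSpec`, `RawPromptSpec` and the `:summarize` prompt name are hypothetical and exist only for this illustration; the engine's actual internals are not shown on this page.

```ruby
# Hypothetical helper (not part of the gem): picks the keyword arguments
# implied by the rule above, depending on what prompt_type returns.
def generation_kwargs(spec, input)
  payload = spec.build_data(input)
  type = spec.prompt_type
  return { type: type, data: payload } if type

  unless payload.is_a?(String)
    raise TypeError, "build_data must return a String when prompt_type is nil"
  end

  { prompt: payload }
end

# Uses a (hypothetical) registered prompt, so build_data returns a payload.
class TypedSpec
  def prompt_type
    :summarize
  end

  def build_data(input)
    { text: input }
  end
end

# No registered prompt, so build_data returns the whole prompt String.
class RawPromptSpec
  def prompt_type
    nil
  end

  def build_data(input)
    "Summarize in one sentence: #{input}"
  end
end
```

Calling `generation_kwargs(TypedSpec.new, "hello")` yields a Hash with `type:` and `data:` keys, while `RawPromptSpec` yields a Hash with only `prompt:`.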
#extra_columns(_parsed) ⇒ Object
Extra per-row CSV columns beyond the base set. Keys become headers.
    # File 'lib/llm_conductor/eval/spec.rb', line 73
    def extra_columns(_parsed)
      {}
    end
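A minimal sketch of an override, assuming the parsed output is a Hash with hypothetical `"reasons"` and `"language"` fields. The `build_row` helper and the base column names are also invented here, only to show how the returned keys could end up as headers; the real base set is defined by the engine.

```ruby
class ReviewSpec
  # Keys of the returned Hash become extra CSV headers.
  def extra_columns(parsed)
    return {} unless parsed.is_a?(Hash)

    {
      "reason_count" => Array(parsed["reasons"]).size,
      "language" => parsed["language"]
    }
  end
end

# Hypothetical illustration of how extra columns extend a base row.
def build_row(base_row, spec, parsed)
  base_row.merge(spec.extra_columns(parsed))
end
```

Returning {} for an unparsable output keeps the row limited to the base columns instead of raising.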
#input_id(_input) ⇒ Object
Stable id for an input (was record.id). Used for output grouping/paths.
    # File 'lib/llm_conductor/eval/spec.rb', line 25
    def input_id(_input)
      raise NotImplementedError
    end
#input_label(input) ⇒ Object
Human label for an input (was record.name). Defaults to the id.
    # File 'lib/llm_conductor/eval/spec.rb', line 30
    def input_label(input)
      input_id(input).to_s
    end
#judge_dimensions ⇒ Object
A list of { key:, description: } entries: the dimensions the judge scores, 0-100 each.
    # File 'lib/llm_conductor/eval/spec.rb', line 68
    def judge_dimensions
      raise NotImplementedError
    end
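One possible shape for the return value, assuming it is an Array of Hashes and that `key:` is a Symbol (the page does not confirm either). `ReplySpec`, the dimension names, and the `valid_judge_scores?` helper are hypothetical; the helper only illustrates the 0-100 range per dimension.

```ruby
class ReplySpec
  def judge_dimensions
    [
      { key: :faithfulness, description: "Claims are supported by the source text." },
      { key: :tone, description: "Reply matches the requested tone." }
    ]
  end
end

# Hypothetical check: every dimension has a numeric score within 0..100.
def valid_judge_scores?(dimensions, scores)
  dimensions.all? do |dim|
    value = scores[dim[:key]]
    value.is_a?(Numeric) && value.between?(0, 100)
  end
end
```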
#judge_rubric_excerpt ⇒ Object
Text inlined into the judge prompt describing the rubric the candidate was asked to follow.
    # File 'lib/llm_conductor/eval/spec.rb', line 63
    def judge_rubric_excerpt
      raise NotImplementedError
    end
#output_summary(_parsed) ⇒ Object
{ score: Numeric|nil, bucket: String|nil } — powers CSV columns and the bucket-disagreement detection. bucket may be any discrete label.
    # File 'lib/llm_conductor/eval/spec.rb', line 57
    def output_summary(_parsed)
      raise NotImplementedError
    end
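A sketch of an implementation, assuming the parsed output carries hypothetical `"score"` and `"priority"` fields. The `buckets_disagree?` helper is one plausible reading of "bucket-disagreement detection" (two runs of the same input landing in different buckets); the engine's actual logic is not documented here.

```ruby
class PriorityTriageSpec
  # Always returns both keys; either value may be nil.
  def output_summary(parsed)
    return { score: nil, bucket: nil } unless parsed.is_a?(Hash)

    score = parsed["score"]
    {
      score: score.is_a?(Numeric) ? score : nil,
      bucket: parsed["priority"]&.to_s
    }
  end
end

# Hypothetical: summaries for the same input disagree when more than one
# distinct non-nil bucket appears among them.
def buckets_disagree?(summaries)
  summaries.map { |s| s[:bucket] }.compact.uniq.size > 1
end
```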
#parse(raw) ⇒ Object
Parse the LLM’s raw text into a Hash, or nil on failure. Defaults to the gem’s conservative JsonParser; override for tuned/feature-specific parsing.
    # File 'lib/llm_conductor/eval/spec.rb', line 43
    def parse(raw)
      JsonParser.parse(raw)
    end
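A feature-specific override might look like the sketch below, which tolerates a Markdown code fence around the JSON and uses only Ruby's standard `json` library. `TolerantSpec` is a hypothetical name, and this is not a description of what the gem's JsonParser does.

```ruby
require "json"

class TolerantSpec
  FENCE = "`" * 3 # a Markdown code-fence marker

  # Returns a Hash, or nil when the text is not a JSON object.
  def parse(raw)
    text = raw.to_s.strip
    # Drop an opening fence (optionally tagged "json") and a closing fence.
    text = text.sub(/\A#{FENCE}(?:json)?\s*/, "").sub(/\s*#{FENCE}\z/, "")
    parsed = JSON.parse(text)
    parsed.is_a?(Hash) ? parsed : nil
  rescue JSON::ParserError
    nil
  end
end
```

Returning nil for valid JSON that is not an object (an Array, say) keeps the documented contract of "a Hash, or nil on failure".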
#prompt_type ⇒ Object
Symbol passed to LlmConductor.generate as type: (must match a registered prompt). Return nil if you build a full prompt String in #build_data instead; the engine then passes that String as prompt:.
    # File 'lib/llm_conductor/eval/spec.rb', line 20
    def prompt_type
      raise NotImplementedError
    end
#vendor_params(vendor:, input_id:) ⇒ Object
Vendor-specific generation params (e.g. a deterministic Ollama seed). Return {} for vendors that don’t expose one.
    # File 'lib/llm_conductor/eval/spec.rb', line 50
    def vendor_params(vendor:, input_id:)
      {}
    end
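A sketch of the deterministic-seed idea, deriving a stable seed from the input id with `Zlib.crc32` from the standard library. The `seed:` key, the way the vendor is identified (compared here as the string "ollama"), and the `SeededSpec` name are assumptions; check what the vendor client actually accepts.

```ruby
require "zlib"

class SeededSpec
  def vendor_params(vendor:, input_id:)
    # Assumption: the vendor is identified as :ollama or "ollama".
    return {} unless vendor.to_s == "ollama"

    # CRC32 of the id is stable across processes and runs.
    { seed: Zlib.crc32(input_id.to_s) }
  end
end
```

`Zlib.crc32` is used rather than `String#hash` because Ruby randomizes `String#hash` per process, which would give a different seed on every run.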