Class: LlmConductor::Eval::ModelRunner

Inherits:
Object
Defined in:
lib/llm_conductor/eval/model_runner.rb

Overview

Runs one (input, model) pair through LlmConductor.generate, capturing latency, tokens, cost, and parse status, and writes the raw and parsed outputs through the Store. Apart from those Store writes it has no side effects: it never modifies the caller’s data.

All feature-specific behavior (prompt type, payload, parsing, score/bucket extraction) is delegated to the Spec.
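
Because the feature-specific behavior is delegated, a Spec only has to answer a handful of messages. The following is a rough sketch, assuming nothing beyond the three Spec calls visible in #run on this page (input_id, build_data, parse). The class name SentimentSpec, the shape of the input hash, and the choice of JSON as the output format are hypothetical, and the private generate_args helper (not shown here) may call further Spec methods for the prompt type and payload.

```ruby
require 'json'

# Hypothetical spec for illustration only; it is NOT part of llm_conductor.
class SentimentSpec
  # Stable identifier for one input; #run uses it to key stored outputs.
  def input_id(input)
    input.fetch(:id).to_s
  end

  # Data for the prompt, used when the caller did not pass data: explicitly.
  def build_data(input)
    { text: input.fetch(:text) }
  end

  # Returns the parsed Hash, or nil when the output is not a JSON object.
  # #run reports a nil result as the 'parse_error' status.
  def parse(output)
    parsed = JSON.parse(output.to_s, symbolize_names: true)
    parsed.is_a?(Hash) ? parsed : nil
  rescue JSON::ParserError
    nil
  end
end

spec = SentimentSpec.new
spec.input_id({ id: 42, text: 'Great service' }) # the String "42"
spec.parse('{"score": 0.9}')                     # a Hash with a :score key
spec.parse('not json')                           # nil
```

Returning nil from parse, rather than raising, keeps a malformed model output in the 'parse_error' bucket instead of the generic 'exception' one.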

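Class Method Summary

  • .slug(model) ⇒ Object
    Filesystem-safe slug for a model name (e.g. “gemini-2.5-flash”).

Instance Method Summary

  • #initialize(input, model:, vendor:, spec:, store:, run_id:, logger:, data: nil) ⇒ ModelRunner
    Returns a new instance of ModelRunner.
  • #run ⇒ Object
  • #slug ⇒ Object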

Constructor Details

#initialize(input, model:, vendor:, spec:, store:, run_id:, logger:, data: nil) ⇒ ModelRunner

Returns a new instance of ModelRunner.



# File 'lib/llm_conductor/eval/model_runner.rb', line 19

def initialize(input, model:, vendor:, spec:, store:, run_id:, logger:, data: nil)
  @input = input
  @model = model
  @vendor = vendor.to_sym
  @spec = spec
  @store = store
  @run_id = run_id
  @logger = logger
  @data = data
end

Class Method Details

.slug(model) ⇒ Object

Filesystem-safe slug for a model name (e.g. “gemini-2.5-flash”).



# File 'lib/llm_conductor/eval/model_runner.rb', line 15

def self.slug(model)
  model.to_s.gsub(/[^A-Za-z0-9_.-]+/, '_')
end
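
To make the substitution concrete, here is a standalone copy of the same gsub call, so it runs without the gem. Every run of characters outside A-Z, a-z, 0-9, underscore, dot, and hyphen collapses to a single underscore. The model names other than “gemini-2.5-flash” are made-up inputs chosen to exercise the pattern.

```ruby
# Standalone copy of the expression used by .slug.
def slug(model)
  model.to_s.gsub(/[^A-Za-z0-9_.-]+/, '_')
end

slug('gemini-2.5-flash') # => "gemini-2.5-flash" (already safe, unchanged)
slug('meta/llama 3:70b') # => "meta_llama_3_70b"
slug('a//b')             # => "a_b" (a run of unsafe characters becomes one underscore)
slug(:gpt_4o)            # => "gpt_4o" (non-strings are converted with to_s)
```

Dots pass through untouched, so a name such as “../x” becomes “.._x”; the slug is meant to be used as one path segment, not as a general path sanitizer.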

Instance Method Details

#run ⇒ Object



# File 'lib/llm_conductor/eval/model_runner.rb', line 30

def run
  input_id = @spec.input_id(@input)
  # Prefer data supplied by the caller; otherwise let the spec build it.
  data = @data || @spec.build_data(@input)

  # Measure wall-clock latency around the generate call.
  started_at = Time.now.utc
  response = LlmConductor.generate(**generate_args(data, input_id))
  latency_ms = ((Time.now.utc - started_at) * 1000).round

  # Persist the raw output first so it is kept even if the call or the parse failed.
  raw_ref = @store.write_raw(@run_id, input_id, slug, response&.output.to_s)

  if response.nil? || !response.success?
    # Assumption: the error message is carried in the response metadata.
    error = response&.metadata&.dig(:error) || 'LLM returned no response'
    return build_result(input_id:, status: 'llm_error', latency_ms:, response:, raw_ref:, error:)
  end

  parsed = @spec.parse(response.output)
  if parsed.nil?
    return build_result(input_id:, status: 'parse_error', latency_ms:, response:, raw_ref:,
                        error: 'LLM output not valid structured data')
  end

  parsed_ref = @store.write_parsed(@run_id, input_id, slug, parsed)
  build_result(input_id:, status: 'ok', latency_ms:, response:, raw_ref:, parsed_ref:, parsed:)
rescue StandardError => e
  # Any error is logged and reported as an 'exception' result instead of propagating.
  @logger.error("[Eval::ModelRunner] #{@model}@#{@spec.input_id(@input)}: #{e.class}: #{e.message}")
  Result.new(input_id: @spec.input_id(@input), input_label:, model: @model,
             vendor: @vendor, status: 'exception', error: "#{e.class}: #{e.message}")
end
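
The method above ends in one of four statuses: 'ok', 'llm_error', 'parse_error', or 'exception'. It calls write_raw before checking the response, so a raw record is written even for failed calls (as an empty string when the response is nil), while write_parsed is reached only on the 'ok' path. For tests or dry runs, a Store stand-in could look roughly like the sketch below. It assumes only the two call signatures visible in #run; the class name MemoryStore and the key layout are hypothetical, and the real Store may behave differently.

```ruby
# Hypothetical in-memory stand-in for the Store; NOT part of llm_conductor.
class MemoryStore
  attr_reader :raw, :parsed

  def initialize
    @raw = {}
    @parsed = {}
  end

  # Stores the raw LLM text and returns a reference to it (here, the key).
  def write_raw(run_id, input_id, slug, text)
    key = File.join(run_id.to_s, 'raw', "#{input_id}__#{slug}.txt")
    @raw[key] = text
    key
  end

  # Stores the parsed structure and returns a reference to it.
  def write_parsed(run_id, input_id, slug, parsed)
    key = File.join(run_id.to_s, 'parsed', "#{input_id}__#{slug}.json")
    @parsed[key] = parsed
    key
  end
end

store = MemoryStore.new
ref = store.write_raw('run-1', '42', 'gemini-2.5-flash', 'hello')
store.raw[ref] # => "hello"
```

Returning the key from each write mirrors how #run passes raw_ref and parsed_ref on to the result, so the result can point back at what was stored.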

#slug ⇒ Object



# File 'lib/llm_conductor/eval/model_runner.rb', line 59

def slug
  self.class.slug(@model)
end