Class: Braintrust::Eval::Runner

Inherits:
Object
Defined in:
lib/braintrust/eval/runner.rb

Overview

Internal runner class that executes the Eval and returns the result. It receives a fully normalized Context in which all callables are already typed wrappers.

Defined Under Namespace

Classes: CaseContext

Constant Summary

MAX_PARALLELISM = Internal::ThreadPool::MAX_PARALLELISM

Maximum parallelism allowed (mirrors Internal::ThreadPool::MAX_PARALLELISM)

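Instance Method Summary

  • #initialize(eval_context) ⇒ Runner (constructor): Returns a new instance of Runner.

  • #run(parallelism: 1) ⇒ Result: Runs the evaluation and returns a Result.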

Constructor Details

#initialize(eval_context) ⇒ Runner

Returns a new instance of Runner.

Parameters:

  • eval_context (Context)

    Normalized eval context



# File 'lib/braintrust/eval/runner.rb', line 26

def initialize(eval_context)
  @eval_context = eval_context
  @tracer = eval_context.tracer_provider.tracer("braintrust-eval")

  # Mutex for thread-safe score collection
  @score_mutex = Mutex.new
end
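
The constructor above creates @score_mutex for thread-safe score collection, but this page does not show the code that uses it. The following is a minimal sketch of how a mutex can guard appends to a shared scores hash; ScoreCollector and record are hypothetical names used only for illustration and are not part of the Braintrust API.

```ruby
# Hypothetical stand-in for the runner's score bookkeeping (assumption, not library code).
class ScoreCollector
  attr_reader :scores

  def initialize
    @scores = {}              # { scorer_name => Array<Numeric> }
    @score_mutex = Mutex.new  # Guards every write to @scores
  end

  # Append one score while holding the lock, so concurrent workers cannot
  # interleave the "create the array if missing, then append" steps.
  def record(scorer_name, value)
    @score_mutex.synchronize do
      (@scores[scorer_name] ||= []) << value
    end
  end
end

collector = ScoreCollector.new

# Four workers each record 100 scores under the same scorer name.
threads = 4.times.map do |i|
  Thread.new do
    100.times { collector.record("accuracy", i) }
  end
end
threads.each(&:join)

puts collector.scores["accuracy"].size
```

Holding the lock around both the lookup and the append keeps the hash consistent regardless of how the Ruby implementation schedules threads.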

Instance Method Details

#run(parallelism: 1) ⇒ Result

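Runs every eval case and returns a Result. When parallelism is greater than 1, cases are dispatched through Internal::ThreadPool; otherwise they run sequentially.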

Parameters:

  • parallelism (Integer) (defaults to: 1)

    Number of parallel workers. nil or any value of 1 or less runs the cases sequentially.

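Returns:

  • (Result)

    Result carrying the experiment and project identifiers, permalink, collected errors, duration, and scores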



# File 'lib/braintrust/eval/runner.rb', line 37

def run(parallelism: 1)
  start_time = Time.now
  eval_cases = eval_context.cases
  errors = Queue.new
  @scores = {} # Reset for each run: { scorer_name => Array<Numeric> }

  if parallelism && parallelism > 1
    Internal::ThreadPool.each(eval_cases, parallelism: parallelism) do |eval_case|
      run_eval_case(build_case_context(eval_case), errors)
    end
  else
    eval_cases.each do |eval_case|
      run_eval_case(build_case_context(eval_case), errors)
    end
  end

  # Convert Queue to Array after all threads complete
  error_array = [].tap { |a| a << errors.pop until errors.empty? }

  # Calculate duration
  duration = Time.now - start_time

  # Generate permalink (only when state and experiment are available)
  permalink = if eval_context.state && eval_context.experiment_id
    eval_context.state.object_permalink(object_type: "experiment", object_id: eval_context.experiment_id)
  end

  Result.new(
    experiment_id: eval_context.experiment_id,
    experiment_name: eval_context.experiment_name,
    project_id: eval_context.project_id,
    project_name: eval_context.project_name,
    permalink: permalink,
    errors: error_array,
    duration: duration,
    scores: @scores
  )
end
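
The run method above collects errors in a Queue and converts it to an Array once all workers are done. The following is a minimal, self-contained sketch of that pattern using plain threads in place of Internal::ThreadPool; the worker logic and error messages are hypothetical and chosen only for illustration.

```ruby
errors = Queue.new   # Thread-safe: workers can push without extra locking
items = (1..6).to_a

# Hypothetical workers: each handles a slice of items and reports odd ones as failures.
workers = items.each_slice(2).map do |slice|
  Thread.new do
    slice.each do |item|
      errors << "case #{item} failed" if item.odd?
    end
  end
end
workers.each(&:join)

# Draining with empty?/pop is only safe once every producer has finished;
# otherwise pop could block or the loop could stop early.
error_array = [].tap { |a| a << errors.pop until errors.empty? }

# Arrival order depends on thread scheduling, so sort before printing.
puts error_array.sort.inspect
```

A Queue is a reasonable choice here because pushes from multiple threads need no additional mutex, while the final Array is easier for callers to inspect.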