Class: Braintrust::Eval::Evaluator

Inherits:
Object
Defined in:
lib/braintrust/eval/evaluator.rb

Overview

Base class for evaluators. Subclass and override #task and #scorers, or instantiate directly with keyword arguments.

Evaluators are used with the dev server, which reports scorer names to the Braintrust UI. Always use named scorers (via Scorer.new or a subclass) so that they display meaningfully.

Examples:

Subclass pattern

class FoodClassifier < Braintrust::Eval::Evaluator
  def task
    # classify is assumed to be a method defined elsewhere on this class
    ->(input:) { classify(input) }
  end

  def scorers
    [Braintrust::Scorer.new("exact_match") { |expected:, output:| output == expected ? 1.0 : 0.0 }]
  end
end

Inline pattern

Braintrust::Eval::Evaluator.new(
  task: ->(input:) { input.upcase },
  scorers: [
    Braintrust::Scorer.new("exact_match") { |expected:, output:| output == expected ? 1.0 : 0.0 }
  ]
)

Remote eval with parameters (for Playground UI)

Braintrust::Eval::Evaluator.new(
  task: ->(input:, parameters:) {
    model = parameters["model"] || "gpt-4"
    # Use model to generate response...
  },
  scorers: [Braintrust::Scorer.new("exact") { |expected:, output:| output == expected ? 1.0 : 0.0 }],
  parameters: {
    "model" => {type: "string", default: "gpt-4", description: "Model to use"}
  }
)

Constructor Details

#initialize(task: nil, scorers: [], parameters: {}) ⇒ Evaluator

Returns a new instance of Evaluator.



# File 'lib/braintrust/eval/evaluator.rb', line 45

def initialize(task: nil, scorers: [], parameters: {})
  @task = task
  @scorers = scorers
  @parameters = parameters
end

Instance Attribute Details

#parameters ⇒ Object

Returns the value of attribute parameters.



# File 'lib/braintrust/eval/evaluator.rb', line 43

def parameters
  @parameters
end

#scorers ⇒ Object

Returns the value of attribute scorers.



# File 'lib/braintrust/eval/evaluator.rb', line 43

def scorers
  @scorers
end

#task ⇒ Object

Returns the value of attribute task.



# File 'lib/braintrust/eval/evaluator.rb', line 43

def task
  @task
end
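
The constructor and readers above support both styles named in the Overview: passing a task as a keyword argument, or overriding #task in a subclass. The sketch below illustrates this with a stand-in class that copies the constructor and readers from the listings on this page, so it runs without the gem. SketchEvaluator and Shouter are hypothetical names; with the braintrust gem loaded, Braintrust::Eval::Evaluator would take the stand-in's place.

```ruby
# Stand-in copying the constructor and attribute readers shown on this page.
# It is an assumption for illustration, not the library class itself.
class SketchEvaluator
  attr_reader :task, :scorers, :parameters

  def initialize(task: nil, scorers: [], parameters: {})
    @task = task
    @scorers = scorers
    @parameters = parameters
  end
end

# Subclass pattern: override the reader instead of passing task: to new.
class Shouter < SketchEvaluator
  def task
    ->(input:) { input.upcase }
  end
end

# Inline pattern: pass the callable as a keyword argument.
INLINE_EVALUATOR = SketchEvaluator.new(task: ->(input:) { input.upcase })
SUBCLASS_EVALUATOR = Shouter.new

INLINE_OUTPUT = INLINE_EVALUATOR.task.call(input: "pizza")
SUBCLASS_OUTPUT = SUBCLASS_EVALUATOR.task.call(input: "pizza")
```

Both evaluators expose the same callable through #task. Keywords left out of the constructor call fall back to the defaults in the signature: an empty scorer list and an empty parameters hash.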

Instance Method Details

#run(cases, on_progress: nil, quiet: false, project: nil, experiment: nil, project_id: nil, dataset: nil, scorers: nil, parent: nil, state: nil, update: false, tracer_provider: nil, parameters: nil) ⇒ Result

Run this evaluator against the given cases. Delegates to Braintrust::Eval.run with the evaluator’s task and scorers.

Parameters:

  • cases (Array)

    The test cases

  • on_progress (#call, nil) (defaults to: nil)

    Optional callback fired after each test case

  • quiet (Boolean) (defaults to: false)

    If true, suppress result output

  • project (String, nil) (defaults to: nil)

    Project name

  • experiment (String, nil) (defaults to: nil)

    Experiment name

  • project_id (String, nil) (defaults to: nil)

    Project UUID (skips project creation)

  • dataset (String, Hash, Dataset, Dataset::ID, nil) (defaults to: nil)

    Dataset to fetch

  • scorers (Array, nil) (defaults to: nil)

    Additional scorers (merged with evaluator’s own)

  • parent (Hash, nil) (defaults to: nil)

    Parent span context

  • state (State, nil) (defaults to: nil)

    Braintrust state

  • update (Boolean) (defaults to: false)

    If true, allow reusing an existing experiment

  • tracer_provider (TracerProvider, nil) (defaults to: nil)

    OpenTelemetry tracer provider (defaults to global)

  • parameters (defaults to: nil)

    Forwarded unchanged to Braintrust::Eval.run; the source listing for this method does not fall back to the evaluator's own #parameters

Returns:

  • (Result)

    The value returned by Braintrust::Eval.run



# File 'lib/braintrust/eval/evaluator.rb', line 76

def run(cases, on_progress: nil, quiet: false,
  project: nil, experiment: nil, project_id: nil,
  dataset: nil, scorers: nil, parent: nil,
  state: nil, update: false, tracer_provider: nil,
  parameters: nil)
  all_scorers = scorers ? self.scorers + scorers : self.scorers
  Braintrust::Eval.run(
    task: task, scorers: all_scorers, cases: cases, dataset: dataset,
    project: project, experiment: experiment, project_id: project_id,
    parent: parent, on_progress: on_progress, quiet: quiet,
    state: state, update: update, tracer_provider: tracer_provider,
    parameters: parameters
  )
end
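
The first line of #run decides which scorers are used. A small sketch of that merge rule in isolation may help: the evaluator's own scorers come first, extra scorers are appended, and passing nil leaves the list unchanged. NamedScorer and merge_scorers are hypothetical stand-ins introduced here so the example runs without the gem; the real scorer class is Braintrust::Scorer.

```ruby
# Hypothetical stand-in for a named scorer (not the library class).
NamedScorer = Struct.new(:name)

# Mirrors `scorers ? self.scorers + scorers : self.scorers` from #run.
# Array#+ returns a new array, so neither input list is mutated.
def merge_scorers(own, extra)
  extra ? own + extra : own
end

OWN_SCORERS = [NamedScorer.new("exact_match")].freeze
EXTRA_SCORERS = [NamedScorer.new("length_check")].freeze

MERGED_NAMES = merge_scorers(OWN_SCORERS, EXTRA_SCORERS).map(&:name)
DEFAULT_NAMES = merge_scorers(OWN_SCORERS, nil).map(&:name)
```

Because the merge is plain concatenation, a scorer passed to #run that is also in the evaluator's own list would appear twice; nothing in the listing above deduplicates.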

#validate! ⇒ Object

Validate that the evaluator has required fields set.

Raises:

  • (ArgumentError)

    if validation fails



# File 'lib/braintrust/eval/evaluator.rb', line 53

def validate!
  raise ArgumentError, "task is required" unless task
  unless task.respond_to?(:call)
    raise ArgumentError, "task must be callable (respond to :call)"
  end
end
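
To make the two failure modes concrete, here is a minimal sketch that exercises the checks above. It uses a stand-in class that copies the constructor, readers, and #validate! from the listings on this page, so it runs without the gem. ValidatingSketch and validation_error are hypothetical names; with the braintrust gem loaded, Braintrust::Eval::Evaluator would be used instead.

```ruby
# Stand-in copying the documented constructor and #validate! source.
# It is an assumption for illustration, not the library class itself.
class ValidatingSketch
  attr_reader :task, :scorers, :parameters

  def initialize(task: nil, scorers: [], parameters: {})
    @task = task
    @scorers = scorers
    @parameters = parameters
  end

  def validate!
    raise ArgumentError, "task is required" unless task
    unless task.respond_to?(:call)
      raise ArgumentError, "task must be callable (respond to :call)"
    end
  end
end

# Hypothetical helper: returns the error message, or nil when valid.
def validation_error(evaluator)
  evaluator.validate!
  nil
rescue ArgumentError => e
  e.message
end

MISSING_TASK_ERROR = validation_error(ValidatingSketch.new)
NOT_CALLABLE_ERROR = validation_error(ValidatingSketch.new(task: "not a lambda"))
VALID_TASK_ERROR = validation_error(ValidatingSketch.new(task: ->(input:) { input.upcase }))
```

As the listing shows, only the task is checked. An evaluator with an empty scorer list or an empty parameters hash still passes validation.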