Module: Qualspec
- Defined in:
- lib/qualspec.rb,
lib/qualspec/judge.rb,
lib/qualspec/rspec.rb,
lib/qualspec/client.rb,
lib/qualspec/rubric.rb,
lib/qualspec/version.rb,
lib/qualspec/recorder.rb,
lib/qualspec/suite/dsl.rb,
lib/qualspec/evaluation.rb,
lib/qualspec/suite/runner.rb,
lib/qualspec/configuration.rb,
lib/qualspec/rspec/helpers.rb,
lib/qualspec/model_registry.rb,
lib/qualspec/prompt_variant.rb,
lib/qualspec/rspec/matchers.rb,
lib/qualspec/suite/behavior.rb,
lib/qualspec/suite/reporter.rb,
lib/qualspec/suite/scenario.rb,
lib/qualspec/builtin_rubrics.rb,
lib/qualspec/suite/candidate.rb,
lib/qualspec/rspec/configuration.rb,
lib/qualspec/suite/html_reporter.rb,
lib/qualspec/rspec/evaluation_result.rb,
lib/qualspec/suite/builtin_behaviors.rb,
sig/qualspec.rbs
Defined Under Namespace
Modules: BuiltinRubrics, RSpec, Suite
Classes: Client, Configuration, Error, Evaluation, Judge, ModelRegistry, PromptVariant, Recorder, Rubric
Constant Summary
- VERSION = '0.2.0'
Class Method Summary
- .client ⇒ Object
- .configuration ⇒ Object
- .configure {|configuration| ... } ⇒ Object
- .define_behavior(name, &block) ⇒ Object
Top-level convenience method for defining a behavior; delegates to Suite::Behavior.define.
- .define_rubric(name, &block) ⇒ Object
Convenience method for defining rubrics; delegates to Rubric.define.
- .evaluation(name, &block) ⇒ Object
Top-level convenience method for defining an evaluation suite; delegates to Suite.define.
- .judge ⇒ Object
- .model(name = nil) ⇒ Object
Resolve a named model to its slug, falling back to the default (openrouter/auto).
- .models ⇒ Object
Registry of named models loaded from config/models.yml (or QUALSPEC_MODELS_FILE).
- .reset! ⇒ Object
- .run(suite_name, progress: true, output: :stdout, json_path: nil, html_path: nil, show_responses: false, load_builtins: true) ⇒ Object
Run an evaluation suite.
Class Method Details
.client ⇒ Object
# File 'lib/qualspec.rb', line 47
def client
  @client ||= Client.new(configuration)
end
.configuration ⇒ Object
# File 'lib/qualspec.rb', line 29
def configuration
  @configuration ||= Configuration.new
end
.configure {|configuration| ... } ⇒ Object
# File 'lib/qualspec.rb', line 33
def configure
  yield(configuration)
end
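The three methods above follow a common memoize-and-yield pattern: .configuration lazily builds one Configuration, .configure yields that same object to a block, and .client is built from it on first use. The sketch below reproduces the pattern in a self-contained stand-in module. DemoSpec and its timeout setting are hypothetical names chosen for illustration; this page does not list the attributes that Qualspec's Configuration actually exposes.

```ruby
# Stand-in module illustrating the memoize-and-yield pattern.
# DemoSpec and the timeout attribute are hypothetical, not Qualspec's API.
module DemoSpec
  class Configuration
    attr_accessor :default_model, :timeout

    def initialize
      @default_model = 'openrouter/auto' # default slug named on this page
      @timeout = 30                      # hypothetical setting
    end
  end

  class << self
    # Built once, then reused on every later call.
    def configuration
      @configuration ||= Configuration.new
    end

    # Yields the shared configuration so callers can mutate it in a block.
    def configure
      yield(configuration)
    end
  end
end

DemoSpec.configure do |config|
  config.timeout = 60
end

puts DemoSpec.configuration.timeout
```

Because the configuration is memoized, settings applied in one configure block remain visible to every later caller until the state is reset.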
.define_behavior(name, &block) ⇒ Object
Top-level convenience method for defining a behavior; delegates to Suite::Behavior.define.
# File 'lib/qualspec.rb', line 76
def define_behavior(name, &block)
  Suite::Behavior.define(name, &block)
end
.define_rubric(name, &block) ⇒ Object
Convenience method for defining rubrics; delegates to Rubric.define.
# File 'lib/qualspec.rb', line 71
def define_rubric(name, &block)
  Rubric.define(name, &block)
end
.evaluation(name, &block) ⇒ Object
Top-level convenience method for defining an evaluation suite; delegates to Suite.define.
# File 'lib/qualspec.rb', line 81
def evaluation(name, &block)
  Suite.define(name, &block)
end
.model(name = nil) ⇒ Object
# File 'lib/qualspec.rb', line 66
def model(name = nil)
  models.resolve(name)
end
.models ⇒ Object
Registry of named models loaded from config/models.yml (or QUALSPEC_MODELS_FILE). See ModelRegistry.
# File 'lib/qualspec.rb', line 57
def models
  @models ||= ModelRegistry.new
end
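To make the relationship between .model and .models concrete, here is a minimal sketch of a name-to-slug registry with a default fallback. DemoRegistry is a hypothetical stand-in, not the real ModelRegistry. It assumes the YAML file is a flat mapping of name to slug and that both nil and unknown names fall back to openrouter/auto; neither assumption is confirmed by this page. The example slugs are made up.

```ruby
require 'yaml'

# Hypothetical stand-in for ModelRegistry; the real class may differ.
class DemoRegistry
  DEFAULT_SLUG = 'openrouter/auto' # default named in the .model summary

  # Assumes the YAML text is a flat mapping of name => slug.
  def initialize(yaml_text)
    @slugs = YAML.safe_load(yaml_text) || {}
  end

  # Assumption: both nil and unknown names fall back to the default.
  def resolve(name = nil)
    @slugs.fetch(name.to_s, DEFAULT_SLUG)
  end
end

registry = DemoRegistry.new(<<~YAML)
  fast: example/fast-model
  careful: example/careful-model
YAML

puts registry.resolve(:fast)
puts registry.resolve(nil)
```

Keeping the fallback inside resolve means callers such as .model never need to check whether a name is registered.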
.reset! ⇒ Object
# File 'lib/qualspec.rb', line 37
def reset!
  @configuration = nil
  @client = nil
  @judge = nil
  @models = nil
  Rubric.clear!
  Suite.clear!
  Suite::Behavior.clear!
end
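The reason .reset! both clears the memoized instance variables and empties the registries is that such state otherwise persists for the life of the process. The sketch below shows that effect with a self-contained stand-in. DemoSuite and its registry are hypothetical and do not reflect how Rubric, Suite, or Suite::Behavior store their definitions.

```ruby
# Stand-in showing that memoized state survives until explicitly cleared.
# DemoSuite is hypothetical, not part of Qualspec.
module DemoSuite
  class << self
    # Lazily created registry of name => block.
    def registry
      @registry ||= {}
    end

    def define(name, &block)
      registry[name] = block
    end

    # Dropping the reference forces a fresh, empty registry on next access.
    def reset!
      @registry = nil
    end
  end
end

DemoSuite.define(:tone) { :ok }
old_registry = DemoSuite.registry

DemoSuite.reset!
new_registry = DemoSuite.registry

puts old_registry.keys.inspect
puts new_registry.keys.inspect
```

One plausible use, assuming an RSpec test suite, is to call Qualspec.reset! in a before hook so definitions from one example do not leak into the next.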
.run(suite_name, progress: true, output: :stdout, json_path: nil, html_path: nil, show_responses: false, load_builtins: true) ⇒ Object
Runs the named evaluation suite, prints a report according to output (:stdout, :json, or :silent), optionally writes JSON and HTML reports, and returns the results.
# File 'lib/qualspec.rb', line 86
def run(suite_name, progress: true, output: :stdout, json_path: nil, html_path: nil, show_responses: false, load_builtins: true)
  # Load builtins (idempotent, can be called multiple times)
  if load_builtins
    BuiltinRubrics.load!
    Suite::BuiltinBehaviors.load!
  end

  suite = Suite.find(suite_name)
  runner = Suite::Runner.new(suite)
  results = runner.run(progress: progress)
  results.finish!

  reporter = Suite::Reporter.new(results, show_responses: show_responses)

  case output
  when :stdout
    puts reporter.to_stdout
  when :json
    puts reporter.to_json
  when :silent
    # nothing
  end

  reporter.write_json(json_path) if json_path

  if html_path
    html_reporter = Suite::HtmlReporter.new(results)
    html_reporter.write(html_path)
  end

  results
end
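The output: dispatch in .run is a case statement with no else branch, so an unrecognized value prints nothing, just like :silent. The sketch below isolates that dispatch to show the behavior. DemoReporter and emit are hypothetical stand-ins; the real Suite::Reporter output format is not shown on this page, and the io: parameter exists here only to make the example testable.

```ruby
require 'json'
require 'stringio'

# Hypothetical stand-in for Suite::Reporter with the two methods the
# dispatch calls. The real report format is not documented here.
class DemoReporter
  def initialize(results)
    @results = results
  end

  def to_stdout
    @results.map { |name, score| "#{name}: #{score}" }.join("\n")
  end

  def to_json(*_args)
    JSON.generate(@results)
  end
end

# Mirrors the `case output` branch of .run; io: is added for testability.
def emit(reporter, output, io: $stdout)
  case output
  when :stdout then io.puts reporter.to_stdout
  when :json   then io.puts reporter.to_json
  when :silent then nil
  end
end

reporter = DemoReporter.new({ 'tone' => 8 })

json_io = StringIO.new
emit(reporter, :json, io: json_io)

unknown_io = StringIO.new
emit(reporter, :yaml, io: unknown_io) # no matching branch, nothing printed

puts json_io.string
```

Since file output is controlled separately by json_path and html_path, passing output: :silent together with a path is a reasonable way to write reports without console noise.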