Module: Legion::LLM::ConfidenceScorer

Extended by: Legion::Logging::Helper

Defined in: lib/legion/llm/confidence_scorer.rb

Overview

Computes a ConfidenceScore for an LLM response using available signals.

Strategy selection (in priority order):

1. logprobs  — native model confidence from token log-probabilities (when available)
2. caller    — caller-provided score passed via options[:confidence_score]
3. heuristic — derived from response content characteristics
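The priority order above can be sketched as a small selector. This is an illustration only: `pick_strategy` is a hypothetical helper, and passing log-probabilities as a plain Array is an assumption, since the source does not say how they are read from the response.

```ruby
# Hypothetical helper (not part of the library): returns the name of the
# strategy that would win under the documented priority order.
# Assumption: logprobs arrive as an Array of Floats, or nil when unavailable.
def pick_strategy(logprobs, **options)
  # 1. Native token log-probabilities take precedence when present.
  return :logprobs if logprobs.is_a?(Array) && !logprobs.empty?

  # 2. Otherwise a caller-provided score is used.
  return :caller if options.key?(:confidence_score)

  # 3. Fall back to content-based heuristics.
  :heuristic
end

pick_strategy([-0.05, -0.2], confidence_score: 0.9) # logprobs outrank the caller score
pick_strategy(nil, confidence_score: 0.9)           # caller score is used
pick_strategy(nil)                                  # heuristic fallback
```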

Band boundaries are read from Legion::Settings[:confidence] when Legion::Settings is available, otherwise the DEFAULT_BANDS constants are used. Per-call overrides can be passed as options.
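One way to picture the lookup order described above is as a chain of hash merges. The sketch below is a guess at the shape, not the library's code: `resolve_bands` is hypothetical, a plain Hash stands in for Legion::Settings, and it assumes `settings[:confidence]` holds the band hash directly.

```ruby
# Values copied from the documented DEFAULT_BANDS constant.
FALLBACK_BANDS = { low: 0.3, medium: 0.5, high: 0.7, very_high: 0.9 }.freeze

# Hypothetical helper: merges defaults, settings, and per-call overrides.
# settings - a Hash standing in for Legion::Settings, or nil when unavailable.
# options  - the per-call options Hash (only :confidence_bands is read here).
def resolve_bands(settings, options = {})
  from_settings = settings.is_a?(Hash) ? (settings[:confidence] || {}) : {}
  per_call      = options[:confidence_bands] || {}

  # Later merges win, so per-call overrides take precedence over settings,
  # and settings take precedence over the defaults.
  FALLBACK_BANDS.merge(from_settings).merge(per_call)
end

resolve_bands(nil)                                                # defaults only
resolve_bands({ confidence: { high: 0.8 } })                      # settings adjust :high
resolve_bands({ confidence: { high: 0.8 } },
              { confidence_bands: { high: 0.75 } })               # per-call override wins
```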

Constant Summary

DEFAULT_BANDS =

Default band boundaries. Each key names a band, and its value is the lower boundary of that band:

  score <  :low       -> :very_low
  score <  :medium    -> :low
  score <  :high      -> :medium
  score <  :very_high -> :high
  score >= :very_high -> :very_high

{
  low:       0.3,
  medium:    0.5,
  high:      0.7,
  very_high: 0.9
}.freeze
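The boundary rules for these defaults can be checked with a short classifier. `band_for` is a hypothetical name used for illustration; the source does not show how the library maps a score to a band internally.

```ruby
# Values copied from the documented DEFAULT_BANDS constant.
EXAMPLE_BANDS = { low: 0.3, medium: 0.5, high: 0.7, very_high: 0.9 }.freeze

# Hypothetical helper: maps a numeric score to a band name using the
# "lower boundary" rule, so a score equal to a boundary lands in that band.
def band_for(score, bands = EXAMPLE_BANDS)
  return :very_low if score < bands[:low]
  return :low      if score < bands[:medium]
  return :medium   if score < bands[:high]
  return :high     if score < bands[:very_high]

  :very_high
end

band_for(0.2)  # below :low
band_for(0.3)  # exactly on the :low boundary
band_for(0.65) # between :medium and :high
band_for(0.9)  # exactly on the :very_high boundary
```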

HEURISTIC_WEIGHTS =

Penalty weights used in heuristic scoring.

{
  refusal:            -0.8,
  empty:              -1.0,
  truncated:          -0.4,
  repetition:         -0.5,
  json_parse_failure: -0.6,
  too_short:          -0.3
}.freeze
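The source lists the weights but not how they are combined. As a rough sketch, assuming a baseline of 1.0, additive penalties, and clamping to the 0.0..1.0 range (all three are assumptions), the arithmetic might look like this. `apply_penalties` is a hypothetical helper.

```ruby
# Values copied from the documented HEURISTIC_WEIGHTS constant.
EXAMPLE_WEIGHTS = {
  refusal:            -0.8,
  empty:              -1.0,
  truncated:          -0.4,
  repetition:         -0.5,
  json_parse_failure: -0.6,
  too_short:          -0.3
}.freeze

# Hypothetical helper: sums the penalties for every detected flag.
# Assumptions: the baseline is 1.0 and the result is clamped to 0.0..1.0.
def apply_penalties(flags, base: 1.0)
  total = flags.sum { |flag| EXAMPLE_WEIGHTS.fetch(flag) }
  (base + total).clamp(0.0, 1.0)
end

apply_penalties([])                     # no issues detected, baseline is kept
apply_penalties([:truncated])           # a single penalty is subtracted
apply_penalties([:refusal, :truncated]) # combined penalties bottom out at 0.0
```

`fetch` raises a KeyError on an unknown flag, which makes a misspelled flag name fail early in this sketch.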

STRUCTURED_OUTPUT_BONUS =

Bonus applied when structured output parse succeeds.

0.1
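To show how this bonus might pair with the json_parse_failure penalty, here is a minimal sketch using Ruby's standard JSON library. `json_adjustment` is a hypothetical helper, and treating the two values as a simple either/or adjustment is an assumption.

```ruby
require 'json'

# Hypothetical helper: returns the score adjustment for a response body
# when JSON output was expected.
# 0.1 mirrors STRUCTURED_OUTPUT_BONUS; -0.6 mirrors
# HEURISTIC_WEIGHTS[:json_parse_failure].
def json_adjustment(content)
  JSON.parse(content)
  0.1
rescue JSON::ParserError
  -0.6
end

json_adjustment('{"status": "ok"}') # parse succeeds, bonus applies
json_adjustment('status: ok')       # parse fails, penalty applies
```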

HEDGING_PATTERNS =

Hedging language patterns that reduce confidence.

[
  /\b(?:I think|I believe|I'm not sure|I'm uncertain|it seems|it appears|maybe|perhaps|possibly|probably|I guess|I assume)\b/i,
  /\bnot (?:certain|sure|definite|confirmed)\b/i,
  /\bunclear\b/i,
  /\bcould be\b/i
].freeze
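The patterns can be exercised on their own to see what they catch. The sketch below only counts matches. `hedging_hits` is a hypothetical helper, and the source does not say how a match count translates into a score reduction.

```ruby
# Patterns copied from the documented HEDGING_PATTERNS constant.
EXAMPLE_HEDGING = [
  /\b(?:I think|I believe|I'm not sure|I'm uncertain|it seems|it appears|maybe|perhaps|possibly|probably|I guess|I assume)\b/i,
  /\bnot (?:certain|sure|definite|confirmed)\b/i,
  /\bunclear\b/i,
  /\bcould be\b/i
].freeze

# Hypothetical helper: counts every match of every pattern in the text.
# A phrase that satisfies two patterns is counted once per pattern.
def hedging_hits(text)
  EXAMPLE_HEDGING.sum { |pattern| text.scan(pattern).length }
end

hedging_hits('The capital of France is Paris.')
hedging_hits('I think it could be right, but the cause is unclear.')
```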

Class Method Summary

.score(raw_response, **options) ⇒ Object

Compute a ConfidenceScore for the given raw_response.

Class Method Details

.score(raw_response, **options) ⇒ Object

Compute a ConfidenceScore for the given raw_response.

raw_response - the RubyLLM response object (must respond to #content)
options      - Hash:

:confidence_score  - Float  caller-provided score (bypasses heuristics)
:confidence_bands  - Hash   per-call band overrides
:json_expected     - Boolean whether JSON output was expected
:quality_result    - QualityResult from QualityChecker (optional, avoids re-running checks)

Returns a ConfidenceScore.
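A call might look like the following sketch. `FakeResponse` is a hypothetical stand-in for a RubyLLM response (anything that responds to #content), and the call is guarded so the snippet also runs where the library is not loaded. The attributes of the returned ConfidenceScore are not documented here, so the sketch does not read any.

```ruby
# Hypothetical stand-in for a RubyLLM response object.
FakeResponse = Struct.new(:content)

# Options taken from the documented list; the values are illustrative.
EXAMPLE_OPTIONS = {
  json_expected:    true,
  confidence_bands: { high: 0.75 } # per-call override of one boundary
}.freeze

# Returns the ConfidenceScore, or nil when the library is not loaded.
def example_score_call(response)
  return nil unless defined?(Legion::LLM::ConfidenceScorer)

  Legion::LLM::ConfidenceScorer.score(response, **EXAMPLE_OPTIONS)
end

example_score_call(FakeResponse.new('{"status": "ok"}'))
```

Passing :confidence_score instead would skip the heuristics, as noted in the option list.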