Module: Legion::LLM::ConfidenceScorer
- Extended by:
- Legion::Logging::Helper
- Defined in:
- lib/legion/llm/confidence_scorer.rb
Overview
Computes a ConfidenceScore for an LLM response using available signals.
Strategy selection (in priority order, matching the branch order in .score):
1. caller - caller-provided score passed via options[:confidence_score]
2. logprobs - native model confidence from token log-probabilities (when available)
3. heuristic - derived from response content characteristics
Band boundaries are read from Legion::Settings[:confidence] when Legion::Settings is available; otherwise the DEFAULT_BANDS constant is used. Per-call overrides can be passed via options[:confidence_bands].
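The band lookup described above can be sketched as follows. This is a minimal illustration, not the module's actual code: resolve_bands_sketch and band_for are hypothetical names, and merging the sources key by key (rather than replacing the whole hash) is an assumption. The boundary values and the score-to-band rule are the ones listed under DEFAULT_BANDS on this page.

```ruby
# Copied from the DEFAULT_BANDS constant documented on this page.
DEFAULT_BANDS = { low: 0.3, medium: 0.5, high: 0.7, very_high: 0.9 }.freeze

# Hypothetical helper. `settings_bands` stands in for the value of
# Legion::Settings[:confidence] (nil when Legion::Settings is unavailable);
# `overrides` stands in for options[:confidence_bands].
# Assumption: sources are merged key by key, with later sources winning.
def resolve_bands_sketch(settings_bands: nil, overrides: nil)
  DEFAULT_BANDS.merge(settings_bands || {}).merge(overrides || {})
end

# Maps a score to a band name: each key in `bands` is the lower
# boundary of the band with that name.
def band_for(score, bands = DEFAULT_BANDS)
  return :very_low if score < bands[:low]
  return :low      if score < bands[:medium]
  return :medium   if score < bands[:high]
  return :high     if score < bands[:very_high]

  :very_high
end

bands = resolve_bands_sketch(overrides: { high: 0.8 })
puts band_for(0.75)        # classified with the defaults
puts band_for(0.75, bands) # classified with the per-call override
```

With the defaults, 0.75 falls in :high; raising the :high boundary to 0.8 for one call moves the same score down to :medium.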
Constant Summary
- DEFAULT_BANDS =
Default band boundaries. Each key names a band and maps to that band's lower boundary:
  score <  :low       -> :very_low
  score <  :medium    -> :low
  score <  :high      -> :medium
  score <  :very_high -> :high
  score >= :very_high -> :very_high
{ low: 0.3, medium: 0.5, high: 0.7, very_high: 0.9 }.freeze
- HEURISTIC_WEIGHTS =
Penalty weights used in heuristic scoring.
{ refusal: -0.8, empty: -1.0, truncated: -0.4, repetition: -0.5, json_parse_failure: -0.6, too_short: -0.3 }.freeze
- STRUCTURED_OUTPUT_BONUS =
Bonus applied when structured output parse succeeds.
0.1
- HEDGING_PATTERNS =
Hedging language patterns that reduce confidence.
[
  /\b(?:I think|I believe|I'm not sure|I'm uncertain|it seems|it appears|maybe|perhaps|possibly|probably|I guess|I assume)\b/i,
  /\bnot (?:certain|sure|definite|confirmed)\b/i,
  /\bunclear\b/i,
  /\bcould be\b/i
].freeze
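This page does not show how the heuristic constants are combined, so the following is only a sketch under stated assumptions: a base score of 1.0, each triggered signal adding its (negative) weight, the structured-output bonus added on a successful parse, and the result clamped to 0.0..1.0. The helper names hedging_hits and heuristic_sketch are hypothetical; the constants are copied from this page.

```ruby
HEURISTIC_WEIGHTS = {
  refusal: -0.8, empty: -1.0, truncated: -0.4,
  repetition: -0.5, json_parse_failure: -0.6, too_short: -0.3
}.freeze

STRUCTURED_OUTPUT_BONUS = 0.1

HEDGING_PATTERNS = [
  /\b(?:I think|I believe|I'm not sure|I'm uncertain|it seems|it appears|maybe|perhaps|possibly|probably|I guess|I assume)\b/i,
  /\bnot (?:certain|sure|definite|confirmed)\b/i,
  /\bunclear\b/i,
  /\bcould be\b/i
].freeze

# Hypothetical helper: counts hedging phrases in the text.
# A phrase matched by two patterns (e.g. "I'm not sure") is counted twice.
def hedging_hits(text)
  HEDGING_PATTERNS.sum { |pattern| text.scan(pattern).length }
end

# Hypothetical combiner (assumption, not the library's formula):
# start from `base`, add the weight of every triggered signal,
# add the bonus when structured output parsed, then clamp to 0.0..1.0.
def heuristic_sketch(signals, structured_ok: false, base: 1.0)
  penalty = signals.sum { |signal| HEURISTIC_WEIGHTS.fetch(signal) }
  bonus   = structured_ok ? STRUCTURED_OUTPUT_BONUS : 0.0
  (base + penalty + bonus).clamp(0.0, 1.0)
end

puts hedging_hits("I think it could be right, but maybe not.")
puts heuristic_sketch([:truncated])
puts heuristic_sketch([:refusal, :truncated])
```

How hedging hits translate into a score reduction is not documented here, so the sketch only counts them and leaves the penalty out.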
Class Method Summary
- .score(raw_response, **options) ⇒ Object
Compute a ConfidenceScore for the given raw_response.
Class Method Details
.score(raw_response, **options) ⇒ Object
Compute a ConfidenceScore for the given raw_response.
raw_response - the RubyLLM response object (must respond to #content)
options - Hash:
:confidence_score - Float caller-provided score (bypasses heuristics)
:confidence_bands - Hash per-call band overrides
:json_expected - Boolean whether JSON output was expected
:quality_result - QualityResult from QualityChecker (optional, avoids re-running checks)
Returns a ConfidenceScore.
# File 'lib/legion/llm/confidence_scorer.rb', line 65

def score(raw_response, **options)
  bands = resolve_bands(options[:confidence_bands])

  # A caller-provided score takes precedence over every other signal.
  if (caller_score = options[:confidence_score])
    return ConfidenceScore.build(
      score:   caller_score.to_f,
      bands:   bands,
      source:  :caller_provided,
      signals: { caller_provided: caller_score.to_f }
    )
  end

  # Next, use native model confidence when log-probabilities are available.
  if (lp = extract_logprobs(raw_response))
    return ConfidenceScore.build(
      score:   lp,
      bands:   bands,
      source:  :logprobs,
      signals: { avg_logprob: lp }
    )
  end

  # Otherwise fall back to content-based heuristics.
  heuristic_score(raw_response, bands: bands, options: options)
end
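To see the branch order of .score in isolation, the following self-contained sketch mirrors it with stand-ins. Everything here is hypothetical: Response is a Struct standing in for the RubyLLM response (its logprob field simulates whatever extract_logprobs would return), the placeholder heuristic is invented for the example, and the :heuristic source name is an assumption since the page does not show what heuristic_score reports.

```ruby
# Stand-in for a RubyLLM response. Only #content is required by this page;
# `logprob` is a hypothetical field simulating extracted model confidence.
Response = Struct.new(:content, :logprob)

module ScorerSketch
  module_function

  # Mirrors the branch order of .score and returns [score, source].
  def pick(response, **options)
    if (caller_score = options[:confidence_score])
      return [caller_score.to_f, :caller_provided]
    end

    if (lp = response.logprob)
      return [lp, :logprobs]
    end

    [heuristic(response), :heuristic]
  end

  # Placeholder heuristic (assumption): blank content scores 0.0,
  # anything else scores 0.5.
  def heuristic(response)
    response.content.to_s.strip.empty? ? 0.0 : 0.5
  end
end

with_logprobs = Response.new("Paris", 0.8)
p ScorerSketch.pick(with_logprobs, confidence_score: 0.95)
p ScorerSketch.pick(with_logprobs)
p ScorerSketch.pick(Response.new("Paris", nil))
```

The first call shows that a caller-provided score wins even when log-probabilities are present; the heuristic path runs only when neither of the other two signals exists.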