Module: Woods::Evaluation::Metrics
Defined in: lib/woods/evaluation/metrics.rb
Overview
Retrieval quality metrics.
All methods are stateless pure functions that return a Float score. Most take arrays of identifiers; token_efficiency takes token counts instead.
Class Method Summary
- .context_completeness(retrieved, required) ⇒ Float
  Fraction of required units present in retrieved results.
- .mrr(retrieved, relevant) ⇒ Float
  Reciprocal rank of the first relevant result (the per-query term of Mean Reciprocal Rank).
- .precision_at_k(retrieved, relevant, cutoff: 5) ⇒ Float
  Fraction of top-k results that are relevant.
- .recall(retrieved, relevant) ⇒ Float
  Fraction of relevant items that were retrieved.
- .token_efficiency(relevant_tokens, total_tokens) ⇒ Float
  Ratio of relevant tokens to total tokens in context.
Class Method Details
.context_completeness(retrieved, required) ⇒ Float
Fraction of required units present in retrieved results. Returns 1.0 when required is empty.
    # File 'lib/woods/evaluation/metrics.rb', line 62
    def context_completeness(retrieved, required)
      return 1.0 if required.empty?

      retrieved_set = retrieved.to_set
      found = required.count { |id| retrieved_set.include?(id) }
      found.to_f / required.size
    end
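A short worked example may make the edge cases concrete. The sketch below copies the method body shown above into a hypothetical stand-in module, MetricsSketch, so that it runs without loading the library; the identifiers are illustrative only.

```ruby
require "set"

# Hypothetical stand-in that mirrors the documented method body, so the
# snippet runs on its own. It is not the library module itself.
module MetricsSketch
  module_function

  def context_completeness(retrieved, required)
    return 1.0 if required.empty?

    retrieved_set = retrieved.to_set
    found = required.count { |id| retrieved_set.include?(id) }
    found.to_f / required.size
  end
end

# Illustrative identifiers, assumed for this example.
retrieved = %w[User Order Invoice]

# One of the two required units was retrieved => 0.5
partial = MetricsSketch.context_completeness(retrieved, %w[User Payment])

# Both required units were retrieved => 1.0
complete = MetricsSketch.context_completeness(retrieved, %w[User Order])

# Nothing is required, so the context counts as complete => 1.0
vacuous = MetricsSketch.context_completeness([], [])
```

Extra retrieved items never lower the score: only the required list contributes to the denominator.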
.mrr(retrieved, relevant) ⇒ Float
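Reciprocal rank of the first relevant result: 1.0 / rank, using 1-based ranks, or 0.0 if no relevant result is retrieved. Averaging this value over a set of queries gives the Mean Reciprocal Rank.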
    # File 'lib/woods/evaluation/metrics.rb', line 49
    def mrr(retrieved, relevant)
      relevant_set = relevant.to_set
      retrieved.each_with_index do |id, idx|
        return 1.0 / (idx + 1) if relevant_set.include?(id)
      end
      0.0
    end
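Because the method scores a single ranked list, a mean over several queries has to be computed by the caller. The sketch below shows one way this could look. MetricsSketch is a hypothetical stand-in that copies the body above, and mean_mrr is an assumed helper that the documented module does not provide.

```ruby
require "set"

# Hypothetical stand-in mirroring the documented mrr body.
module MetricsSketch
  module_function

  def mrr(retrieved, relevant)
    relevant_set = relevant.to_set
    retrieved.each_with_index do |id, idx|
      return 1.0 / (idx + 1) if relevant_set.include?(id)
    end
    0.0
  end

  # Assumed helper (not part of the documented API): averages the
  # per-query reciprocal rank over a list of query hashes.
  def mean_mrr(queries)
    return 0.0 if queries.empty?

    queries.sum { |q| mrr(q[:retrieved], q[:relevant]) } / queries.size
  end
end

queries = [
  { retrieved: %w[a b c], relevant: %w[a] }, # first hit at rank 1 => 1.0
  { retrieved: %w[a b c], relevant: %w[b] }, # first hit at rank 2 => 0.5
  { retrieved: %w[a b c], relevant: %w[z] }  # no hit              => 0.0
]

mean = MetricsSketch.mean_mrr(queries) # (1.0 + 0.5 + 0.0) / 3 => 0.5
```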
.precision_at_k(retrieved, relevant, cutoff: 5) ⇒ Float
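Fraction of the top-k results that are relevant, where k is set by cutoff (default 5). The denominator is the number of results actually in the slice, so a list shorter than cutoff is not penalized. Returns 0.0 when either array is empty.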
    # File 'lib/woods/evaluation/metrics.rb', line 19
    def precision_at_k(retrieved, relevant, cutoff: 5)
      return 0.0 if retrieved.empty? || relevant.empty?

      top_k = retrieved.first(cutoff)
      relevant_set = relevant.to_set
      hits = top_k.count { |id| relevant_set.include?(id) }
      # Divide by actual slice size, not the cutoff: when fewer than
      # `cutoff` items are retrieved, dividing by `cutoff` understates
      # precision (returns 0.2 for 1-of-1 at cutoff=5 instead of 1.0).
      hits.to_f / top_k.size
    end
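The effect of the slice-size denominator is easiest to see with numbers. The following sketch copies the body above into a hypothetical stand-in module, MetricsSketch, and evaluates a few illustrative inputs.

```ruby
require "set"

# Hypothetical stand-in mirroring the documented precision_at_k body.
module MetricsSketch
  module_function

  def precision_at_k(retrieved, relevant, cutoff: 5)
    return 0.0 if retrieved.empty? || relevant.empty?

    top_k = retrieved.first(cutoff)
    relevant_set = relevant.to_set
    hits = top_k.count { |id| relevant_set.include?(id) }
    hits.to_f / top_k.size
  end
end

retrieved = %w[a b c d e f]
relevant  = %w[a c f]

# Top 5 is a..e, of which a and c are relevant => 2 / 5 = 0.4
default_cutoff = MetricsSketch.precision_at_k(retrieved, relevant)

# Top 3 is a, b, c => 2 / 3
at_three = MetricsSketch.precision_at_k(retrieved, relevant, cutoff: 3)

# Only one result was retrieved and it is relevant => 1 / 1 = 1.0
short_list = MetricsSketch.precision_at_k(%w[a], relevant)

# Edge case: cutoff 0 gives an empty slice, so the division is 0.0 / 0
zero_cutoff = MetricsSketch.precision_at_k(retrieved, relevant, cutoff: 0)
```

With the body as shown, a cutoff of 0 evaluates to NaN rather than 0.0, so callers that accept a user-supplied cutoff may want to validate it first.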
.recall(retrieved, relevant) ⇒ Float
Fraction of relevant items that were retrieved. Returns 0.0 when relevant is empty.
    # File 'lib/woods/evaluation/metrics.rb', line 36
    def recall(retrieved, relevant)
      return 0.0 if relevant.empty?

      retrieved_set = retrieved.to_set
      found = relevant.count { |id| retrieved_set.include?(id) }
      found.to_f / relevant.size
    end
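As a quick illustration, the sketch below copies the body above into a hypothetical stand-in module, MetricsSketch, and evaluates it on made-up identifiers.

```ruby
require "set"

# Hypothetical stand-in mirroring the documented recall body.
module MetricsSketch
  module_function

  def recall(retrieved, relevant)
    return 0.0 if relevant.empty?

    retrieved_set = retrieved.to_set
    found = relevant.count { |id| retrieved_set.include?(id) }
    found.to_f / relevant.size
  end
end

retrieved = %w[a b c]

# Two of the four relevant items (a and c) were retrieved => 0.5
half = MetricsSketch.recall(retrieved, %w[a c x y])

# Every relevant item was retrieved => 1.0
full = MetricsSketch.recall(retrieved, %w[a b])

# No relevant items are defined => 0.0
none_defined = MetricsSketch.recall(retrieved, [])
```

The empty-input convention differs from context_completeness, which returns 1.0 when nothing is required.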
.token_efficiency(relevant_tokens, total_tokens) ⇒ Float
Ratio of relevant tokens to total tokens in context, capped at 1.0. Returns 0.0 when total_tokens is zero.
    # File 'lib/woods/evaluation/metrics.rb', line 75
    def token_efficiency(relevant_tokens, total_tokens)
      return 0.0 if total_tokens.zero?

      [relevant_tokens.to_f / total_tokens, 1.0].min
    end
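The guard and the cap can be checked with a few token counts. This sketch copies the body above into a hypothetical stand-in module, MetricsSketch; the counts are invented for illustration.

```ruby
# Hypothetical stand-in mirroring the documented token_efficiency body.
module MetricsSketch
  module_function

  def token_efficiency(relevant_tokens, total_tokens)
    return 0.0 if total_tokens.zero?

    [relevant_tokens.to_f / total_tokens, 1.0].min
  end
end

# 300 relevant tokens in a 1200-token context => 0.25
typical = MetricsSketch.token_efficiency(300, 1200)

# Inconsistent counts (relevant > total) are capped at 1.0
capped = MetricsSketch.token_efficiency(1500, 1000)

# An empty context returns 0.0 instead of dividing by zero
empty_context = MetricsSketch.token_efficiency(0, 0)
```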