Class: CompletionKit::DashboardStats
- Inherits: Object
- Defined in: app/services/completion_kit/dashboard_stats.rb
Overview
Read-only aggregate queries powering the standalone dashboard cards. Each method is a small, scoped query — nothing here writes or caches.
Class Method Summary
- .activity(days: 14) ⇒ Object
  Runs per calendar day for the trailing `days` window, oldest first.
- .failures(since:) ⇒ Object
  Everything that terminally failed in the window across all three surfaces — failed runs, failed generations, failed judge reviews — excluding any the user has dismissed.
- .metric_average(metric_id, since:) ⇒ Object
  The rounded average judge score for one metric across the window, or nil when it has no scored reviews.
- .prompt_changes(limit: 5) ⇒ Object
  The most recent measurable change per prompt family — gains and regressions both.
- .worst_metric(since:) ⇒ Object
  The metric with the lowest average judge score across succeeded reviews in the window — the prompt-engineering target.
Class Method Details
.activity(days: 14) ⇒ Object
Runs per calendar day for the trailing `days` window, oldest first. Always returns one entry per day (count 0 for quiet days) so callers can render a fixed-width sparkline.
```ruby
# File 'app/services/completion_kit/dashboard_stats.rb', line 8

def self.activity(days: 14)
  since = (days - 1).days.ago.to_date
  counts = Run.where("created_at >= ?", since.beginning_of_day)
              .group("DATE(created_at)")
              .count

  (0...days).map do |offset|
    date = since + offset
    { date: date, count: counts[date] || counts[date.to_s] || 0 }
  end
end
```
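The zero-fill shape of that result can be sketched in plain Ruby, without the Rails or ActiveRecord pieces. `activity_series` is a hypothetical stand-in: it takes sparse per-day counts (as `.group("DATE(created_at)").count` would return them) and pads every missing day with 0, so the series always has exactly `days` entries.

```ruby
require "date"

# Sketch (assumed, not the real method): given sparse per-day counts,
# emit one { date:, count: } entry per day, oldest first, filling quiet
# days with 0 so a sparkline has a fixed width.
def activity_series(counts, days:, today: Date.today)
  since = today - (days - 1)
  (0...days).map do |offset|
    date = since + offset
    # Tolerate Date or String keys, as the real method does.
    { date: date, count: counts[date] || counts[date.to_s] || 0 }
  end
end
```

A caller can then render the sparkline directly from `series.map { |e| e[:count] }` without checking for gaps.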
.failures(since:) ⇒ Object
Everything that terminally failed in the window across all three surfaces — failed runs, failed generations, failed judge reviews — excluding any the user has dismissed. Returns a count and an items list ordered most-recent-first; each item carries its surface, the failing record, the run it belongs to (for a deep link), and a cause string.
```ruby
# File 'app/services/completion_kit/dashboard_stats.rb', line 66

def self.failures(since:)
  dismissed = failure_dismissal_keys
  items = []

  Run.where(status: "failed").where("created_at >= ?", since).find_each do |run|
    next if dismissed.include?(["CompletionKit::Run", run.id])

    items << { surface: "run", record: run, run: run,
               cause: run.failure_summary.presence || "Run failed",
               at: run.updated_at }
  end

  Response.where(status: "failed").where("created_at >= ?", since)
          .includes(:run)
          .find_each do |response|
    next if dismissed.include?(["CompletionKit::Response", response.id])

    items << { surface: "generation", record: response, run: response.run,
               cause: failure_cause(response), at: response.updated_at }
  end

  Review.where(status: "failed").where("completion_kit_reviews.created_at >= ?", since)
        .includes(response: :run)
        .find_each do |review|
    next if dismissed.include?(["CompletionKit::Review", review.id])

    items << { surface: "judge", record: review, run: review.response.run,
               cause: failure_cause(review), at: review.updated_at }
  end

  items.sort_by! { |item| item[:at] }
  items.reverse!

  { count: items.size, items: items }
end
```
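The merge step at the end is the interesting part: three independently collected surfaces flow into one newest-first feed. A minimal sketch of just that merge, with hypothetical hashes standing in for the ActiveRecord records:

```ruby
# Sketch (assumed shape): each surface contributes hashes carrying an
# :at timestamp; the feed interleaves them most-recent-first and reports
# a total count alongside the items.
def failure_feed(*surfaces)
  items = surfaces.flatten
  items.sort_by! { |item| item[:at] }
  items.reverse!
  { count: items.size, items: items }
end
```

Sorting ascending and reversing, rather than sorting by negated timestamps, keeps the comparison stable for any comparable `:at` type.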
.metric_average(metric_id, since:) ⇒ Object
The rounded average judge score for one metric across the window, or nil when it has no scored reviews. Used to snapshot a dismissal’s baseline.
```ruby
# File 'app/services/completion_kit/dashboard_stats.rb', line 57

def self.metric_average(metric_id, since:)
  scored_reviews_since(since).where(metric_id: metric_id).average(:ai_score)&.to_f&.round(2)
end
```
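The safe-navigation chain is what makes "nil when it has no scored reviews" fall out naturally: ActiveRecord's `average` returns nil for an empty relation, and `&.` propagates that nil rather than raising or coercing it to 0.0. A plain-Ruby sketch of just that rounding step:

```ruby
# Sketch of the nil-safe rounding: nil stays nil, any numeric average
# is coerced to Float and rounded to two places.
def round_average(avg)
  avg&.to_f&.round(2)
end
```

Callers can therefore distinguish "no data" (nil) from a genuine 0.0 average when snapshotting a dismissal's baseline.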
.prompt_changes(limit: 5) ⇒ Object
The most recent measurable change per prompt family — gains and regressions both. For each family the comparison is:
* latest scored version vs the published version, when a draft sits ahead of what's live ("is my work-in-progress better?")
* published vs the previous scored version, when the latest version IS the published one ("did my last publish help?")
Biggest movement first. Empty until something has been iterated and re-judged on both sides of the comparison.
```ruby
# File 'app/services/completion_kit/dashboard_stats.rb', line 109

def self.prompt_changes(limit: 5)
  scores = Review.joins(response: :run)
                 .where(status: "succeeded")
                 .where.not(ai_score: nil)
                 .group("completion_kit_runs.prompt_id")
                 .average(:ai_score)
  return [] if scores.empty?

  Prompt.where(id: scores.keys).group_by(&:family_key).filter_map do |_key, versions|
    scored = versions.select { |v| scores[v.id] }.sort_by(&:version_number)
    next if scored.size < 2

    candidate = scored.last
    published = versions.find(&:current?)
    baseline =
      if published && published != candidate && scores[published.id]
        published
      else
        scored[-2]
      end

    delta = (scores[candidate.id] - scores[baseline.id]).to_f.round(2)
    next if delta.zero?

    { prompt: candidate,
      from_version: baseline.version_number,
      to_version: candidate.version_number,
      from_score: scores[baseline.id].to_f.round(2),
      to_score: scores[candidate.id].to_f.round(2),
      delta: delta }
  end.sort_by { |row| -row[:delta].abs }.first(limit)
end
```
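The two comparison modes described above come down to one baseline choice. A sketch of that decision in plain Ruby, with hypothetical hashes (`{ version:, score:, current: }`) standing in for scored Prompt versions of one family:

```ruby
# Sketch (assumed, not the real method) of the baseline selection:
# - if a scored draft sits ahead of the published version, compare
#   draft vs published;
# - if the latest scored version IS the published one, compare it
#   against the previous scored version.
def pick_comparison(versions)
  scored = versions.select { |v| v[:score] }.sort_by { |v| v[:version] }
  return nil if scored.size < 2

  candidate = scored.last
  published = versions.find { |v| v[:current] }
  baseline =
    if published && published != candidate && published[:score]
      published
    else
      scored[-2]
    end

  { from: baseline[:version], to: candidate[:version],
    delta: (candidate[:score] - baseline[:score]).round(2) }
end
```

The real method additionally drops zero deltas and sorts families by absolute movement before truncating to `limit`.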
.worst_metric(since:) ⇒ Object
The metric with the lowest average judge score across succeeded reviews in the window — the prompt-engineering target. Dismissed metrics are skipped while their average holds at or above the score snapshotted when they were dismissed; a metric that regresses below that baseline resurfaces and its stale dismissal is cleared. Returns nil when nothing qualifies. `response` is the single worst-scoring response, for a deep link.
```ruby
# File 'app/services/completion_kit/dashboard_stats.rb', line 26

def self.worst_metric(since:)
  averages = scored_reviews_since(since)
             .joins(:metric)
             .group("completion_kit_metrics.id")
             .average(:ai_score)
  return nil if averages.empty?

  dismissals = metric_dismissals
  metrics = Metric.where(id: averages.keys).index_by(&:id)

  averages.sort_by { |_id, avg| avg }.each do |metric_id, avg|
    rounded = avg.to_f.round(2)
    dismissal = dismissals[metric_id]
    next if dismissal && rounded >= dismissal.baseline_score.to_f

    dismissal&.destroy
    worst = scored_reviews_since(since).where(metric_id: metric_id).order(:ai_score).first
    metric = metrics[metric_id]
    return { metric: metric, name: metric.name, avg: rounded,
             response: worst.response, score: worst.ai_score.to_f }
  end

  nil
end
```
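The dismissal gate is the subtle part: a dismissed metric stays hidden only while its average holds at or above its snapshotted baseline, and resurfaces as the worst once it regresses below. A plain-Ruby sketch of that gate, with hypothetical hashes in place of the dismissal records (and without the `destroy` side effect):

```ruby
# Sketch (assumed): averages maps metric id => average score,
# dismissals maps metric id => baseline score snapshotted at dismissal.
# Walk metrics worst-first; skip a dismissed metric while it holds at
# or above its baseline; return the first visible one, or nil.
def worst_visible_metric(averages, dismissals)
  averages.sort_by { |_id, avg| avg }.each do |metric_id, avg|
    rounded = avg.round(2)
    baseline = dismissals[metric_id]
    next if baseline && rounded >= baseline

    return { metric_id: metric_id, avg: rounded }
  end
  nil
end
```

In the real method, reaching a dismissed metric that has regressed also destroys its now-stale dismissal record before returning it.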