Class: CompletionKit::DashboardStats

Inherits: Object
Defined in:
app/services/completion_kit/dashboard_stats.rb

Overview

Read-only aggregate queries powering the standalone dashboard cards. Each method is a small, scoped query — nothing here writes or caches.
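A sketch of how a dashboard controller might compose these queries per request (hypothetical call sites and windows; nothing is cached, so every call hits the database):

# Hypothetical controller action composing the dashboard cards.
def show
  @activity = CompletionKit::DashboardStats.activity
  @failures = CompletionKit::DashboardStats.failures(since: 7.days.ago)
  @worst    = CompletionKit::DashboardStats.worst_metric(since: 7.days.ago)
  @changes  = CompletionKit::DashboardStats.prompt_changes
end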

Class Method Summary

Class Method Details

.activity(days: 14) ⇒ Object

Run counts per calendar day for the trailing `days` window, oldest first. Always returns one entry per day (count 0 for quiet days) so callers can render a fixed-width sparkline.
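An illustrative call, assuming some Run records exist (the counts are made up):

points = CompletionKit::DashboardStats.activity(days: 7)
points.size                    # => 7, one entry per day even when quiet
points.map { |p| p[:count] }   # => e.g. [0, 2, 0, 5, 1, 0, 3]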



# File 'app/services/completion_kit/dashboard_stats.rb', line 8

def self.activity(days: 14)
  since = (days - 1).days.ago.to_date
  counts = Run.where("created_at >= ?", since.beginning_of_day)
              .group("DATE(created_at)")
              .count
  (0...days).map do |offset|
    date = since + offset
    # DATE() group keys may come back as Date or String depending on the adapter.
    { date: date, count: counts[date] || counts[date.to_s] || 0 }
  end
end

.failures(since:) ⇒ Object

Everything that terminally failed in the window across all three surfaces — failed runs, failed generations, failed judge reviews — excluding any the user has dismissed. Returns a count and an items list ordered most-recent-first; each item carries its surface, the failing record, the run it belongs to (for a deep link), and a cause string.
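An illustrative traversal of the return value (field names as described above; the window is a hypothetical choice):

failures = CompletionKit::DashboardStats.failures(since: 7.days.ago)
failures[:count]  # total across runs, generations, and judge reviews
failures[:items].each do |item|
  item[:surface]  # "run", "generation", or "judge"
  item[:run]      # parent run, for the deep link
  item[:cause]    # human-readable cause string
end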



# File 'app/services/completion_kit/dashboard_stats.rb', line 66

def self.failures(since:)
  dismissed = failure_dismissal_keys # [class_name, id] pairs the user dismissed
  items = []

  Run.where(status: "failed").where("created_at >= ?", since).find_each do |run|
    next if dismissed.include?(["CompletionKit::Run", run.id])
    items << {
      surface: "run", record: run, run: run,
      cause: run.failure_summary.presence || "Run failed", at: run.updated_at
    }
  end

  Response.where(status: "failed").where("created_at >= ?", since)
          .includes(:run).find_each do |response|
    next if dismissed.include?(["CompletionKit::Response", response.id])
    items << {
      surface: "generation", record: response, run: response.run,
      cause: failure_cause(response), at: response.updated_at
    }
  end

  Review.where(status: "failed").where("completion_kit_reviews.created_at >= ?", since)
        .includes(response: :run).find_each do |review|
    next if dismissed.include?(["CompletionKit::Review", review.id])
    items << {
      surface: "judge", record: review, run: review.response.run,
      cause: failure_cause(review), at: review.updated_at
    }
  end

  items.sort_by! { |item| item[:at] }
  items.reverse!
  { count: items.size, items: items }
end

.metric_average(metric_id, since:) ⇒ Object

The rounded average judge score for one metric across the window, or nil when it has no scored reviews. Used to snapshot a dismissal’s baseline.
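For example, a caller snapshotting a baseline at dismissal time (hypothetical `metric` and window):

baseline = CompletionKit::DashboardStats.metric_average(metric.id, since: 30.days.ago)
# => e.g. 3.42, or nil when the metric has no scored reviews in the window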



# File 'app/services/completion_kit/dashboard_stats.rb', line 57

def self.metric_average(metric_id, since:)
  scored_reviews_since(since).where(metric_id: metric_id).average(:ai_score)&.to_f&.round(2)
end

.prompt_changes(limit: 5) ⇒ Object

The most recent measurable change per prompt family — gains and regressions both. For each family the comparison is:

* latest scored version vs the published version, when a draft sits
  ahead of what's live ("is my work-in-progress better?")
* published vs the previous scored version, when the latest version
  IS the published one ("did my last publish help?")

Biggest movement first. Empty until a family has at least two scored versions, so both sides of a comparison have been judged.
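An illustrative read of the rows (a negative delta marks a regression):

CompletionKit::DashboardStats.prompt_changes(limit: 3).each do |row|
  # e.g. "v3 -> v4: +0.35" for a gain, "v4 -> v5: -0.20" for a regression
  puts format("v%d -> v%d: %+.2f", row[:from_version], row[:to_version], row[:delta])
end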



# File 'app/services/completion_kit/dashboard_stats.rb', line 109

def self.prompt_changes(limit: 5)
  scores = Review.joins(response: :run)
                 .where(status: "succeeded")
                 .where.not(ai_score: nil)
                 .group("completion_kit_runs.prompt_id")
                 .average(:ai_score)
  return [] if scores.empty?

  Prompt.where(id: scores.keys).group_by(&:family_key).filter_map do |_key, versions|
    scored = versions.select { |v| scores[v.id] }.sort_by(&:version_number)
    next if scored.size < 2

    candidate = scored.last
    published = versions.find(&:current?)
  # Prefer the published version as baseline when a scored draft sits ahead
  # of it; otherwise compare against the previous scored version.
  baseline =
      if published && published != candidate && scores[published.id]
        published
      else
        scored[-2]
      end

    delta = (scores[candidate.id] - scores[baseline.id]).to_f.round(2)
    next if delta.zero?

    {
      prompt: candidate,
      from_version: baseline.version_number,
      to_version: candidate.version_number,
      from_score: scores[baseline.id].to_f.round(2),
      to_score: scores[candidate.id].to_f.round(2),
      delta: delta
    }
  end.sort_by { |row| -row[:delta].abs }.first(limit)
end

.worst_metric(since:) ⇒ Object

The metric with the lowest average judge score across succeeded reviews in the window — the prompt-engineering target. Dismissed metrics are skipped while their average holds at or above the score snapshotted when they were dismissed; a metric that regresses below that baseline resurfaces and its stale dismissal is cleared. Returns nil when nothing qualifies. `response` is the single worst-scoring response, for a deep link.
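An illustrative guard-style usage (hypothetical window):

if (target = CompletionKit::DashboardStats.worst_metric(since: 14.days.ago))
  target[:name]      # metric name to headline the card
  target[:avg]       # rounded window average
  target[:response]  # single worst-scoring response, for the deep link
end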



# File 'app/services/completion_kit/dashboard_stats.rb', line 26

def self.worst_metric(since:)
  averages = scored_reviews_since(since)
             .joins(:metric)
             .group("completion_kit_metrics.id")
             .average(:ai_score)
  return nil if averages.empty?

  dismissals = metric_dismissals
  metrics = Metric.where(id: averages.keys).index_by(&:id)

  averages.sort_by { |_id, avg| avg }.each do |metric_id, avg|
    rounded = avg.to_f.round(2)
    dismissal = dismissals[metric_id]
    next if dismissal && rounded >= dismissal.baseline_score.to_f

    # Reaching here with a dismissal means the average regressed below its
    # baseline, so the stale dismissal is cleared and the metric resurfaces.
    dismissal&.destroy
    worst = scored_reviews_since(since).where(metric_id: metric_id).order(:ai_score).first
    metric = metrics[metric_id]
    return {
      metric: metric,
      name: metric.name,
      avg: rounded,
      response: worst.response,
      score: worst.ai_score.to_f
    }
  end
  nil
end