Class: Rigor::Analysis::Baseline

Inherits:
Object
Defined in:
lib/rigor/analysis/baseline.rb

Overview

ADR-22 Slice 1 — PHPStan-shaped per-project baseline.

Loads `.rigor-baseline.yml`, filters a current run’s diagnostic stream against the recorded buckets, and emits a `(surfaced, silenced_count)` pair for the CLI to render.

Two row shapes are accepted (WD1):

# rule-ID row — bucket key (path, qualified_rule)
- file: app/models/user.rb
  rule: call.undefined-method
  count: 3

# message-pattern row — bucket key
#   (path, qualified_rule, message_regex)
- file: app/lib/sig.rb
  rule: call.undefined-method
  message: "undefined method `merge' for Array"
  count: 1
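As a rough illustration of how the two row shapes map onto bucket keys, the sketch below parses both rows and derives a key for each. `bucket_key_for` is a hypothetical helper written for this example, not part of Rigor, and it assumes the `message` value is compiled directly as a regular expression.

```ruby
require "yaml"

# Hypothetical helper (not Rigor's API): derive the bucket key for one row.
# Rule-ID rows yield [file, rule]; message-pattern rows add a Regexp.
def bucket_key_for(row)
  key = [row.fetch("file"), row.fetch("rule")]
  key << Regexp.new(row["message"]) if row.key?("message")
  key
end

rows = YAML.safe_load(<<~ROWS)
  - file: app/models/user.rb
    rule: call.undefined-method
    count: 3
  - file: app/lib/sig.rb
    rule: call.undefined-method
    message: "undefined method `merge' for Array"
    count: 1
ROWS

keys = rows.map { |row| bucket_key_for(row) }
rule_key, message_key = keys
```

Here `rule_key` has two elements and `message_key` has three, the last being the compiled pattern.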

## Semantics per (file, rule [, message]) bucket (WD4)

actual <= count    → ALL diagnostics in the bucket are silenced.
actual >  count    → ALL diagnostics in the bucket surface
                     (not just the excess delta — the bucket
                     has crossed its threshold; the team's
                     review focus shifts from "which N is new"
                     to "what's going on with this rule in
                     this file as a whole").
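The all-or-nothing rule above can be sketched in a few lines. `apply_threshold` is a hypothetical name used only for this illustration; it takes the diagnostics that landed in one bucket plus the recorded `count`, and returns a `[surfaced, silenced_count]` pair for that bucket.

```ruby
# Hypothetical helper (not Rigor's API): apply the WD4 rule to one bucket.
# Returns [surfaced, silenced_count] for that bucket alone.
def apply_threshold(diagnostics, count)
  if diagnostics.size <= count
    # Within threshold: everything in the bucket is silenced.
    [[], diagnostics.size]
  else
    # Over threshold: the whole bucket surfaces, not just the excess.
    [diagnostics, 0]
  end
end

within = apply_threshold(%w[d1 d2 d3], 3)
over   = apply_threshold(%w[d1 d2 d3 d4], 3)
```

With a recorded count of 3, three diagnostics are all silenced, while four diagnostics all surface.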

## Filter pipeline position (WD6)

The baseline filter runs LAST among the diagnostic-suppression layers:

emit →  `# rigor:disable` (per-line)
     →  `# rigor:disable-file`
     →  severity_profile re-stamp
     →  baseline filter (this class)
     →  output

## Loading (WD2 (b))

`Baseline.load` is called by the CLI when it has resolved an explicit baseline path (from `--baseline=PATH` on the CLI or `baseline: <path>` in `.rigor.yml`). The presence of `.rigor-baseline.yml` on disk alone never triggers a load; enforcing that is the job of the CLI / Configuration.

## Path handling

Baselines store file paths **relative to the project root** (the working directory when `rigor` is run). This makes the generated `.rigor-baseline.yml` portable across machines and checkout locations. When filtering a live diagnostic stream, the instance normalises each diagnostic’s absolute path to a relative one before the bucket lookup.
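A minimal sketch of that normalisation step, using only `Pathname` from the standard library. `relative_to_root` is a hypothetical helper, and leaving already-relative paths untouched is an assumption of this example rather than documented behaviour. POSIX-style paths are assumed.

```ruby
require "pathname"

# Hypothetical helper (not Rigor's API): turn a diagnostic's absolute path
# into the project-relative form stored in the baseline file.
def relative_to_root(path, project_root)
  pathname = Pathname.new(path)
  # Assumption: paths that are already relative are passed through as-is.
  return pathname.to_s unless pathname.absolute?

  pathname.relative_path_from(Pathname.new(project_root)).to_s
end

from_absolute = relative_to_root("/home/dev/app/app/models/user.rb", "/home/dev/app")
from_relative = relative_to_root("app/models/user.rb", "/home/dev/app")
```

Both calls produce `app/models/user.rb`, so the same bucket is found regardless of where the checkout lives.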

Defined Under Namespace

Classes: Bucket, DriftRow, LoadError

Constant Summary

CURRENT_VERSION =
1


Constructor Details

#initialize(buckets, project_root: Dir.pwd) ⇒ Baseline

Returns a new instance of Baseline.



# File 'lib/rigor/analysis/baseline.rb', line 200

def initialize(buckets, project_root: Dir.pwd)
  @project_root = Pathname.new(project_root)
  @buckets = buckets.freeze
  # For each (file, qualified_rule) pair, two arrays:
  # - rule-ID rows (message_regex == nil)
  # - message-pattern rows (message_regex != nil)
  # The matcher walks message-pattern rows first (tighter
  # match takes precedence); diagnostics that don't match
  # any message row fall through to the rule-ID row if
  # one exists.
  @by_pair = buckets.group_by { |b| [b.file, b.rule] }.freeze
  freeze
end
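The precedence described in the constructor comment (message-pattern rows first, then the rule-ID row) might look roughly like the sketch below. The real matcher is not shown on this page, so `SketchBucket` and `claim_bucket` are hypothetical stand-ins that only mirror the `group_by` index built above.

```ruby
# Hypothetical stand-in for Bucket, with the four fields used on this page.
SketchBucket = Struct.new(:file, :rule, :message_regex, :count, keyword_init: true)

# Hypothetical matcher: message-pattern rows win over the rule-ID row.
def claim_bucket(by_pair, file, rule, message)
  candidates = by_pair.fetch([file, rule], [])
  message_rows, rule_rows = candidates.partition(&:message_regex)
  message_rows.find { |b| b.message_regex.match?(message) } || rule_rows.first
end

buckets = [
  SketchBucket.new(file: "app/lib/sig.rb", rule: "call.undefined-method",
                   message_regex: nil, count: 2),
  SketchBucket.new(file: "app/lib/sig.rb", rule: "call.undefined-method",
                   message_regex: /merge/, count: 1)
]
by_pair = buckets.group_by { |b| [b.file, b.rule] }

tight = claim_bucket(by_pair, "app/lib/sig.rb", "call.undefined-method",
                     "undefined method `merge' for Array")
loose = claim_bucket(by_pair, "app/lib/sig.rb", "call.undefined-method",
                     "undefined method `fetch' for nil")
none  = claim_bucket(by_pair, "app/other.rb", "call.undefined-method", "anything")
```

`tight` is claimed by the message-pattern row, `loose` falls through to the rule-ID row, and `none` has no bucket at all, so it would surface as a new finding.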

Instance Attribute Details

#buckets ⇒ Object (readonly)

Returns the value of attribute buckets.



# File 'lib/rigor/analysis/baseline.rb', line 198

def buckets
  @buckets
end

Class Method Details

.from_diagnostics(diagnostics, match_mode: :rule, project_root: Dir.pwd) ⇒ Object

Build a baseline from a current run’s diagnostic stream. `match_mode:` is `:rule` (default) or `:message`. The message-mode generator passes literal messages through `Regexp.escape` so generated rows never accidentally over-match on punctuation.

`project_root:` is used to convert absolute diagnostic paths to relative paths in the generated YAML. Defaults to `Dir.pwd`.
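To see why escaping matters, compare a pattern built from the raw message with one built through `Regexp.escape`. The message text here is invented for illustration; only `Regexp.escape` and `Regexp.new` from core Ruby are used.

```ruby
# Illustrative message containing regex punctuation (the "." in "Foo.bar").
message = "wrong number of arguments for Foo.bar"

naive   = Regexp.new(message)                 # "." matches any character
escaped = Regexp.new(Regexp.escape(message))  # "." matches only a literal dot

lookalike = "wrong number of arguments for FooXbar"

naive_overmatches   = naive.match?(lookalike)
escaped_overmatches = escaped.match?(lookalike)
escaped_matches_own = escaped.match?(message)
```

The unescaped pattern also matches the lookalike message, whereas the escaped pattern matches only the original text.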

Raises:

  • (ArgumentError)

    if `match_mode` is neither `:rule` nor `:message`.


# File 'lib/rigor/analysis/baseline.rb', line 102

def from_diagnostics(diagnostics, match_mode: :rule, project_root: Dir.pwd)
  raise ArgumentError, "match_mode must be :rule or :message" unless %i[rule message].include?(match_mode)

  grouped = group_for_baseline(diagnostics, match_mode, project_root)
  buckets = grouped.map do |key, entries|
    Bucket.new(
      file: key[0],
      rule: key[1],
      message_regex: key[2],
      count: entries.size
    )
  end
  new(buckets, project_root: project_root)
end

.load(path, project_root: Dir.pwd) ⇒ Object

Load a baseline file from disk. Returns `nil` when the path is nil (the caller’s “no baseline configured” state) and an empty baseline when the path does not exist on disk. Raises LoadError on malformed content; callers translate it into a user-facing diagnostic.

`project_root:` is the working directory against which stored relative paths are resolved during filtering. Defaults to `Dir.pwd`.



# File 'lib/rigor/analysis/baseline.rb', line 85

def load(path, project_root: Dir.pwd)
  return nil if path.nil?
  return new([], project_root: project_root) unless File.exist?(path)

  raw = YAML.safe_load_file(path, permitted_classes: [Symbol])
  parse_loaded(raw, path: path, project_root: project_root)
end

Instance Method Details

#audit(diagnostics) ⇒ Array<DriftRow>

Walk the current diagnostic stream and report bucket-level drift. Each baseline bucket becomes one DriftRow regardless of whether the current run still matches it.

Parameters:

  • diagnostics (Array<Diagnostic>)

    current run’s diagnostic stream (PRE-filter: pass the raw `result.diagnostics` from `Runner#run`, not the post-baseline surface).

Returns:

  • (Array<DriftRow>)

    one entry per baseline bucket, in baseline-file order.



# File 'lib/rigor/analysis/baseline.rb', line 275

def audit(diagnostics)
  counts = Hash.new(0)
  diagnostics.each do |diag|
    next if diag.qualified_rule.nil? || diag.path.nil?

    bucket = claim_bucket_for(diag)
    counts[bucket_key(bucket)] += 1 if bucket
  end

  buckets.map do |bucket|
    actual = counts[bucket_key(bucket)]
    DriftRow.new(bucket: bucket, actual_count: actual, status: status_for(actual, bucket.count))
  end
end

#empty? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/rigor/analysis/baseline.rb', line 320

def empty?
  buckets.empty?
end

#filter(diagnostics) ⇒ Object

Apply the baseline filter to a diagnostic stream.

Returns a 2-tuple:

  • `surfaced`: the diagnostics that survived the filter (new findings + entire over-threshold buckets).

  • `silenced_count`: how many diagnostics the baseline suppressed (for the WD7 stderr summary line).



# File 'lib/rigor/analysis/baseline.rb', line 221

def filter(diagnostics)
  return [diagnostics, 0] if buckets.empty?

  grouped = group_diagnostics_for_filtering(diagnostics)
  surfaced = []
  silenced_count = 0

  grouped.each_value do |entries|
    bucket = entries[:bucket]
    diags = entries[:diagnostics]
    # No matching bucket → all surface as new findings.
    # `actual <= count` → all silenced (within threshold,
    # WD4). `actual >  count` → all surface (over
    # threshold, WD4).
    if bucket && diags.size <= bucket.count
      silenced_count += diags.size
    else
      surfaced.concat(diags)
    end
  end

  # Diagnostics that lacked a rule or a path bypass the
  # baseline entirely (the baseline can't address them).
  unkeyable = diagnostics.reject { |d| d.qualified_rule && d.path }
  [surfaced + unkeyable, silenced_count]
end

#size ⇒ Object

The number of buckets recorded. Useful for the CLI summary on `generate`.



# File 'lib/rigor/analysis/baseline.rb', line 316

def size
  buckets.size
end

#to_yaml ⇒ Object

Serialise to a YAML string. The generator path writes this through `File.write`; the dump format is stable across versions of this class as long as the bucket shape is unchanged.



# File 'lib/rigor/analysis/baseline.rb', line 302

def to_yaml
  rows = buckets.map do |bucket|
    row = { "file" => bucket.file, "rule" => bucket.rule }
    row["message"] = bucket.message_regex.source if bucket.message_regex
    row["count"] = bucket.count
    row
  end

  document = { "version" => CURRENT_VERSION, "ignored" => rows }
  YAML.dump(document)
end
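The document shape produced by `#to_yaml` can be reproduced with plain hashes. The sketch below builds the same `version` / `ignored` structure by hand (with an illustrative row) and checks that it survives a dump and a safe reload; it does not call into Rigor.

```ruby
require "yaml"

# One illustrative rule-ID row, shaped like the hashes built in #to_yaml.
rows = [
  { "file" => "app/models/user.rb", "rule" => "call.undefined-method", "count" => 3 }
]

# Same top-level keys as the method above; 1 mirrors CURRENT_VERSION.
document = { "version" => 1, "ignored" => rows }

yaml     = YAML.dump(document)
reloaded = YAML.safe_load(yaml)
```

Because the document contains only strings, integers, arrays, and hashes, `YAML.safe_load` reads it back without any extra permitted classes.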

#without(buckets_to_drop) ⇒ Object

Returns a new Baseline with the given buckets dropped. Used by `rigor baseline prune` (slice 2) to remove cleared buckets (`actual == 0`) from the on-disk file.



# File 'lib/rigor/analysis/baseline.rb', line 293

def without(buckets_to_drop)
  dropset = buckets_to_drop.to_set
  self.class.new(buckets.reject { |b| dropset.include?(b) }, project_root: @project_root)
end