Class: Rigor::Analysis::Baseline

Inherits:

Object

Defined in: lib/rigor/analysis/baseline.rb

Overview

ADR-22 Slice 1 — PHPStan-shaped per-project baseline.

Loads `.rigor-baseline.yml`, filters a current run’s diagnostic stream against the recorded buckets, and emits a `(surfaced, silenced_count)` pair for the CLI to render.

Two row shapes are accepted (WD1):

# rule-ID row — bucket key (path, qualified_rule)
- file: app/models/user.rb
  rule: call.undefined-method
  count: 3

# message-pattern row — bucket key
#   (path, qualified_rule, message_regex)
- file: app/lib/sig.rb
  rule: call.undefined-method
  message: "undefined method `merge' for Array"
  count: 1

## Semantics per (file, rule [, message]) bucket (WD4)

actual <= count    → ALL diagnostics in the bucket are silenced.
actual >  count    → ALL diagnostics in the bucket surface
                     (not just the excess delta — the bucket
                     has crossed its threshold; the team's
                     review focus shifts from "which N is new"
                     to "what's going on with this rule in
                     this file as a whole").
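The all-or-nothing rule above can be sketched in a few lines. This is a minimal illustration, not the class's own code: `partition_bucket` is a hypothetical helper, and plain strings stand in for diagnostics.

```ruby
# Hypothetical helper: decide what happens to one bucket's diagnostics.
# `recorded` plays the role of the row's `count:` value.
# Returns [surfaced, silenced_count] for that single bucket.
def partition_bucket(diags, recorded)
  if diags.size <= recorded
    [[], diags.size]   # within threshold: nothing surfaces
  else
    [diags, 0]         # over threshold: the whole bucket surfaces
  end
end

within = partition_bucket(%w[d1 d2 d3], 3)
over   = partition_bucket(%w[d1 d2 d3 d4], 3)
```

With `count: 3`, three matches are all silenced, while a fourth match surfaces all four rather than only the newest one.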

## Filter pipeline position (WD6)

The baseline filter runs LAST among the diagnostic-suppression layers:

emit →  `# rigor:disable` (per-line)
     →  `# rigor:disable-file`
     →  severity_profile re-stamp
     →  baseline filter (this class)
     →  output

## Loading (WD2 (b))

`Baseline.load` is called by the CLI once it has resolved an explicit baseline path (from `--baseline=PATH` on the command line or `baseline: <path>` in `.rigor.yml`). The presence of `.rigor-baseline.yml` on disk alone never triggers a load; enforcing that is the job of the CLI and Configuration, not of this class.

## Path handling

Baselines store file paths **relative to the project root** (the working directory when `rigor` is run). This makes the generated `.rigor-baseline.yml` portable across machines and checkout locations. When filtering a live diagnostic stream, the instance normalises each diagnostic’s absolute path to a relative one before the bucket lookup.
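One way to picture the normalisation step is with `Pathname#relative_path_from` from the standard library. The helper name `relative_to_root` and the example paths are assumptions made for illustration; the class's own private helper is not shown on this page.

```ruby
require "pathname"

# Hypothetical helper: turn an absolute diagnostic path into the
# project-relative form used as a bucket key. Paths that are already
# relative are returned unchanged.
def relative_to_root(path, project_root)
  pathname = Pathname.new(path)
  return pathname.to_s unless pathname.absolute?

  pathname.relative_path_from(Pathname.new(project_root)).to_s
end

from_absolute = relative_to_root("/work/app/app/models/user.rb", "/work/app")
from_relative = relative_to_root("app/models/user.rb", "/work/app")
```

Both calls yield `app/models/user.rb`, so the same baseline row matches regardless of where the checkout lives. The computation is purely lexical and does not touch the file system.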

Defined Under Namespace

Classes: Bucket, DriftRow, LoadError

Constant Summary

CURRENT_VERSION =

Instance Attribute Summary

#buckets ⇒ Object readonly

Returns the value of attribute buckets.

Class Method Summary

.from_diagnostics(diagnostics, match_mode: :rule, project_root: Dir.pwd) ⇒ Object

Build a baseline from a current run’s diagnostic stream.

.load(path, project_root: Dir.pwd) ⇒ Object

Load a baseline file from disk.

Instance Method Summary

#audit(diagnostics) ⇒ Array<DriftRow>

Walk the current diagnostic stream and report bucket-level drift.

#empty? ⇒ Boolean

#filter(diagnostics) ⇒ Object

Apply the baseline filter to a diagnostic stream.

#initialize(buckets, project_root: Dir.pwd) ⇒ Baseline constructor

A new instance of Baseline.

#size ⇒ Object

The number of buckets recorded.

#to_yaml ⇒ Object

Serialise to a YAML string.

#without(buckets_to_drop) ⇒ Object

Returns a new Baseline with the given buckets dropped.

Constructor Details

#initialize(buckets, project_root: Dir.pwd) ⇒ `Baseline`

Returns a new instance of Baseline.

# File 'lib/rigor/analysis/baseline.rb', line 200

def initialize(buckets, project_root: Dir.pwd)
  @project_root = Pathname.new(project_root)
  @buckets = buckets.freeze
  # For each (file, qualified_rule) pair, two arrays:
  # - rule-ID rows (message_regex == nil)
  # - message-pattern rows (message_regex != nil)
  # The matcher walks message-pattern rows first (tighter
  # match takes precedence); diagnostics that don't match
  # any message row fall through to the rule-ID row if
  # one exists.
  @by_pair = buckets.group_by { |b| [b.file, b.rule] }.freeze
  freeze
end
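The precedence described in the constructor comment (message-pattern rows before the rule-ID row) might look roughly like the sketch below. `SketchBucket` and `claim` are hypothetical stand-ins; the real `Bucket` class and the private matcher are not reproduced on this page.

```ruby
# Hypothetical stand-in for Bucket, using the same four fields that
# `.from_diagnostics` passes to `Bucket.new`.
SketchBucket = Struct.new(:file, :rule, :message_regex, :count, keyword_init: true)

# Pick the row that claims a diagnostic message within one
# (file, rule) group: the first matching message-pattern row wins,
# otherwise fall through to the rule-ID row (if any).
def claim(rows, message)
  message_rows, rule_rows = rows.partition(&:message_regex)
  message_rows.find { |row| row.message_regex.match?(message) } || rule_rows.first
end

rows = [
  SketchBucket.new(file: "app/lib/sig.rb", rule: "call.undefined-method",
                   message_regex: nil, count: 3),
  SketchBucket.new(file: "app/lib/sig.rb", rule: "call.undefined-method",
                   message_regex: /undefined method `merge' for Array/, count: 1)
]

tight = claim(rows, "undefined method `merge' for Array")
loose = claim(rows, "undefined method `size' for Hash")
```

Here `tight` is the message-pattern row (`count: 1`) and `loose` falls through to the rule-ID row (`count: 3`).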

Instance Attribute Details

#buckets ⇒ `Object` (readonly)

Returns the value of attribute buckets.



# File 'lib/rigor/analysis/baseline.rb', line 198

def buckets
  @buckets
end

Class Method Details

.from_diagnostics(diagnostics, match_mode: :rule, project_root: Dir.pwd) ⇒ `Object`

Build a baseline from a current run’s diagnostic stream. `match_mode:` is `:rule` (default) or `:message`. The message-mode generator passes literal messages through `Regexp.escape` so generated rows never accidentally over-match on punctuation.

`project_root:` is used to convert absolute diagnostic paths to relative paths in the generated YAML. Defaults to `Dir.pwd`.

Raises:

(ArgumentError)

# File 'lib/rigor/analysis/baseline.rb', line 102

def from_diagnostics(diagnostics, match_mode: :rule, project_root: Dir.pwd)
  raise ArgumentError, "match_mode must be :rule or :message" unless %i[rule message].include?(match_mode)

  grouped = group_for_baseline(diagnostics, match_mode, project_root)
  buckets = grouped.map do |key, entries|
    Bucket.new(
      file: key[0],
      rule: key[1],
      message_regex: key[2],
      count: entries.size
    )
  end
  new(buckets, project_root: project_root)
end
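To see why message mode escapes literal messages, compare an escaped pattern with a raw one. The message text here is made up for the illustration; only `Regexp.escape` and `Regexp.new` are relied on.

```ruby
# A made-up diagnostic message containing regex metacharacters.
message = "no method foo.bar (1 arg)"

escaped = Regexp.new(Regexp.escape(message))  # treats the text literally
raw     = Regexp.new(message)                 # treats `.` and `(...)` as syntax

escaped_hits_original = escaped.match?(message)
raw_hits_original     = raw.match?(message)
raw_hits_lookalike    = raw.match?("no method fooXbar 1 arg")
escaped_hits_lookalike = escaped.match?("no method fooXbar 1 arg")
```

The raw pattern fails to match its own source message (the parentheses become a group) yet matches an unrelated lookalike (the `.` matches any character). The escaped pattern matches exactly the original text.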

.load(path, project_root: Dir.pwd) ⇒ `Object`

Load a baseline file from disk. Returns `nil` when the path is nil (the caller’s “no baseline configured” state) and an empty Baseline when a path is given but no file exists there. Raises LoadError on malformed content; callers translate it into a user-facing diagnostic.

`project_root:` is the working directory against which stored relative paths are resolved during filtering. Defaults to `Dir.pwd`.

# File 'lib/rigor/analysis/baseline.rb', line 85

def load(path, project_root: Dir.pwd)
  return nil if path.nil?
  return new([], project_root: project_root) unless File.exist?(path)

  raw = YAML.safe_load_file(path, permitted_classes: [Symbol])
  parse_loaded(raw, path: path, project_root: project_root)
end
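The three outcomes of `.load` (nil path, missing file, readable file) can be mimicked with a small stand-in. `load_rows` is a hypothetical function that returns raw row hashes instead of a Baseline, and it assumes the rows live under the `"ignored"` key that `#to_yaml` writes; validation and the `version` field are left out.

```ruby
require "yaml"
require "tempfile"

# Hypothetical stand-in for Baseline.load that returns raw rows.
def load_rows(path)
  return nil if path.nil?             # no baseline configured
  return [] unless File.exist?(path)  # configured, but nothing on disk

  document = YAML.safe_load_file(path, permitted_classes: [Symbol])
  document.fetch("ignored")
end

file = Tempfile.new(["baseline", ".yml"])
file.write(<<~YAML)
  ignored:
    - file: app/models/user.rb
      rule: call.undefined-method
      count: 3
YAML
file.close

not_configured = load_rows(nil)
missing        = load_rows("#{file.path}.missing")
loaded         = load_rows(file.path)
```

A caller can therefore distinguish "no baseline configured" (`nil`) from "baseline configured but empty" (no rows).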

Instance Method Details

#audit(diagnostics) ⇒ `Array<DriftRow>`

Walk the current diagnostic stream and report bucket-level drift. Each baseline bucket becomes one DriftRow regardless of whether the current run still matches it.

Parameters:

diagnostics (Array<Diagnostic>) —

current run’s diagnostic stream, PRE-filter: pass the raw `result.diagnostics` from `Runner#run`, not the post-baseline surface.

Returns:

(Array<DriftRow>) —

one entry per baseline bucket, in baseline-file order.

# File 'lib/rigor/analysis/baseline.rb', line 275

def audit(diagnostics)
  counts = Hash.new(0)
  diagnostics.each do |diag|
    next if diag.qualified_rule.nil? || diag.path.nil?

    bucket = claim_bucket_for(diag)
    counts[bucket_key(bucket)] += 1 if bucket
  end

  buckets.map do |bucket|
    actual = counts[bucket_key(bucket)]
    DriftRow.new(bucket: bucket, actual_count: actual, status: status_for(actual, bucket.count))
  end
end
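The counting step can be illustrated without the class itself. In this sketch, `[file, rule]` arrays stand in for both bucket keys and diagnostics, and each output row is a plain hash; `DriftRow` and `status_for` are not modelled because their definitions are not shown here.

```ruby
# Recorded counts per bucket key, in baseline-file order.
baseline = {
  ["app/models/user.rb", "call.undefined-method"] => 3,
  ["app/lib/sig.rb", "call.undefined-method"]     => 1
}

# Keys of the diagnostics seen in the current (pre-filter) run.
current = [
  ["app/models/user.rb", "call.undefined-method"],
  ["app/models/user.rb", "call.undefined-method"],
  ["app/other.rb", "call.undefined-method"]        # not in the baseline
]

# Tally only diagnostics that some bucket claims.
counts = Hash.new(0)
current.each { |key| counts[key] += 1 if baseline.key?(key) }

# One row per baseline bucket, even when nothing matched it.
rows = baseline.map do |key, recorded|
  { key: key, recorded: recorded, actual: counts[key] }
end
```

The second bucket still produces a row with `actual: 0`, which is what makes cleared buckets visible to a later prune step.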

#empty? ⇒ `Boolean`

Returns:

(Boolean)



# File 'lib/rigor/analysis/baseline.rb', line 320

def empty?
  buckets.empty?
end

#filter(diagnostics) ⇒ `Object`

Apply the baseline filter to a diagnostic stream.

Returns a 2-tuple:

`surfaced`: the diagnostics that survived the filter (new findings plus entire over-threshold buckets).
`silenced_count`: how many diagnostics the baseline suppressed (for the WD7 stderr summary line).

# File 'lib/rigor/analysis/baseline.rb', line 221

def filter(diagnostics)
  return [diagnostics, 0] if buckets.empty?

  grouped = group_diagnostics_for_filtering(diagnostics)
  surfaced = []
  silenced_count = 0

  grouped.each_value do |entries|
    bucket = entries[:bucket]
    diags = entries[:diagnostics]
    # No matching bucket → all surface as new findings.
    # `actual <= count` → all silenced (within threshold,
    # WD4). `actual >  count` → all surface (over
    # threshold, WD4).
    if bucket && diags.size <= bucket.count
      silenced_count += diags.size
    else
      surfaced.concat(diags)
    end
  end

  # Diagnostics that lacked a rule or a path bypass the
  # baseline entirely (the baseline can't address them).
  unkeyable = diagnostics.reject { |d| d.qualified_rule && d.path }
  [surfaced + unkeyable, silenced_count]
end
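A self-contained approximation of the whole filter, including the bypass for diagnostics without a rule or path, is sketched below. `Diag` and `sketch_filter` are hypothetical, and a hash of recorded counts keyed by `[path, rule]` replaces the real bucket lookup (message-pattern rows are ignored for brevity).

```ruby
# Hypothetical stand-in for a diagnostic.
Diag = Struct.new(:path, :qualified_rule, keyword_init: true)

def sketch_filter(diagnostics, recorded_counts)
  keyable, unkeyable = diagnostics.partition { |d| d.qualified_rule && d.path }
  surfaced = []
  silenced = 0

  keyable.group_by { |d| [d.path, d.qualified_rule] }.each do |key, diags|
    recorded = recorded_counts[key]
    if recorded && diags.size <= recorded
      silenced += diags.size   # within threshold: silence the bucket
    else
      surfaced.concat(diags)   # no bucket, or over threshold
    end
  end

  # Diagnostics the baseline cannot address always surface.
  [surfaced + unkeyable, silenced]
end

diags = [
  Diag.new(path: "a.rb", qualified_rule: "r1"),
  Diag.new(path: "a.rb", qualified_rule: "r1"),
  Diag.new(path: "b.rb", qualified_rule: "r2"),
  Diag.new(path: nil,    qualified_rule: "r3")
]

surfaced, silenced_count = sketch_filter(diags, { ["a.rb", "r1"] => 2 })
```

The two `a.rb` diagnostics are silenced, the `b.rb` one surfaces as a new finding, and the path-less one bypasses the baseline entirely.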

#size ⇒ `Object`

The number of buckets recorded. Useful for the CLI summary on `generate`.



# File 'lib/rigor/analysis/baseline.rb', line 316

def size
  buckets.size
end

#to_yaml ⇒ `Object`

Serialise to a YAML string. The generator path writes this through `File.write`; the dump format is stable across versions of this class as long as the bucket shape is unchanged.

# File 'lib/rigor/analysis/baseline.rb', line 302

def to_yaml
  rows = buckets.map do |bucket|
    row = { "file" => bucket.file, "rule" => bucket.rule }
    row["message"] = bucket.message_regex.source if bucket.message_regex
    row["count"] = bucket.count
    row
  end

  document = { "version" => CURRENT_VERSION, "ignored" => rows }
  YAML.dump(document)
end
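The document shape produced by `#to_yaml` can be checked with a quick round trip. The value of `CURRENT_VERSION` is not shown on this page, so `SKETCH_VERSION = 1` below is purely an assumption; the row is likewise invented for the example.

```ruby
require "yaml"

SKETCH_VERSION = 1  # assumption: the real CURRENT_VERSION is not shown

literal = "undefined method `merge' for Array"
regex   = Regexp.new(Regexp.escape(literal))

# Same key order as #to_yaml: file, rule, optional message, count.
row = { "file" => "app/lib/sig.rb", "rule" => "call.undefined-method" }
row["message"] = regex.source
row["count"] = 1

document = { "version" => SKETCH_VERSION, "ignored" => [row] }
yaml = YAML.dump(document)

# Reading the dump back gives the same structure, and the stored
# pattern source still matches the literal message.
reloaded = YAML.safe_load(yaml)
restored = Regexp.new(reloaded["ignored"][0]["message"])
```

Because only strings and integers are dumped, the file can be read back with the safe loader.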

#without(buckets_to_drop) ⇒ `Object`

Returns a new Baseline with the given buckets dropped. Used by `rigor baseline prune` (slice 2) to remove cleared buckets (`actual == 0`) from the on-disk file.

# File 'lib/rigor/analysis/baseline.rb', line 293

def without(buckets_to_drop)
  dropset = buckets_to_drop.to_set
  self.class.new(buckets.reject { |b| dropset.include?(b) }, project_root: @project_root)
end
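A prune step built on the same idea might select the cleared rows first and then drop them via a set lookup, as `#without` does. `PruneRow` and the `actual_counts` hash are hypothetical stand-ins for buckets and audit results.

```ruby
require "set"

# Hypothetical stand-in for a bucket.
PruneRow = Struct.new(:file, :rule, :count)

rows = [
  PruneRow.new("app/models/user.rb", "call.undefined-method", 3),
  PruneRow.new("app/lib/sig.rb", "call.undefined-method", 1)
]

# Assumed audit result: current match count per row.
actual_counts = { rows[0] => 2, rows[1] => 0 }

# Cleared rows are those with no remaining matches.
cleared = rows.select { |row| actual_counts[row].zero? }

# Drop them with a set lookup, leaving the original array untouched.
dropset = cleared.to_set
kept = rows.reject { |row| dropset.include?(row) }
```

Building a new collection instead of mutating the old one mirrors the frozen, immutable design of the class.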
