Class: Rigor::Analysis::Runner

Inherits:
Object
  • Object
Defined in:
lib/rigor/analysis/runner.rb,
lib/rigor/analysis/runner/run_snapshots.rb,
lib/rigor/analysis/runner/pool_coordinator.rb,
lib/rigor/analysis/runner/project_pre_passes.rb,
lib/rigor/analysis/runner/diagnostic_aggregator.rb


Defined Under Namespace

Classes: DiagnosticAggregator, PoolCoordinator, ProjectPrePasses, RunSnapshots

Constant Summary

RUBY_GLOB = "**/*.rb"
DEFAULT_CACHE_ROOT = ".rigor/cache"

Constructor Details

#initialize(configuration:, explain: false, cache_store: Cache::Store.new(root: DEFAULT_CACHE_ROOT), plugin_requirer: nil, workers: 0, collect_stats: true, buffer: nil, prebuilt: nil, environment: nil, record_dependencies: false, record_self_calls: false, analyze_only: nil) ⇒ Runner

Returns a new instance of Runner.

Parameters:

  • configuration (Rigor::Configuration)
  • explain (Boolean) (defaults to: false)

    surface fail-soft fallback events as `:info` diagnostics.

  • cache_store (Rigor::Cache::Store, nil) (defaults to: Cache::Store.new(root: DEFAULT_CACHE_ROOT))

    the persistent cache the runner exposes to producers (`RbsConstantTable` and successors). Pass `nil` to disable caching for this run; the CLI’s `--no-cache` flag wires `nil` through. v0.0.9 group A slice 1 introduces the surface; later slices route real producers through it.

  • workers (Integer) (defaults to: 0)

    ADR-15 Phase 4b: when greater than zero, per-file analysis dispatches across a pool of N Ractor workers built around WorkerSession. The default `0` keeps the sequential code path bit-for-bit unchanged. Phase 4c will wire the CLI / `.rigor.yml` surface that produces non-zero values; this slice leaves the parameter as a programmatic opt-in only.

  • collect_stats (Boolean) (defaults to: true)

    when true (default), `#run` builds a Rigor::Analysis::RunStats summary exposed via `result.stats`; this forces the RBS env build at end-of-run so the `class_decl_paths` snapshot has real source attribution. Set to false to skip the stats summary entirely; the CLI’s `--no-stats` threads `false` through to keep trivial-fixture runs from warming `.rigor/cache`.

  • prebuilt (Rigor::Analysis::ProjectScan, nil) (defaults to: nil)

    when supplied, the runner adopts the pre-built plugin registry / dependency-source index / scanner outputs from the snapshot and skips the per-call pre-passes that produce them. Used by long-lived integrations (`Rigor::LanguageServer::ProjectContext`) to keep per-buffer requests fast: scanners walk the project once per generation rather than once per request, and plugin `#prepare` runs once per generation rather than once per request. Watched-file invalidation is the owner’s responsibility; the runner trusts the snapshot it was given.

  • environment (Rigor::Environment, nil) (defaults to: nil)

    opt-in Environment override. When supplied, sequential mode uses the provided env instance in `#analyze_files` instead of building a fresh one via `Environment.for_project`, and attaches the runner’s per-run reporter pair onto the env’s mutable `Reporters` slot via `Environment#attach_reporters!`. Long-lived consumers (LSP `ProjectContext`) pass a shared env so per-publish work doesn’t repeat the `Environment.for_project` build (bundler / lockfile / collection discovery, RbsLoader construction). Pool mode ignores the override; each worker continues to build its own Environment.



# File 'lib/rigor/analysis/runner.rb', line 97

def initialize(configuration:, explain: false, # rubocop:disable Metrics/ParameterLists,Metrics/AbcSize,Metrics/MethodLength
               cache_store: Cache::Store.new(root: DEFAULT_CACHE_ROOT),
               plugin_requirer: nil, workers: 0, collect_stats: true,
               buffer: nil, prebuilt: nil, environment: nil,
               record_dependencies: false, record_self_calls: false, analyze_only: nil)
  @configuration = configuration
  @explain = explain
  @cache_store = enforce_read_only_cache(cache_store, buffer)
  @plugin_requirer = plugin_requirer
  @workers = workers
  @collect_stats = collect_stats
  @buffer = buffer
  @prebuilt = prebuilt
  @environment_override = environment
  # ADR-46 slice 1 — opt-in cross-file dependency recording. Off by
  # default; when true, `analyze_file` records each file's
  # cross-file reads into `file_dependencies` (the incremental
  # cache, a later slice, consumes them).
  @record_dependencies = record_dependencies
  # ADR-24 slice 4a — opt-in unresolved-implicit-self-call recording.
  # Off by default; when true, `analyze_file` activates the engine
  # choke-point recorder and collects each file's misses into
  # `unresolved_self_calls` (a later closed-class-gated rule consumes
  # them). Purely observational — diagnostics are byte-identical.
  @record_self_calls = record_self_calls
  @unresolved_self_calls = {}
  # Memoised activation decision for the `call.self-undefined-method`
  # rule (nil = not yet computed). See `self_undefined_rule_active?`.
  @self_undefined_rule_active = nil
  @analyzed_files = [].freeze
  # In-memory source map for `#run_source` — `{ logical_path => source
  # String }`. When set, `parse_source` reads bytes from here instead
  # of disk and `expand_paths` accepts the (possibly non-existent)
  # logical path. nil on a normal disk-backed run.
  @in_memory_sources = nil
  # ADR-46 slice 2 — the subset-analysis hook. When set (a collection
  # of paths), the whole-project pre-pass still runs over every file
  # (so the cross-file index is complete), but only files in this set
  # are analyzed for diagnostics — the body tier re-analyses the
  # affected closure and serves the rest from the per-file cache.
  # `nil` (the default) analyzes everything.
  @analyze_only = analyze_only && Set.new(analyze_only)
  @file_dependencies = {}
  @plugin_registry = Plugin::Registry::EMPTY
  @dependency_source_index = DependencySourceInference::Index::EMPTY
  @rbs_extended_reporter = RbsExtended::Reporter.new
  @boundary_cross_reporter = DependencySourceInference::BoundaryCrossReporter.new
  @source_rbs_synthesis_reporter = Plugin::SourceRbsSynthesisReporter.new
  # `#run` resets these for each invocation; pre-seed them to
  # empty containers so `build_run_stats` / `pre_file_diagnostics`
  # (private, called only from `#run`) can read them without
  # nil-guards. The four end-of-pass snapshots (RBS class /
  # signature-path tables, synthesized-namespace names,
  # `rigor:v1:conforms-to` results) live in one shared mutable
  # {RunSnapshots} sink so the analysis path that writes them and
  # the run / aggregator code that reads them stay in separate
  # collaborators without a back-reference cycle.
  @snapshots = RunSnapshots.new
  @cached_plugin_prepare_diagnostics = [].freeze
  @project_discovered_classes = {}.freeze
  @project_discovered_def_nodes = {}.freeze
  @project_discovered_def_sources = {}.freeze
  @project_discovered_superclasses = {}.freeze
  @project_discovered_includes = {}.freeze
  @project_discovered_class_sources = {}.freeze
  @project_discovered_method_visibilities = {}.freeze
  @project_discovered_methods = {}.freeze
  @project_data_member_layouts = {}.freeze
  build_collaborators
end

Instance Attribute Details

#analyzed_files ⇒ Object (readonly)

Returns the value of attribute analyzed_files.

# File 'lib/rigor/analysis/runner.rb', line 46

def analyzed_files
  @analyzed_files
end

#boundary_cross_reporter ⇒ Object (readonly)

Returns the value of attribute boundary_cross_reporter.

# File 'lib/rigor/analysis/runner.rb', line 46

def boundary_cross_reporter
  @boundary_cross_reporter
end

#buffer ⇒ Object (readonly)

ADR-pending editor mode: present when the runner is wired for the `--tmp-file` / `--instead-of` buffer-binding shape (`docs/design/20260516-editor-mode.md`). Nil for normal project runs.

# File 'lib/rigor/analysis/runner.rb', line 172

def buffer
  @buffer
end

#cache_store ⇒ Object (readonly)

Returns the value of attribute cache_store.

# File 'lib/rigor/analysis/runner.rb', line 46

def cache_store
  @cache_store
end

#dependency_source_index ⇒ Object (readonly)

Returns the value of attribute dependency_source_index.

# File 'lib/rigor/analysis/runner.rb', line 46

def dependency_source_index
  @dependency_source_index
end

#file_dependencies ⇒ Object (readonly)

Returns the value of attribute file_dependencies.

# File 'lib/rigor/analysis/runner.rb', line 46

def file_dependencies
  @file_dependencies
end

#plugin_registry ⇒ Object (readonly)

Returns the value of attribute plugin_registry.

# File 'lib/rigor/analysis/runner.rb', line 46

def plugin_registry
  @plugin_registry
end

#rbs_extended_reporter ⇒ Object (readonly)

Returns the value of attribute rbs_extended_reporter.

# File 'lib/rigor/analysis/runner.rb', line 46

def rbs_extended_reporter
  @rbs_extended_reporter
end

#unresolved_self_calls ⇒ Object (readonly)

Returns the value of attribute unresolved_self_calls.

# File 'lib/rigor/analysis/runner.rb', line 46

def unresolved_self_calls
  @unresolved_self_calls
end

Instance Method Details

#analysis_file_set(paths = @configuration.paths) ⇒ Object

ADR-46: the project file set that a run over `paths` would analyze, computed by globbing only (no RBS environment build), so the incremental fingerprint can be derived cheaply on the warm path before deciding whether to build the env at all.

# File 'lib/rigor/analysis/runner.rb', line 232

def analysis_file_set(paths = @configuration.paths)
  expand_paths(paths).fetch(:files)
end

#analyzed_file_entries(expansion) ⇒ Object



# File 'lib/rigor/analysis/runner.rb', line 393

def analyzed_file_entries(expansion)
  expansion.fetch(:files).map do |path|
    physical = @buffer ? @buffer.resolve(path) : path
    Cache::Descriptor::FileEntry.new(
      path: physical, comparator: :digest, value: Digest::SHA256.file(physical).hexdigest
    )
  end
end

#assemble_run_diagnostics(expansion, environment: nil) ⇒ Object



# File 'lib/rigor/analysis/runner.rb', line 321

def assemble_run_diagnostics(expansion, environment: nil)
  diagnostics = @diagnostic_aggregator.pre_file_diagnostics(expansion)
  # ADR-46 — record which project files this run actually analyzed
  # (the `analyze_only` subset, or all of them). The incremental
  # orchestrator serves every analyzed-but-not-affected file from the
  # per-file cache, so it needs the full analyzed set to subtract the
  # affected closure from.
  targets = target_files(expansion)
  @analyzed_files = targets
  diagnostics += @pool_coordinator.analyze_files(targets, environment: environment)
  diagnostics += @diagnostic_aggregator.rbs_synthesized_namespace_diagnostics
  diagnostics += @diagnostic_aggregator.conforms_to_diagnostics
  diagnostics += @diagnostic_aggregator.rbs_extended_reporter_diagnostics
  diagnostics += @diagnostic_aggregator.boundary_cross_diagnostics
  diagnostics + @diagnostic_aggregator.source_rbs_synthesis_diagnostics
end

#class_declarations ⇒ Object

ADR-46 slice 3 — per-file set of the qualified class/module names declared in that file. Used to detect a class that appeared in an edit so a subclass whose ancestor was previously undefined (and so recorded a negative class edge) is re-checked. Inverts the project class-source attribution (class → declaring files).



# File 'lib/rigor/analysis/runner.rb', line 279

def class_declarations
  result = Hash.new { |hash, key| hash[key] = Set.new }
  @project_discovered_class_sources.each do |class_name, files|
    files.each { |file| result[file] << class_name }
  end
  result.transform_values(&:freeze).freeze
end
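
The inversion can be tried in isolation. This is a minimal sketch, assuming a small hand-written `class_sources` hash in place of `@project_discovered_class_sources`; the helper name `invert_class_sources` and the sample paths are illustrative only.

```ruby
require "set"

# Hypothetical sample of the class-source attribution: class name => declaring files.
class_sources = {
  "Billing::Invoice" => ["app/models/billing/invoice.rb"],
  "Billing" => ["app/models/billing.rb", "app/models/billing/invoice.rb"]
}

# Invert to file => frozen Set of the class names declared in that file.
def invert_class_sources(class_sources)
  result = Hash.new { |hash, key| hash[key] = Set.new }
  class_sources.each do |class_name, files|
    files.each { |file| result[file] << class_name }
  end
  result.transform_values(&:freeze).freeze
end

declarations = invert_class_sources(class_sources)
```

A class reopened in several files shows up under each of them, which is what lets an edit to any one file surface the class as newly declared.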

#compute_run_diagnostics(expansion) ⇒ Object

ADR-45 — unchanged-project fast path. Serves the whole run’s (pre-severity-profile) diagnostics from one record-and-validate cache entry when every file the previous run read is unchanged, skipping the dominant per-file inference. The dependency set is collected AFTER the run (so it captures files the plugins read mid-analysis, e.g. a Pundit policy) and re-validated on the next run; the entry is keyed on the inputs known up front (config, gem / engine versions, analyzed-path set).



# File 'lib/rigor/analysis/runner.rb', line 295

def compute_run_diagnostics(expansion)
  @run_served_from_cache = false
  return assemble_run_diagnostics(expansion) unless run_result_cacheable?

  environment = @pool_coordinator.resolve_sequential_environment(source_files: target_files(expansion))
  rbs_descriptor = environment&.rbs_loader ? Cache::RbsDescriptor.build(environment.rbs_loader) : Cache::Descriptor.new
  key_descriptor = run_key_descriptor(expansion, rbs_descriptor)
  return assemble_run_diagnostics(expansion, environment: environment) if key_descriptor.nil?

  computed = false
  diagnostics = @cache_store.fetch_or_validate(
    producer_id: "analysis.run-diagnostics", key_descriptor: key_descriptor
  ) do
    computed = true
    diags = assemble_run_diagnostics(expansion, environment: environment)
    [diags, run_dependency_descriptor(expansion, rbs_descriptor)]
  end
  @run_served_from_cache = !computed
  diagnostics
rescue StandardError
  # The result cache must never break a run. If anything in the
  # cache path fails, fall back to a direct, uncached analysis.
  @run_served_from_cache = false
  assemble_run_diagnostics(expansion)
end
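
The record-and-validate idea can be illustrated without the real cache store. The sketch below is an assumption-laden stand-in: `TinyValidatingCache` and its one-argument `fetch_or_validate` are hypothetical and far simpler than `Cache::Store`, but they follow the same shape, where the block returns the value together with the dependencies it discovered while running.

```ruby
require "digest"
require "tempfile"

# Hypothetical in-memory stand-in for a record-and-validate cache store.
class TinyValidatingCache
  Entry = Struct.new(:value, :file_digests)

  def initialize
    @entries = {}
  end

  # The block returns [value, dependency_paths]. Dependencies are digested
  # after the block runs and re-checked before a later hit is served.
  def fetch_or_validate(key)
    entry = @entries[key]
    return entry.value if entry && fresh?(entry)

    value, paths = yield
    @entries[key] = Entry.new(value, paths.to_h { |path| [path, digest(path)] })
    value
  end

  private

  def fresh?(entry)
    entry.file_digests.all? { |path, recorded| File.exist?(path) && digest(path) == recorded }
  end

  def digest(path)
    Digest::SHA256.file(path).hexdigest
  end
end

file = Tempfile.new(["sample", ".rb"])
file.write("puts 1\n")
file.flush

cache = TinyValidatingCache.new
runs = 0
analyze = -> { runs += 1; [["diagnostic ##{runs}"], [file.path]] }

first = cache.fetch_or_validate("run", &analyze)   # computes
second = cache.fetch_or_validate("run", &analyze)  # dependencies unchanged: served from cache

file.write("puts 2\n")
file.flush
third = cache.fetch_or_validate("run", &analyze)   # dependency changed: recomputes
file.close!
```

Recording dependencies after the block runs is the point of the design: files read mid-analysis are only known once the analysis has finished.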

#config_hash_entry(key, payload) ⇒ Object



# File 'lib/rigor/analysis/runner.rb', line 402

def config_hash_entry(key, payload)
  Cache::Descriptor::ConfigEntry.new(key: key, value_hash: Digest::SHA256.hexdigest(payload))
end

#file_dependents ⇒ Object

ADR-46 §2. Inverts #file_dependencies into the reverse edge the incremental step walks: `dependents[X] = { A : A read a declaration / body from X }`. On an edit to X, the body tier (slice 2) re-analyses `X ∪ dependents[X]` and serves every other file from the per-file cache. Built on demand from the recorded `sources` sets (so it reflects whatever `analyze_file` captured; empty unless the runner was constructed with `record_dependencies: true`). The negative (`missing`) edges are NOT inverted here: they feed the structural tier (slice 3), which re-checks a consumer when a name it looked up and did not resolve later appears.

# File 'lib/rigor/analysis/runner.rb', line 247

def file_dependents
  Incremental.invert(@file_dependencies.transform_values(&:sources))
end
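
`Incremental.invert` is not shown on this page, so the following is only a sketch of what such an inversion could look like, assuming the input maps each consumer file to the set of files it read. The `invert` helper and the sample file names are hypothetical.

```ruby
require "set"

# Hypothetical stand-in for Incremental.invert: takes { consumer => sources it read }
# and returns { source => consumers that read it }.
def invert(dependencies)
  dependents = Hash.new { |hash, key| hash[key] = Set.new }
  dependencies.each do |consumer, sources|
    sources.each { |source| dependents[source] << consumer }
  end
  dependents.default_proc = nil # lookups of unknown files must not create entries
  dependents
end

file_dependencies = {
  "a.rb" => Set["x.rb"],
  "b.rb" => Set["x.rb", "y.rb"],
  "c.rb" => Set[]
}

dependents = invert(file_dependencies)

# Files to re-analyse after an edit to x.rb: the file itself plus its dependents.
affected = Set["x.rb"] | dependents.fetch("x.rb", Set.new)
```

Every file outside `affected` would be a candidate for serving from the per-file cache.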

#prepare_project_scan(paths: @configuration.paths) ⇒ Object

Runs every project-wide pre-pass (`load_plugins` + `plugin#prepare` + dependency-source builder + synthetic-method scanner + project-patched scanner) exactly once, then returns a frozen ProjectScan snapshot.

Long-lived integrations (`Rigor::LanguageServer::ProjectContext`) call this once per project-state generation and feed the snapshot back into `Runner.new(prebuilt: scan)` for every subsequent per-buffer publish. The cold pre-pass cost is paid once per generation rather than once per keystroke.

Notes for callers:

  • The runner this method is called on may be a "build only" instance; `@buffer` is typically nil so the scanners observe on-disk bytes for the full project. Callers that want pre-passes to see a particular buffer’s edits should build the runner WITH `buffer:` set.

  • The returned ProjectScan is frozen and shareable; the underlying `plugin_registry` is the same object that ran `#prepare`, so the per-plugin `services.fact_store` is already populated for subsequent dispatch use.

# File 'lib/rigor/analysis/runner.rb', line 428

def prepare_project_scan(paths: @configuration.paths)
  expansion = expand_paths(paths)
  result = @pre_passes.run(expansion: expansion)
  apply_pre_passes_result(result)
  @pre_passes.build_project_scan(result)
end
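
The "once per generation" contract is the owner's job, not the runner's. A minimal sketch of that owner-side bookkeeping follows; `ScanCache`, its builder block, and the integer generations are all hypothetical and stand in for whatever `ProjectContext` actually does.

```ruby
# Hypothetical owner-side cache: one scan per project-state generation.
class ScanCache
  attr_reader :builds

  def initialize(&builder)
    @builder = builder
    @generation = nil
    @scan = nil
    @builds = 0
  end

  # Rebuild only when the generation changes; otherwise reuse the snapshot.
  def scan_for(generation)
    return @scan if @scan && generation == @generation

    @builds += 1
    @generation = generation
    @scan = @builder.call.freeze
  end
end

# The block stands in for a call such as `runner.prepare_project_scan`.
cache = ScanCache.new { { plugins: [], scanner_outputs: {} } }
first = cache.scan_for(1)
again = cache.scan_for(1)   # same generation: snapshot reused
fresh = cache.scan_for(2)   # owner bumped the generation: rebuilt
```

Bumping the generation on a watched-file change is what keeps the snapshot honest, since the runner trusts whatever it is handed.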

#run(paths = @configuration.paths) ⇒ Object

Walks every Ruby file under `paths`, parses it, builds a per-node scope index through `Rigor::Inference::ScopeIndexer`, and runs the `Rigor::Analysis::CheckRules` catalogue over it. Returns a `Rigor::Analysis::Result` aggregating every produced diagnostic plus any Prism parse errors. The Environment is built once at run start through `Environment.for_project` so all files share the same RBS load.

# File 'lib/rigor/analysis/runner.rb', line 182

def run(paths = @configuration.paths)
  Inference::MethodDispatcher::FileFolding.fold_platform_specific_paths =
    @configuration.fold_platform_specific_paths

  wall_started_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)

  target_ruby_error = validate_target_ruby
  return Result.new(diagnostics: [target_ruby_error]) if target_ruby_error

  expansion = expand_paths(paths)
  @snapshots.reset_for_run

  if @prebuilt
    adopt_prebuilt_project_scan(@prebuilt)
  else
    run_project_pre_passes(expansion: expansion)
  end

  diagnostics = compute_run_diagnostics(expansion)

  Result.new(
    diagnostics: @diagnostic_aggregator.apply_severity_profile(diagnostics),
    stats: stats_for_run(wall_started_at: wall_started_at, expansion: expansion)
  )
end

#run_dependency_descriptor(expansion, rbs_descriptor) ⇒ Object

Files the run actually depended on, collected AFTER it ran: every analyzed file, every RBS `sig` file (`rbs_descriptor.files`), and every file each plugin read (complete post-run, so reads made mid-analysis are included). Re-digested on the next run by Descriptor#fresh?.

# File 'lib/rigor/analysis/runner.rb', line 380

def run_dependency_descriptor(expansion, rbs_descriptor)
  entries = analyzed_file_entries(expansion) + rbs_descriptor.files
  @plugin_registry.plugins.each do |plugin|
    # Read the boundary WITHOUT triggering its lazy `@io_boundary ||=`
    # initializer: plugin instances are frozen after the run, and a
    # plugin that never built a boundary read no files through it, so
    # it contributes no dependencies.
    boundary = plugin.instance_variable_get(:@io_boundary)
    entries.concat(boundary.cache_descriptor.files) if boundary
  end
  Cache::Descriptor.new(files: entries)
end

#run_key_descriptor(expansion, rbs_descriptor) ⇒ Object

Stable cache key inputs, known before the run: a digest of the resolved configuration, the engine + rbs versions + `--explain`, and the analyzed-path SET (adding/removing a file changes the key; editing one is caught by dependency validation). nil disables the cache for this run rather than risking a malformed key.

# File 'lib/rigor/analysis/runner.rb', line 362

def run_key_descriptor(expansion, rbs_descriptor)
  Cache::Descriptor.new(
    gems: rbs_descriptor.gems,
    configs: rbs_descriptor.configs + [
      config_hash_entry("configuration", Marshal.dump(@configuration.to_h)),
      config_hash_entry("engine", "#{Rigor::VERSION}:#{Cache::Descriptor::SCHEMA_VERSION}:#{@explain}"),
      config_hash_entry("paths", expansion.fetch(:files).sort.join("\n"))
    ]
  )
rescue StandardError
  nil
end
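
The key's behaviour (stable under path reordering, sensitive to the path set and to `--explain`) can be checked with a small sketch. `run_key` below is a hypothetical flattening of the three config entries into one digest; the real method returns a `Cache::Descriptor`, not a string.

```ruby
require "digest"

# Hypothetical key builder: digests of the configuration, the engine
# version plus the explain flag, and the sorted analyzed-path set.
def run_key(config:, engine_version:, explain:, files:)
  parts = [
    Digest::SHA256.hexdigest(Marshal.dump(config)),
    Digest::SHA256.hexdigest("#{engine_version}:#{explain}"),
    Digest::SHA256.hexdigest(files.sort.join("\n"))
  ]
  Digest::SHA256.hexdigest(parts.join(":"))
end

config = { "target_ruby" => "3.3", "paths" => ["lib"] }

base      = run_key(config: config, engine_version: "0.0.9", explain: false, files: ["lib/a.rb", "lib/b.rb"])
reordered = run_key(config: config, engine_version: "0.0.9", explain: false, files: ["lib/b.rb", "lib/a.rb"])
added     = run_key(config: config, engine_version: "0.0.9", explain: false, files: ["lib/a.rb", "lib/b.rb", "lib/c.rb"])
explained = run_key(config: config, engine_version: "0.0.9", explain: true,  files: ["lib/a.rb", "lib/b.rb"])
```

File contents are deliberately absent from the key; edits are caught later by re-digesting the recorded dependencies.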

#run_result_cacheable? ⇒ Boolean

Cacheable only for a full sequential project run with a writable cache and no per-buffer / prebuilt override — every other mode has a different result identity (pool workers read in separate processes; editor mode is per-buffer; prebuilt is the LSP path).

Returns:

  • (Boolean)


# File 'lib/rigor/analysis/runner.rb', line 352

def run_result_cacheable?
  !@cache_store.nil? && !@cache_store.read_only? &&
    @buffer.nil? && @prebuilt.nil? && !pool_mode?
end

#run_source(source:, path: "(source).rb") ⇒ Result

Analyze a single source String in memory, without writing it to disk. This is a clean entry point for embedders (LSP / editor mode) and a faster spec path than the per-call tmpdir + chdir. The source is bound to `path` (purely a logical identity carried in diagnostic locations; it need not exist on disk). The full run machinery still runs (environment build, plugin `prepare`, severity profile), so the result matches a one-file disk run; only the cross-file project pre-pass is empty (there is one file, and the per-file indexer self-discovers its own classes / defs).

Parameters:

  • source (String)

    Ruby source to analyze.

  • path (String) (defaults to: "(source).rb")

    logical path for diagnostic locations.

Returns:

  • (Result)

# File 'lib/rigor/analysis/runner.rb', line 221

def run_source(source:, path: "(source).rb")
  @in_memory_sources = { path => source }
  run([path])
ensure
  @in_memory_sources = nil
end
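
The set-then-reset-in-`ensure` pattern is easy to exercise on its own. The sketch below uses a hypothetical `TinyRunner` whose `run` merely counts lines; it is not the Rigor class, only an illustration of how the in-memory map is scoped to a single call.

```ruby
# Hypothetical miniature of the pattern; not the Rigor classes themselves.
class TinyRunner
  attr_reader :in_memory_sources

  def initialize
    @in_memory_sources = nil
  end

  def run(paths)
    paths.map { |path| "#{path}: #{read(path).lines.count} line(s)" }
  end

  def run_source(source:, path: "(source).rb")
    @in_memory_sources = { path => source }
    run([path])
  ensure
    # Always cleared, even if `run` raises, so later disk runs are unaffected.
    @in_memory_sources = nil
  end

  private

  # Prefer the in-memory map; fall back to disk for a normal run.
  def read(path)
    @in_memory_sources&.fetch(path, nil) || File.read(path)
  end
end

runner = TinyRunner.new
report = runner.run_source(source: "x = 1\ny = 2\n")
```

Because the reset lives in `ensure`, the method's return value is still the result of `run([path])`.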

#stats_for_run(wall_started_at:, expansion:) ⇒ Object

A cache hit skipped the analysis, so the per-run stats (wall split, RBS-class counts, …) were never gathered — report none rather than the stale snapshot defaults.



# File 'lib/rigor/analysis/runner.rb', line 341

def stats_for_run(wall_started_at:, expansion:)
  return nil unless @collect_stats
  return nil if @run_served_from_cache

  build_run_stats(wall_started_at: wall_started_at, expansion: expansion)
end

#symbol_fingerprints ⇒ Object

ADR-46 slice 4 — per-symbol body fingerprints, computed from the project pre-pass def index. Returns a frozen hash of the form:

{ "path/to/file.rb" => { "ClassName#method" => sha256_hex, … }, … }

Used by IncrementalSession to detect which symbols in a changed file actually changed bodies, so only callers of those specific symbols are re-checked. Only meaningful after a run that populated `@project_discovered_def_nodes` (i.e. any full or subset analysis); returns an empty frozen hash before the first run.

# File 'lib/rigor/analysis/runner.rb', line 259

def symbol_fingerprints
  result = Hash.new { |h, k| h[k] = {} }
  @project_discovered_def_sources.each do |class_name, methods|
    methods.each do |method_sym, path_line|
      path = path_line.split(":", 2).first
      node = @project_discovered_def_nodes.dig(class_name, method_sym)
      next unless node

      result[path]["#{class_name}##{method_sym}"] =
        Digest::SHA256.hexdigest(node.location.slice)
    end
  end
  result.transform_values(&:freeze).freeze
end
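
How a consumer might compare two fingerprint snapshots is not shown here, so the following is a sketch under assumptions: `fingerprints` hashes raw method source strings rather than Prism node slices, and `changed_symbols` is a hypothetical helper, not part of IncrementalSession.

```ruby
require "digest"

# Fingerprint each method body: { "Class#method" => sha256 hex }.
def fingerprints(bodies)
  bodies.transform_values { |source| Digest::SHA256.hexdigest(source) }
end

# Symbols whose fingerprint differs, including symbols added or removed.
def changed_symbols(before, after)
  (before.keys | after.keys).reject { |symbol| before[symbol] == after[symbol] }
end

before = fingerprints({
  "User#name" => "def name = @name",
  "User#email" => "def email = @email"
})
after = fingerprints({
  "User#name" => "def name = @name.strip",
  "User#email" => "def email = @email"
})

changed = changed_symbols(before, after)
```

Only callers of the symbols in `changed` would need re-checking; an edit that touches whitespace outside any method body leaves every fingerprint intact.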

#validate_target_ruby ⇒ Object

`target_ruby` flows through to Prism’s `version:` option. Prism enforces the supported range and raises `ArgumentError` for versions it does not recognise. Run a one-time smoke parse here so a misconfigured target_ruby surfaces as a single project-level diagnostic instead of crashing the whole run on the first file.

# File 'lib/rigor/analysis/runner.rb', line 487

def validate_target_ruby
  Prism.parse("nil", version: @configuration.target_ruby)
  nil
rescue ArgumentError => e
  Diagnostic.new(
    path: ".rigor.yml", line: 1, column: 1,
    message: "target_ruby #{@configuration.target_ruby.inspect} is not accepted by Prism: #{e.message}",
    severity: :error,
    rule: "configuration-error",
    source_family: :builtin
  )
end
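
The probe-once pattern can be sketched without Prism installed. Everything below is a stand-in: the `parse` lambda, the `SUPPORTED` list, and the `Diagnostic` struct are hypothetical and only mimic a parser that raises `ArgumentError` for an unknown version.

```ruby
Diagnostic = Struct.new(:path, :message, :severity, keyword_init: true)

# Hypothetical stand-in for a parser that rejects unknown versions.
SUPPORTED = %w[3.3 3.4].freeze
parse = lambda do |source, version:|
  raise ArgumentError, "invalid version: #{version}" unless SUPPORTED.include?(version)

  [:program, source]
end

# Smoke-parse a trivial program once; turn a rejection into one diagnostic.
def validate_target(parse, target)
  parse.call("nil", version: target)
  nil
rescue ArgumentError => e
  Diagnostic.new(
    path: ".rigor.yml",
    message: "target_ruby #{target.inspect} rejected: #{e.message}",
    severity: :error
  )
end

ok = validate_target(parse, "3.4")
bad = validate_target(parse, "9.9")
```

Returning `nil` on success lets the caller short-circuit with a simple truthiness check, as `#run` does.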