Class: Guardrails::CrossCodebasePatterns

Inherits:
Object
  • Object
Defined in:
lib/guardrails/cross_codebase_patterns.rb

Overview

Finds recurring structural patterns across the codebase — element subtrees that appear in 3+ places and could be extracted into a shared partial or ViewComponent.

Distinct from PartialSimilarity: that one compares EXISTING partials against each other (“are these two partials near-duplicates?”). CrossCodebasePatterns looks at the structural shape of any subtree in any view (“this 8-element shape appears in 12 places, only one of which is a partial — refactor candidate”).
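
The idea of a structural fingerprint can be illustrated with a small, self-contained sketch. This is not the class's actual implementation: the `Node` struct, the `shape_of`, `element_count` and `each_subtree` helpers, and the choice of a truncated SHA-256 digest are all assumptions made for illustration. It only shows how subtrees with the same tag nesting collapse onto one key regardless of where they appear.

```ruby
require "digest"

# Hypothetical stand-in for a parsed element: a tag name plus child elements.
Node = Struct.new(:tag, :children) do
  def initialize(tag, children = [])
    super(tag, children)
  end
end

# Structural shape: tag names and nesting only; text and attributes are ignored.
def shape_of(node)
  inner = node.children.map { |child| shape_of(child) }.join
  "<#{node.tag}>#{inner}</#{node.tag}>"
end

# Number of element nodes in the subtree rooted at node.
def element_count(node)
  1 + node.children.sum { |child| element_count(child) }
end

# Yields every subtree together with its fingerprint, shape and size.
def each_subtree(node, &block)
  shape = shape_of(node)
  yield node, Digest::SHA256.hexdigest(shape)[0, 12], shape, element_count(node)
  node.children.each { |child| each_subtree(child, &block) }
end

# A five-element "card" shape, built three times across two pages.
card = -> { Node.new("div", [Node.new("h2", [Node.new("a")]), Node.new("p"), Node.new("span")]) }
page_a = Node.new("section", [card.call])
page_b = Node.new("article", [card.call, card.call])

counts = Hash.new(0)
shapes = {}
[page_a, page_b].each do |root|
  each_subtree(root) do |_node, fingerprint, shape, size|
    next if size < 5 # mirrors DEFAULT_MIN_SIZE

    counts[fingerprint] += 1
    shapes[fingerprint] = shape
  end
end

# Keep only shapes seen at least three times (mirrors DEFAULT_MIN_OCCURRENCES).
repeated = counts.select { |_, n| n >= 3 }.map { |fp, n| [shapes[fp], n] }
```

In this sketch only the card shape survives both thresholds: the page wrappers are large enough but each appears once, and the inner `<h2><a>` subtree is too small to count.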

Defined Under Namespace

Classes: Occurrence, Pattern

Constant Summary

DEFAULT_MIN_SIZE =

Minimum number of element nodes in a subtree before we consider it. Below this, the structural shape is too generic to be a refactor candidate: `<div>` alone, or `<span><a></a></span>`, would match constantly. A useful pattern starts around 5 elements (card body, form row, table cell with controls, etc.).

5
DEFAULT_MIN_OCCURRENCES =

A subtree fingerprint must appear at least this many times to be reported. Two occurrences are common and rarely actionable; three or more imply a real repeated shape.

3
DEFAULT_MAX_OCCURRENCES_SHOWN =

Max occurrences printed per pattern before we elide the rest.

10
VIEW_PATTERNS =
[
  "app/views/**/*.html.erb",
  "app/components/**/*.html.erb"
].freeze
IMPLICIT_IGNORE_SEGMENTS =
%w[vendor node_modules tmp public log].freeze
IMPLICIT_IGNORE_PATTERNS =
[/\A(?:\w+_)?mailer\z/].freeze
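
This page does not show how the two ignore constants are applied. One plausible reading, sketched below, is that a view file is skipped when any directory segment of its relative path equals an ignored name or matches an ignored pattern. The `ignored?` helper and the sample paths are hypothetical; only the constant values come from the source.

```ruby
require "pathname"

IMPLICIT_IGNORE_SEGMENTS = %w[vendor node_modules tmp public log].freeze
IMPLICIT_IGNORE_PATTERNS = [/\A(?:\w+_)?mailer\z/].freeze

# Hypothetical helper: true when any directory segment of the relative path
# is an ignored name or matches an ignored pattern. The file name itself
# (the last segment) is not checked.
def ignored?(relative_path)
  segments = Pathname(relative_path).each_filename.to_a[0...-1]
  segments.any? do |segment|
    IMPLICIT_IGNORE_SEGMENTS.include?(segment) ||
      IMPLICIT_IGNORE_PATTERNS.any? { |pattern| pattern.match?(segment) }
  end
end

paths = [
  "app/views/users/show.html.erb",
  "app/views/user_mailer/welcome.html.erb",
  "app/views/mailer/notice.html.erb",
  "vendor/app/views/widgets/_card.html.erb",
  "app/views/mailers_admin/index.html.erb"
]

kept = paths.reject { |path| ignored?(path) }
```

Because the pattern is anchored with `\A` and `\z`, it matches a directory named `mailer` or `user_mailer` but not `mailers_admin`.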

Instance Method Summary

Constructor Details

#initialize(root:, output: $stdout, min_size: DEFAULT_MIN_SIZE, min_occurrences: DEFAULT_MIN_OCCURRENCES, max_occurrences_shown: DEFAULT_MAX_OCCURRENCES_SHOWN) ⇒ CrossCodebasePatterns

Returns a new instance of CrossCodebasePatterns.



# File 'lib/guardrails/cross_codebase_patterns.rb', line 50

def initialize(root:, output: $stdout,
               min_size: DEFAULT_MIN_SIZE,
               min_occurrences: DEFAULT_MIN_OCCURRENCES,
               max_occurrences_shown: DEFAULT_MAX_OCCURRENCES_SHOWN)
  @root = Pathname(root)
  @output = output
  @min_size = min_size
  @min_occurrences = min_occurrences
  @max_occurrences_shown = max_occurrences_shown
end

Instance Method Details

#find_patterns ⇒ Object



# File 'lib/guardrails/cross_codebase_patterns.rb', line 67

def find_patterns
  # fingerprint => every place that shape was seen
  occurrences = Hash.new { |h, k| h[k] = [] }
  # fingerprint => shape, taken from the first occurrence
  shapes = {}

  view_files.each do |file|
    content = File.read(file, encoding: Encoding::UTF_8)
    result = ErbParser.parse(content)
    relative = file.relative_path_from(@root).to_s

    walk_subtrees(result.document) do |node, fingerprint, shape, size|
      # Skip subtrees too small to be a meaningful refactor candidate.
      next if size < @min_size

      line, column = ErbParser.start_position(node)
      occurrences[fingerprint] << Occurrence.new(
        file: relative,
        line: line,
        column: column,
        size: size
      )
      shapes[fingerprint] ||= shape
    end
  end

  # Keep shapes seen often enough, most frequent first, larger shapes breaking ties.
  patterns = occurrences
             .select { |_, occs| occs.size >= @min_occurrences }
             .map { |fp, occs| Pattern.new(fingerprint: fp, shape: shapes[fp], size: occs.first.size, occurrences: occs) }
             .sort_by { |p| [-p.count, -p.size] }

  dedupe_nested(patterns)
end
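
The filter-and-sort step at the end of `find_patterns` can be tried in isolation. The sketch below uses a hypothetical `SketchPattern` class in place of `Pattern`, assumes that `count` is the number of occurrences, and uses made-up fingerprints and sizes. It leaves out `dedupe_nested`, whose behavior is not shown on this page.

```ruby
# Hypothetical stand-in for Pattern; `count` is assumed to be occurrences.size.
class SketchPattern
  attr_reader :fingerprint, :size, :occurrences

  def initialize(fingerprint:, size:, occurrences:)
    @fingerprint = fingerprint
    @size = size
    @occurrences = occurrences
  end

  def count
    occurrences.size
  end
end

# Made-up data: fingerprint => list of occurrences, each carrying a subtree size.
occurrences = {
  "aaa" => Array.new(3) { { size: 9 } },
  "bbb" => Array.new(5) { { size: 6 } },
  "ccc" => Array.new(2) { { size: 20 } },
  "ddd" => Array.new(3) { { size: 12 } }
}
min_occurrences = 3

patterns = occurrences
           .select { |_, occs| occs.size >= min_occurrences }
           .map { |fp, occs| SketchPattern.new(fingerprint: fp, size: occs.first[:size], occurrences: occs) }
           .sort_by { |p| [-p.count, -p.size] }

order = patterns.map(&:fingerprint)
```

Negating both sort keys gives a descending order: the shape with five occurrences comes first, and the two shapes with three occurrences are ordered by size, larger first. The shape seen only twice is dropped even though it is the largest.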

#run ⇒ Object



# File 'lib/guardrails/cross_codebase_patterns.rb', line 61

def run
  patterns = find_patterns
  print_report(patterns)
  patterns
end