Module: Moult::Clones

Defined in:
lib/moult/clones.rb

Overview

The structural-clone detector — Moult's adapter over the flay gem and the only file that names Flay. Everything downstream consumes the Moult-owned Result value object, never a flay type, so the backend is swappable (the "swap, not rewrite" invariant). This is the duplication-slice analogue of Index (rubydex) and Coverage (SimpleCov/stdlib).

flay reports the largest duplicated S-expression node, grouping structurally equivalent code (literal values, variable/method/class names and whitespace are all ignored when hashing). Two distinctions it draws map onto our confidence grade:

  • bonus truthy => the nodes are byte-for-byte IDENTICAL (names and all) — the clearest copy-paste signal. We surface this as kind: :identical.
  • bonus nil => structurally SIMILAR (same shape, differing names/literals) — real duplication but weaker (could be parallel-by-design). kind: :similar.
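
The two cases above reduce to a single truthiness check. A minimal sketch, assuming a stand-in struct: FakeItem and kind_for are hypothetical names used only for illustration, and only the bonus field of a flay item is modelled.

```ruby
# Hypothetical stand-in for a flay result item; only `bonus` is modelled.
FakeItem = Struct.new(:bonus, keyword_init: true)

# Illustrative helper (not part of Moult): map flay's bonus onto a confidence grade.
def kind_for(item)
  item.bonus ? :identical : :similar
end

# Any truthy bonus counts as identical; the exact value flay stores is not assumed here.
identical = FakeItem.new(bonus: "truthy-placeholder")
similar   = FakeItem.new(bonus: nil)

puts kind_for(identical) # identical
puts kind_for(similar)   # similar
```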

As of flay 2.14 the default parser is Flay::NotRubyParser, which parses with Prism (the same parser Moult uses); no parallel parser stack is pulled in.

Defined Under Namespace

Classes: CloneSet, Occurrence, Result

Constant Summary

DEFAULT_MIN_MASS = 16

flay's own default mass threshold; small enough to catch a duplicated method, large enough to skip incidental structural rhymes.

Class Method Details

.backend_version ⇒ Object

# File 'lib/moult/clones.rb', line 87

def backend_version
  defined?(Flay::VERSION) ? Flay::VERSION : nil
end

.clone_set(item, root) ⇒ Object



# File 'lib/moult/clones.rb', line 70

def clone_set(item, root)
  occurrences = item.locations.map do |loc|
    Occurrence.new(
      path: SymbolId.relative_path(loc.file, root),
      line: loc.line,
      fuzzy: !loc.fuzzy.nil?
    )
  end
  CloneSet.new(
    structural_hash: item.structural_hash,
    node_type: item.name.to_s,
    kind: item.bonus ? :identical : :similar,
    mass: item.mass,
    occurrences: occurrences
  )
end
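
The translation above can be tried in isolation with stand-in types. This is a sketch under stated assumptions: Loc and Item only mimic the fields that clone_set reads, the Occurrence and CloneSet structs are hypothetical shapes for Moult's value objects, and relative_path is an assumed approximation of SymbolId.relative_path built on Pathname.

```ruby
require "pathname"

# Hypothetical stand-ins for the backend's location and item structs.
Loc  = Struct.new(:file, :line, :fuzzy)
Item = Struct.new(:structural_hash, :name, :bonus, :mass, :locations)

# Hypothetical shapes for Moult's value objects; the real classes may differ.
Occurrence = Struct.new(:path, :line, :fuzzy, keyword_init: true)
CloneSet   = Struct.new(:structural_hash, :node_type, :kind, :mass, :occurrences,
                        keyword_init: true)

# Assumed behaviour of SymbolId.relative_path: express file relative to root.
def relative_path(file, root)
  Pathname.new(file).relative_path_from(Pathname.new(root)).to_s
end

# Same mapping as clone_set, written against the stand-ins above.
def to_clone_set(item, root)
  occurrences = item.locations.map do |loc|
    Occurrence.new(path: relative_path(loc.file, root), line: loc.line, fuzzy: !loc.fuzzy.nil?)
  end
  CloneSet.new(
    structural_hash: item.structural_hash,
    node_type: item.name.to_s,
    kind: item.bonus ? :identical : :similar,
    mass: item.mass,
    occurrences: occurrences
  )
end

item = Item.new(12_345, :defn, nil, 40,
                [Loc.new("/app/lib/a.rb", 3, nil), Loc.new("/app/lib/b.rb", 10, :fuzzy)])
set = to_clone_set(item, "/app")

puts set.kind                            # similar
puts set.occurrences.map(&:path).inspect # ["lib/a.rb", "lib/b.rb"]
```

Note that the fuzzy flag is derived from presence rather than truthiness: any non-nil marker on the location becomes true.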

.detect(root:, files:, min_mass: DEFAULT_MIN_MASS, fuzzy: false) ⇒ Result

Parameters:

  • root (String)

    absolute analysis root (occurrence paths are relative to it)

  • files (Array<String>)

    absolute Ruby file paths to scan

  • min_mass (Integer) (defaults to: DEFAULT_MIN_MASS)

    flay's mass threshold; smaller fragments are ignored

  • fuzzy (Boolean) (defaults to: false)

    also report near-matches (off by default so that results stay deterministic)

Returns:

  • (Result)

# File 'lib/moult/clones.rb', line 47

def detect(root:, files:, min_mass: DEFAULT_MIN_MASS, fuzzy: false)
  sets = files.empty? ? [] : run_flay(files, min_mass, fuzzy).filter_map { |item| clone_set(item, root) }
  Result.new(
    sets: sets,
    backend: "flay",
    backend_version: backend_version,
    min_mass: min_mass,
    fuzzy: fuzzy
  )
end
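
One detail of detect worth illustrating is the short-circuit on an empty file list: the backend is never invoked, yet a fully populated Result still comes back. A minimal sketch, assuming a hypothetical Result struct and an injected scanner lambda standing in for run_flay (neither is Moult's actual definition):

```ruby
# Hypothetical shape for Moult's Result value object; the real class may differ.
Result = Struct.new(:sets, :backend, :backend_version, :min_mass, :fuzzy, keyword_init: true)

# Illustrative version of detect: `scanner` is an injected stand-in for run_flay.
def detect_sketch(files:, scanner:, min_mass: 16, fuzzy: false)
  sets = files.empty? ? [] : scanner.call(files, min_mass, fuzzy)
  Result.new(sets: sets, backend: "flay", backend_version: nil,
             min_mass: min_mass, fuzzy: fuzzy)
end

calls = 0
scanner = lambda do |files, _min_mass, _fuzzy|
  calls += 1
  files.map { |f| "clone-set-for-#{f}" } # placeholder values, not real clone sets
end

empty = detect_sketch(files: [], scanner: scanner)
puts empty.sets.inspect # []
puts calls              # 0, the backend was never invoked

full = detect_sketch(files: ["/app/lib/a.rb"], scanner: scanner, min_mass: 32)
puts full.sets.length   # 1
puts full.min_mass      # 32
```

Because the scan settings are recorded on the result, a downstream consumer can tell which threshold produced a given set of clones without knowing anything about the backend.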

.run_flay(files, min_mass, fuzzy) ⇒ Object



# File 'lib/moult/clones.rb', line 62

def run_flay(files, min_mass, fuzzy)
  flay = Flay.new(Flay.default_options.merge(mass: min_mass, fuzzy: fuzzy))
  flay.process(*files)
  flay.analyze
rescue => e
  raise Moult::Error, "flay duplication scan failed: #{e.class}: #{e.message}"
end
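
The method-level rescue in run_flay converts any backend failure into a single domain error. The pattern can be sketched on its own as follows; Demo, Demo::Error and Demo.scan are hypothetical names for illustration, not Moult's API.

```ruby
module Demo
  # Hypothetical domain error, playing the role of Moult::Error.
  class Error < StandardError; end

  # Run the block and re-raise any StandardError as Demo::Error,
  # keeping the original class and message in the new message.
  def self.scan
    yield
  rescue => e
    raise Error, "duplication scan failed: #{e.class}: #{e.message}"
  end
end

begin
  Demo.scan { raise ArgumentError, "bad input" }
rescue Demo::Error => e
  puts e.message     # duplication scan failed: ArgumentError: bad input
  puts e.cause.class # ArgumentError
end
```

Since the new exception is raised inside the rescue clause, Ruby records the original exception as its cause, so the backend's backtrace remains reachable for debugging even though callers only need to rescue one error class.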